Show HN: Using LLMs and >1k 4090s to visualize 100k scientific research articles

https://twitter.com/0xSamHogan/status/1980748729444659500
4•funfunfunction•1d ago

Comments

ashvardanian•1d ago
Congrats on the release, Sam - the preview looks great!

I'm curious about the technical side: how are you handling the dimensionality reduction and visualization? Also noticed you mentioned "custom-trained LLMs" in the tweet - how large are those models, and what motivated using custom ones instead of existing open models?

funfunfunction•1d ago
We'll release the full data explorer soon, with more info.

At the core of this project is a structured-extraction task using a custom Qwen 14B model, which we distilled from larger closed-source models. We needed a model we could run at scale on https://devnet.inference.net, which is made up mostly of idle consumer-grade NVIDIA devices.

Embeddings were generated using SPECTER2, a transformer model from AllenAI specifically designed for scientific documents. The model processes each paper's title, executive summary, and research context to generate 768-dimensional embeddings optimized for semantic search over scientific literature.

The visualization uses UMAP to reduce the 768D embeddings to 3D coordinates, preserving local and global structure. K-Means clustering groups papers into ~100 clusters based on semantic similarity in the embedding space. Cluster labels are automatically generated using TF-IDF analysis of paper fields and key takeaways, identifying the most distinctive terms for each cluster.
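For readers wondering what the cluster-labeling step could look like, here is a minimal sketch, not the project's actual code. It assumes each cluster is treated as one pseudo-document, uses plain whitespace tokenization, and applies a smoothed IDF; the function name `label_clusters`, the `top_k` parameter, and the toy texts are all made up for illustration.

```python
import math
from collections import Counter


def label_clusters(cluster_texts, top_k=3):
    """Return the top_k most distinctive terms for each cluster.

    cluster_texts maps a cluster id to a list of strings (for example
    paper fields or key takeaways). Hypothetical helper for illustration.
    """
    # Merge all texts of a cluster into one bag of words.
    term_counts = {
        cid: Counter(word for text in texts for word in text.lower().split())
        for cid, texts in cluster_texts.items()
    }
    n_clusters = len(term_counts)

    # Document frequency: in how many clusters does each term appear?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    labels = {}
    for cid, counts in term_counts.items():
        total = sum(counts.values())
        scores = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF, so terms shared by every cluster are damped
            # but never zeroed out (an assumption of this sketch).
            idf = math.log((1 + n_clusters) / (1 + doc_freq[term])) + 1.0
            scores[term] = tf * idf
        # Sort by score descending, then alphabetically to break ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels[cid] = ranked[:top_k]
    return labels


# Toy input standing in for real cluster contents.
example = {
    0: ["protein folding structure prediction", "protein structure dynamics"],
    1: ["galaxy formation dark matter", "dark matter halo structure"],
}
labels = label_clusters(example, top_k=2)
print(labels)
```

In a real pipeline the cluster ids would come from K-Means over the embeddings, and a proper tokenizer with stop-word removal would likely replace the whitespace split.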

Experimental 3.5T K2 merge beats GPT-4.5 and Opus at writing

https://huggingface.co/NobodyExistsOnTheInternet/K3-Q4-GGUF
1•bo0tzz•51s ago•0 comments

Self-Hosting with OpenSUSE MicroOS and Podman

https://www.lackhove.de/blog/selfhosting/
1•mnmalst•1m ago•0 comments

Atari Portfolio: Going Online Like It's 1989 – Auvik

https://www.auvik.com/franklyit/blog/atari-portfolio/
1•rbanffy•3m ago•0 comments

Buy this $15 USD hot sauce. Get a real $100 USD note and the hot sauce

https://burnrate.cash/
2•ghuntley•5m ago•0 comments

SVG in GTK

https://blogs.gnome.org/gtk/2025/10/23/svg-in-gtk/
1•JNRowe•7m ago•0 comments

Beyond the Machine: Creative agency in the AI landscape

https://frankchimero.com/blog/2025/beyond-the-machine/
1•yurivish•10m ago•0 comments

Bring Your 3D Models to Life

https://mesh2motion.org/
1•Splizard•19m ago•0 comments

Xmlbuilder2 – An XML Builder for Node.js

https://github.com/oozcitak/xmlbuilder2
1•javatuts•21m ago•0 comments

Identifying Life-Changing Books with LLMs (2024)

https://blog.joellehman.com/identifying-life-changing-books-with-llms.html
1•stared•21m ago•0 comments

How to Cut an Onion Optimally: A Love Letter to the Jacobian

https://www.tandfonline.com/eprint/AUGKH2EDWISRJTQUMKJJ/full?target=10.1080/0025570X.2025.2521248
1•mariuz•25m ago•0 comments

It's always DNS...

https://netwars.pelicancrossing.net/2025/10/24/its-always-dns/
1•ColinWright•26m ago•0 comments

Show HN: A cheaper background removal API (no subscriptions, pay-per-credit)

https://bgbuster.com/
1•tcogz•26m ago•0 comments

Poland's birth rate is in freefall. A loneliness epidemic that cash can't solve

https://www.theguardian.com/commentisfree/2025/oct/23/polands-birth-rate-is-in-freefall-the-cause...
3•dude250711•28m ago•1 comment

The debate: Are facial recognition cameras in Sainsbury's a step too far?

https://www.bbc.co.uk/news/resources/idt-bb93a137-9b73-498b-ad8f-f948d6071dee
1•c-oreills•31m ago•0 comments

Hi, I just made a simple&funny test to test your AI purity

https://aipuritytest.app
1•q534•39m ago•1 comment

Show HN: LinkdAPI, the best LinkedIn unofficial API

https://linkdapi.com/
4•LinkdAPI•42m ago•1 comment

Show HN: 15M Line Item That Doesnt Exist: Invisible Certificates ($0 Today Only)

2•dc352•43m ago•0 comments

Trenchant Boss Charged with Seeking to Sell Secrets in Russia

https://www.bloomberg.com/news/articles/2025-10-23/hacking-lab-boss-charged-with-seeking-to-sell-...
4•0rdinal•44m ago•1 comment

It's Already Getting Hacked... [video]

https://www.youtube.com/watch?v=1yJabMKRTU0
1•chii•46m ago•0 comments

Windows 10's demise could be Linux's gain

https://www.techradar.com/computing/windows/windows-10s-demise-could-be-linuxs-gain-if-the-flood-...
1•taubek•48m ago•0 comments

MODPOD: The collapse of IETF's protections for dissent

https://blog.cr.yp.to/20251005-modpod.html
3•gjvc•49m ago•0 comments

The Fly

https://poets.org/poem/fly
1•keepamovin•55m ago•0 comments

Deepinder Goyal launches $25M fund for human ageing research

https://www.continue.com/purpose
1•varun_chopra•56m ago•0 comments

When the VIBEs Start to Fade

https://www.mindruptive.com/blog-posts/AI/when-the-vibes-fade
1•estheryo•56m ago•0 comments

Show HN: I "invented" Model-as-a-Service for Predictable Private AI

https://pyrinas.co
2•jc_price•57m ago•0 comments

Show HN: Orbyt – Job Search Analytics from Your Inbox

https://github.com/abhijitxy/Orbyt
1•roya51788•1h ago•0 comments

Show HN: NanoPhoto AI – Next Generation Photo Editor

https://nanophotoeditor.com/
1•stjuan627•1h ago•0 comments

Twake Drive – The open-source alternative to Google Drive

https://github.com/linagora/twake-drive
31•javatuts•1h ago•9 comments

Rimac Founder Says He Is in Talks with Porsche on Bugatti Buyout

https://www.bloomberg.com/news/articles/2025-10-15/rimac-founder-says-he-is-in-talks-with-porsche...
1•breve•1h ago•0 comments

First verifiable quantum experiment by Google [video]

https://www.youtube.com/watch?v=mEBCQidaNTQ
1•fgfm•1h ago•0 comments