Show HN: Use Claude Code to Query 600 GB Indexes over Hacker News, ArXiv, etc.

https://exopriors.com/scry
59•Xyra•3h ago
Paste my prompt, which embeds an API key for my public read-only SQL+vector database, into Claude Code and you have a state-of-the-art research tool over Hacker News, arXiv, LessWrong, and dozens of other high-quality public commons sites. Claude whips up the monster SQL queries, which run safely on my machine, to answer your most nuanced questions.

There's also an Alerts feature: just ask Claude to submit a SQL query as an alert, and you'll be emailed when the ultra-nuanced criteria are met (and the output changes). For example, I want to know when somebody posts about "estrogen" in a psychoactive context, or uses enough biology metaphors when talking about building infrastructure.
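A minimal sketch of the "criteria met and output changed" check described above, assuming an alert is a stored SQL query whose result set is fingerprinted between runs. The `posts` table, the `check_alert` helper, and the use of SQLite are all hypothetical stand-ins; the post does not say how the real alerts are implemented.

```python
import hashlib
import json
import sqlite3


def result_fingerprint(rows):
    """Hash a result set so two runs can be compared cheaply."""
    payload = json.dumps(sorted(rows), default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def check_alert(conn, sql, last_fingerprint):
    """Run the alert query; notify only if it matches rows AND the output changed."""
    rows = [tuple(r) for r in conn.execute(sql).fetchall()]
    fingerprint = result_fingerprint(rows)
    should_notify = bool(rows) and fingerprint != last_fingerprint
    return should_notify, fingerprint


# Hypothetical toy data standing in for the real index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts (title) VALUES ('Rust async internals')")

ALERT_SQL = "SELECT id, title FROM posts WHERE title LIKE '%estrogen%'"

notify_empty, fp0 = check_alert(conn, ALERT_SQL, None)    # no matching rows yet
conn.execute("INSERT INTO posts (title) VALUES ('Estrogen and mood')")
notify_new, fp1 = check_alert(conn, ALERT_SQL, fp0)       # new match appears
notify_repeat, fp2 = check_alert(conn, ALERT_SQL, fp1)    # same output as last run
print(notify_empty, notify_new, notify_repeat)
```

Storing only the fingerprint keeps the per-alert state small; a real service would also need scheduling and email delivery, which are out of scope here.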

Currently embedded:

posts: 1.4M / 4.6M
comments: 15.6M / 38M

That's with Voyage-3.5-lite. And you can do amazing compositional vector search, like searching @FTX_crisis - (@guilt_tone - @guilt_topic) to find writing that is about the FTX crisis and distinctly without guilty tones, but that can still mention "guilt".
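To make the composition above concrete, here is a small sketch of the vector arithmetic, assuming each @handle resolves to an embedding vector and results are ranked by cosine similarity. The three-dimensional toy vectors and the `compose` and `rank` helpers are hypothetical; real embeddings have far more dimensions, and the post does not describe how the service evaluates these expressions.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def compose(topic, tone, tone_topic):
    """Build topic - (tone - tone_topic): drop the tone, keep the word itself."""
    pure_tone = [t - k for t, k in zip(tone, tone_topic)]
    return normalize([a - b for a, b in zip(topic, pure_tone)])


def rank(query, docs):
    """Return document names, most similar to the query first."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


# Toy axes (an assumption): [FTX topic, guilty tone, mentions guilt]
ftx_crisis = [1.0, 0.0, 0.0]
guilt_tone = [0.0, 1.0, 1.0]   # guilty writing also tends to mention guilt
guilt_topic = [0.0, 0.0, 1.0]  # merely mentions guilt

docs = {
    "ftx_guilty_confession": [1.0, 1.0, 1.0],
    "ftx_neutral_mentions_guilt": [1.0, 0.0, 1.0],
    "guilty_unrelated": [0.0, 1.0, 0.0],
}

query = compose(ftx_crisis, guilt_tone, guilt_topic)
ranking = rank(query, docs)
print(ranking)
```

Subtracting @guilt_topic from @guilt_tone first is what lets a neutral document that merely mentions guilt rank above a guilt-laden one in this toy setup.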

I could embed everything, plus all the other sources, cheaply; I just literally don't have the money.

Comments

bugglebeetle•1h ago
Seems very cool, but IMO you’d be better off doing an open source version and then a hosted SaaS.
7777777phil•1h ago
Really useful. I'm currently working on an autonomous academic research system [1] and thinking about integrating this. Currently using a custom prompt + the Edison Scientific API. Any plans to make this open source?

[1] https://github.com/giatenica/gia-agentic-short

barishnamazov•1h ago
I like that this relies on generating SQL rather than just being a black-box chat bot. It feels like the right way to use LLMs for research: as a translator from natural language to a rigid query language, rather than as the database itself. Very cool project!

Hopefully your API doesn't get exploited and you are doing timeouts/sandboxing -- it'd be easy to do a massive join on this.
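As an illustration of the kind of timeout meant here, the sketch below aborts a runaway join after a deadline. It uses Python's built-in `sqlite3` progress handler purely as a stand-in; the thread does not say which database or safeguards the service actually uses, and `run_with_timeout` is a hypothetical helper.

```python
import sqlite3
import time


def run_with_timeout(conn, sql, seconds):
    """Run a query, returning None if it exceeds the time budget."""
    deadline = time.monotonic() + seconds

    def over_budget():
        # A non-zero return value tells SQLite to interrupt the running query.
        return 1 if time.monotonic() > deadline else 0

    # Call the handler every 1000 SQLite virtual-machine instructions.
    conn.set_progress_handler(over_budget, 1000)
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        if "interrupted" in str(exc):
            return None
        raise
    finally:
        conn.set_progress_handler(None, 1000)  # remove the handler


conn = sqlite3.connect(":memory:")

# A deliberately huge three-way cross join (3000^3 rows to count).
MASSIVE_JOIN = """
WITH RECURSIVE n(i) AS (
    SELECT 1 UNION ALL SELECT i + 1 FROM n WHERE i < 3000
)
SELECT count(*) FROM n AS a, n AS b, n AS c
"""

runaway_result = run_with_timeout(conn, MASSIVE_JOIN, 0.2)
small_result = run_with_timeout(conn, "SELECT 1", 0.2)
print(runaway_result, small_result)
```

A server-side database would more likely enforce this with its own per-statement timeout setting plus a read-only role, but the idea is the same: the query is cancelled by the engine rather than trusted to finish.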

I also have a question, mostly stemming from my not being knowledgeable in the area -- have you noticed any semantic bleeding when research spans your datasets? E.g., "optimization" probably means different things on ArXiv, LessWrong, and HN. Wondering if vector searches account for this given a more specific question.

keeeba•14m ago
I don’t have the experiments to prove this, but from my experience it’s highly variable between embedding models.

Larger, more capable embedding models are better able to separate the different uses of a given word in the embedding space; smaller models are not.

nineteen999•1h ago
That's just not a good use of my Claude plan. If you can make it so a self-hosted Llama or Qwen 7B can query it, then that's something.
mcintyre1994•13m ago
I think that’s just a matter of their capabilities, rather than anything specific to this?
mentalgear•1h ago
Nice, but would you consider open-sourcing it? I (and, I assume, others) am not keen on sharing my API keys with a 3rd party.
nielsole•10m ago
I think you misunderstood. The API key is for their API, not Anthropic's.

If you take a look at the prompt you'll find that they have a static API key that they have created for this demo ("exopriors_public_readonly_v1_2025")

gtsnexp•1h ago
Is the appeal of this tool its ability to identify semantic similarity?
octoberfranklin•1h ago
"Claude Code and Codex are essentially AGI at this point"

Okaaaaaaay....

Hamuko•42m ago
I have noticed that Claude users seem to be about as intelligent as Claude itself, and wouldn't be able to surpass its output.
phatfish•16m ago
I want to know what the "intelligence explosion" is, sounds much cooler than AGI.
kburman•24m ago
> a state-of-the-art research tool over Hacker News, arXiv, LessWrong, and dozens

what makes this state of the art?

nandomrumber•3m ago
The tool is state of the art, the sources are historical.
ashirviskas•3m ago
First, so best in this?
