Show HN: Use Claude Code to Query 600 GB Indexes over Hacker News, ArXiv, etc.

https://exopriors.com/scry
59•Xyra•3h ago
Paste in my prompt to Claude Code with an embedded API key for accessing my public readonly SQL+vector database, and you have a state-of-the-art research tool over Hacker News, arXiv, LessWrong, and dozens of other high-quality public commons sites. Claude whips up the monster SQL queries that safely run on my machine, to answer your most nuanced questions.

There's also an Alerts feature: ask Claude to submit a SQL query as an alert, and you'll be emailed when its highly specific criteria are met (and the output changes). For example, I want to know when somebody posts about "estrogen" in a psychoactive context, or uses enough biology metaphors when talking about building infrastructure.
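One way to picture "alert when the output changes" is to fingerprint each run of the query and compare it with the previous run. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite table as a stand-in for the real database, and the names `result_fingerprint`, `check_alert`, and the `posts` table are hypothetical, not the service's actual schema or API.

```python
import hashlib
import json
import sqlite3


def result_fingerprint(rows):
    """Hash a query result so two runs can be compared cheaply."""
    payload = json.dumps(rows, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_alert(conn, sql, last_fingerprint):
    """Run the alert query and report whether its non-empty output changed."""
    rows = [list(r) for r in conn.execute(sql).fetchall()]
    fingerprint = result_fingerprint(rows)
    changed = bool(rows) and fingerprint != last_fingerprint
    return changed, fingerprint, rows


# Hypothetical stand-in data; the real schema is not described in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO posts (body) VALUES ('notes on build infrastructure')")
alert_sql = "SELECT id, body FROM posts WHERE body LIKE '%estrogen%' ORDER BY id"

# First run: nothing matches yet, so no alert fires.
changed_1, fp_1, _ = check_alert(conn, alert_sql, None)

# A matching post arrives: the output changes, so the alert fires once.
conn.execute("INSERT INTO posts (body) VALUES ('estrogen and mood: a review')")
changed_2, fp_2, rows_2 = check_alert(conn, alert_sql, fp_1)

# Re-running with no new data does not fire again.
changed_3, fp_3, _ = check_alert(conn, alert_sql, fp_2)
```

Storing only the fingerprint between runs keeps the bookkeeping small; a real alert would presumably send the email where `changed` is true.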

Currently embedded, using Voyage-3.5-lite:

posts: 1.4M / 4.6M
comments: 15.6M / 38M

You can also do compositional vector search, such as @FTX_crisis - (@guilt_tone - @guilt_topic), to find writing that is about the FTX crisis and distinctly free of guilty tones, but that may still mention "guilt".
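The arithmetic behind an expression like @FTX_crisis - (@guilt_tone - @guilt_topic) can be sketched with toy vectors. This assumes the handles resolve to embedding vectors that are combined by plain vector arithmetic and ranked by cosine similarity; the post does not say how the server actually evaluates them, and the 3-dimensional vectors and document names below are invented for illustration.

```python
import math


def sub(a, b):
    """Element-wise vector difference."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy axes (an assumption): [FTX topic, guilt as a topic, guilty tone].
concepts = {
    "FTX_crisis": [1.0, 0.0, 0.0],
    "guilt_topic": [0.0, 1.0, 0.0],
    "guilt_tone": [0.0, 1.0, 1.0],
}

# guilt_tone - guilt_topic isolates the tone while cancelling the topic,
# so subtracting it penalizes guilty tone without penalizing the word "guilt".
tone_only = sub(concepts["guilt_tone"], concepts["guilt_topic"])
query = sub(concepts["FTX_crisis"], tone_only)

# Hypothetical documents, embedded in the same toy space.
docs = {
    "neutral_ftx_analysis_mentioning_guilt": [1.0, 0.5, 0.0],
    "remorseful_ftx_essay": [1.0, 0.5, 1.0],
    "unrelated_guilt_post": [0.0, 1.0, 1.0],
}

ranked = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
for name, vec in ranked:
    print(f"{cosine(query, vec):+.3f}  {name}")
```

In this toy space the neutral analysis that mentions guilt ranks first, the remorseful essay on the same topic scores about zero, and the off-topic guilt post ranks last.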

I could embed everything, including all the other sources, cheaply; I just literally don't have the money.

Comments

bugglebeetle•1h ago
Seems very cool, but IMO you’d be better off doing an open source version and then a hosted SaaS.
7777777phil•1h ago
Really useful. I'm currently working on an autonomous academic research system [1] and thinking about integrating this. Currently using a custom prompt + the Edison Scientific API. Any plans to make this open source?

[1] https://github.com/giatenica/gia-agentic-short

barishnamazov•1h ago
I like that this relies on generating SQL rather than just being a black-box chat bot. It feels like the right way to use LLMs for research: as a translator from natural language to a rigid query language, rather than as the database itself. Very cool project!

Hopefully your API doesn't get exploited and you are doing timeouts/sandboxing -- it'd be easy to do a massive join on this.
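For illustration, here is a minimal sketch of the kind of guard being asked about: a read-only connection plus a wall-clock limit that aborts runaway queries. It uses Python's built-in `sqlite3` purely as a stand-in; the thread does not say which database or safeguards the project uses, and `run_with_timeout` and the table `n` are hypothetical names.

```python
import sqlite3
import time


def run_with_timeout(conn, sql, timeout_s=0.2, max_rows=1000):
    """Run a query, aborting it once timeout_s of wall-clock time has passed."""
    deadline = time.monotonic() + timeout_s
    # SQLite calls the handler every 10,000 VM instructions;
    # a non-zero return value interrupts the running query.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 10_000
    )
    try:
        return conn.execute(sql).fetchmany(max_rows)  # also cap result size
    except sqlite3.OperationalError as exc:
        if time.monotonic() > deadline:
            raise TimeoutError(f"query exceeded {timeout_s}s") from exc
        raise  # some other SQL error, not a timeout
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (i INTEGER)")
conn.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(2000)])
conn.commit()
conn.execute("PRAGMA query_only = ON")  # reject writes from here on

# A cheap query completes normally.
small = run_with_timeout(conn, "SELECT count(*) FROM n")

# A four-way cross join (2000^4 row combinations) is cut off by the deadline.
try:
    run_with_timeout(conn, "SELECT count(*) FROM n a, n b, n c, n d")
    timed_out = False
except TimeoutError:
    timed_out = True
```

A server-side database would more likely rely on its own statement timeout and a read-only role, but the shape of the check is the same: bound the time, bound the rows, and refuse writes.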

I also have a question, mostly stemming from my not being knowledgeable in the area: have you noticed any semantic bleeding when research is done across your datasets? E.g., "optimization" probably means different things on ArXiv, LessWrong, and HN. I'm wondering whether vector searches account for this given a more specific question.

keeeba•12m ago
I don’t have the experiments to prove this, but from my experience it’s highly variable between embedding models.

Larger, more capable embedding models are better able to separate the different uses of a given word in the embedding space; smaller models are not.

nineteen999•1h ago
That's just not a good use of my Claude plan. If you can make it so a self-hosted Llama or Qwen 7B can query it, then that's something.
mcintyre1994•11m ago
I think that’s just a matter of their capabilities, rather than anything specific to this?
mentalgear•1h ago
Nice, but would you consider open-sourcing it? I (and I assume others) am not keen on sharing my API keys with a 3rd party.
nielsole•8m ago
I think you misunderstood. The API key is for their API, not Anthropic's.

If you take a look at the prompt you'll find that they have a static API key that they have created for this demo ("exopriors_public_readonly_v1_2025")

gtsnexp•1h ago
Is the appeal of this tool its ability to identify semantic similarity?
octoberfranklin•1h ago
"Claude Code and Codex are essentially AGI at this point"

Okaaaaaaay....

Hamuko•40m ago
I have noticed that Claude users seem to be about as intelligent as Claude itself, and wouldn't be able to surpass its output.
phatfish•14m ago
I want to know what the "intelligence explosion" is, sounds much cooler than AGI.
kburman•22m ago
> a state-of-the-art research tool over Hacker News, arXiv, LessWrong, and dozens

what makes this state of the art?

nandomrumber•1m ago
The tool is state of the art, the sources are historical.
ashirviskas•56s ago
First, so best in this?
