
Ask HN: Are we pretending RAG is ready, when it's barely out of demo phase?

11•TXTOS•6mo ago
Been watching the RAG (Retrieval-Augmented Generation) wave crash into production for over a year now.

But something keeps bugging me: Most setups still feel like glorified notebooks stitched together with hope and vector search.

Yeah, it "works" — until you actually need it to. Suddenly: irrelevant chunks, hallucinations, shallow query rewriting, no memory loop, and a retrieval stack that breaks if you breathe on it wrong.

We’ve got:
• pipelines that don’t align with what users actually want to ask,
• retrieval that acts more like a search engine than a reasoning aid,
• brittle evals (because "correct context" ≠ "correct answer"),
• and no one’s sure where grounding ends and illusion begins.
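To make the "correct context ≠ correct answer" point concrete, here's a toy sketch (all cases and IDs invented) of why context-level and answer-level metrics have to be tracked separately: retrieval can look perfect while the end-to-end answer still fails.

```python
# Toy sketch: context-level vs. answer-level eval on the same cases.
# A retriever can score well on "did we fetch the right chunk" while the
# end-to-end answer is still wrong, so the two metrics must be separate.

def context_hit(retrieved_ids, gold_ids):
    """Did at least one gold chunk make it into the retrieved set?"""
    return bool(set(retrieved_ids) & set(gold_ids))

def answer_correct(answer, accepted_answers):
    """Crude answer-level check: exact match against accepted strings."""
    return answer.strip().lower() in {a.lower() for a in accepted_answers}

# Three made-up eval cases: retrieval succeeds on all, answers fail on one.
cases = [
    {"retrieved": ["c1", "c7"], "gold": ["c1"], "answer": "42", "accepted": ["42"]},
    {"retrieved": ["c3"], "gold": ["c3"], "answer": "blue", "accepted": ["blue"]},
    {"retrieved": ["c5", "c2"], "gold": ["c2"], "answer": "1990", "accepted": ["1989"]},
]

context_recall = sum(context_hit(c["retrieved"], c["gold"]) for c in cases) / len(cases)
answer_acc = sum(answer_correct(c["answer"], c["accepted"]) for c in cases) / len(cases)

print(context_recall)  # 1.0 -- retrieval looks perfect
print(answer_acc)      # ~0.67 -- but a third of the answers are wrong
```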

Sure, you can make it work — if you’re okay duct-taping every component and babysitting the system 24/7.

So I gotta ask: Is RAG just stuck in prototype land pretending to be production? Or has someone here actually built a setup that survives user chaos and edge cases?

Would love to hear what’s worked, what hasn't, and what you had to throw away.

Not pushing anything, just been knee-deep in this and looking to sanity check with folks who’ve actually shipped stuff.

Comments

kingkongjaffa•6mo ago
We have a RAG powered product in production right now used by thousands of users.

RAG is part of the solution, it provides the required style, formatting and subject matter idiosyncrasies of the domain.

But it isn't enough to do (prompt + RAG query on that prompt) alone. We have a handwritten series of prompts, so the user input is just one step in a branching decision tree that decides which prompts to apply: in sequence (prompt 1 output = prompt 2 input) and also in composition (e.g. combining prompts 3 + 5, but not prompt 4).
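A toy sketch of what that branching looks like in code (prompt names and the routing rule are invented for illustration, not our actual prompts):

```python
# Toy sketch of a branching prompt tree: the user input selects a path,
# and the chosen prompts are applied in sequence (output of one feeds the
# next) or composed. All names and rules here are illustrative.

def classify(user_input):
    """Toy router: decide which prompt chain applies to this input."""
    if "summarize" in user_input.lower():
        return ["extract_facts", "summarize"]  # sequence: prompt 1 -> prompt 2
    return ["extract_facts", "style_guide"]    # composition of 1 and 3, skipping 2

PROMPTS = {
    "extract_facts": lambda text: f"FACTS({text})",
    "summarize":     lambda text: f"SUMMARY({text})",
    "style_guide":   lambda text: f"STYLED({text})",
}

def run_chain(user_input):
    out = user_input
    for name in classify(user_input):
        out = PROMPTS[name](out)  # prompt N output becomes prompt N+1 input
    return out

print(run_chain("summarize the Q3 report"))  # FACTS then SUMMARY, in sequence
```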

TXTOS•6mo ago
Totally agree, RAG by itself isn’t enough — especially when users don’t follow the script.

We’ve seen similar pain: one-shot retrieval works great in perfect lab settings, then collapses once you let in real humans asking weird follow-ups like “do that again but with grandma’s style”, and suddenly your context window looks like a Salvador Dalí painting.

That branching tree approach you mentioned — composing prompt→prompt→query in a structured cascade — is underrated genius. We ended up building something similar, but layered a semantic engine on top to decide which prompt chain deserves to exist in that moment, not just statically prewiring them.

It’s duct tape + divination right now. But hey — the thing kinda works.

Appreciate your battle-tested insight — makes me feel slightly less insane.

mikert89•6mo ago
what planet are you on? RAG has been working everywhere for a while
TXTOS•6mo ago
haha fair — guess I’ve just been on the planet where the moment someone asks a followup like “can you explain that in simpler terms?”, the whole RAG stack folds like a house of cards.

if it’s been smooth for you, that’s awesome. I’ve just been chasing edge cases where users go off-script, or where prompt alignment + retrieval break in weird semantic corners.

so yeah, maybe it’s a timezone thing

moomoo11•6mo ago
You’re better off using OS/ES (OpenSearch/Elasticsearch) and using AI to help you craft the exact query to run on an index.
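A sketch of that pattern (field names and the spec shape are invented): have the model emit a small structured spec rather than a raw query string, then render it deterministically into an Elasticsearch/OpenSearch-style bool query.

```python
# Toy sketch of "AI crafts the exact query": the model produces a small
# structured spec, and deterministic code renders it into an ES/OS-style
# bool query body. All field names here are invented.

def render_query(spec):
    """Turn a model-produced spec into an ES/OS bool query body."""
    must = [{"match": {"text": spec["keywords"]}}]
    filters = []
    if "author" in spec:
        filters.append({"term": {"author": spec["author"]}})
    if "entity_type" in spec:
        filters.append({"term": {"entity_type": spec["entity_type"]}})
    return {"query": {"bool": {"must": must, "filter": filters}}}

# Pretend the LLM parsed "When did Mark say the field team hit groundwater
# swelling?" into this spec:
spec = {"keywords": "groundwater swelling issue",
        "author": "mark", "entity_type": "comment"}
print(render_query(spec))
```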
TXTOS•6mo ago
agree — I’ve used OS/ES with AI-assisted query shaping too, especially when domain vocab gets wild. the part I kept bumping into was: even with perfect-looking queries, the retrieved context still lacked semantic intent alignment.

so I started layering reasoning before retrieval — like a semantic router that decides not just what to fetch, but why that logic path even makes sense for this user prompt.

different stack, same headache. appreciate your insight though — it’s a solid route when retrieval infra is strong.
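for what it's worth, the "reasoning before retrieval" idea can be sketched as a tiny router (intent labels and routing rules invented; a real one would use a model, not keywords):

```python
# Toy sketch of routing logic before retrieval: decide the logic path
# first, and only hit the index when retrieval actually makes sense.

def route(prompt):
    """Decide the logic path before any retrieval happens."""
    p = prompt.lower()
    if any(w in p for w in ("when", "who", "where")):
        return "lookup"   # factual question: retrieval is the right tool
    if p.startswith(("explain", "why")):
        return "reason"   # conceptual: answer from context already in hand
    return "clarify"      # ambiguous: ask a follow-up instead of fetching

def handle(prompt, retrieve=lambda q: "retrieved-chunks"):
    path = route(prompt)
    if path == "lookup":
        return retrieve(prompt)   # retrieval runs only on this path
    if path == "reason":
        return "answer-from-model"
    return "ask-user-to-clarify"

print(handle("When did Mark comment on the task?"))  # lookup path
print(handle("explain that in simpler terms"))       # no retrieval at all
```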

moomoo11•6mo ago
What sort of data are you working with?

In my case, users would be searching through either custom defined data models (I have custom forms and stuff), or if they were trying to find a comment on a Task, or various other attached data on common entities.

For example, "When did Mark say that the field team ran into an issue with the groundwater swelling?"

That would return the Comment, tied to the Task.

In my system there are discussions and comments (common) tied to every data entity (and I'm using graphdb, which makes things exponentially simpler). I index all of these anyway for OS, so the AI is able to construct the query to find this. So I can go from comment -> top level entity or vice versa.
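That comment -> top-level-entity walk can be sketched with a plain adjacency map standing in for the graph DB (IDs and edges invented):

```python
# Toy sketch of walking "attached to" edges from a comment up to its
# top-level entity, with a dict in place of a real graph DB.

EDGES = {  # child -> parent relationships (all invented)
    "comment:17": "discussion:4",
    "discussion:4": "task:9",
    "task:9": "project:1",
}

def top_level_entity(node):
    """Follow parent edges until we reach a node with no parent."""
    while node in EDGES:
        node = EDGES[node]
    return node

print(top_level_entity("comment:17"))  # project:1
```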

I spent maybe 60-100 hours writing dozens, maybe a hundred, tests to get the prompt right, taking it from 95% success to 100%. In over 2 months it hasn't failed yet.

Sorry, I should mention maybe our use-cases are different. I am basically building an audit log.
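A suite like that can be as simple as golden input/expected pairs run against the question-to-query step; a toy sketch with a stub pipeline (cases and fields invented):

```python
# Toy sketch of a prompt regression suite: golden (input, expected) pairs
# run against whatever turns a user question into a structured query.
# The pipeline here is a stub; in practice it would call the model.

def question_to_query(question):
    """Stub pipeline under test: maps a question to a structured query."""
    q = question.lower()
    return {"entity": "comment" if "say" in q else "task", "text": q}

GOLDEN = [
    ("When did Mark say the pump failed?", {"entity": "comment"}),
    ("Show overdue tasks", {"entity": "task"}),
]

def run_suite():
    """Return the golden questions whose expected fields don't match."""
    failures = []
    for question, expected in GOLDEN:
        got = question_to_query(question)
        if any(got.get(k) != v for k, v in expected.items()):
            failures.append(question)
    return failures

print(run_suite())  # [] -- every golden case passes
```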

TXTOS•6mo ago
ah yeah that makes sense — sounds like you're indexing for traceability first, which honestly makes your graph setup way more stable than most RAG stacks I’ve seen.

I’m more on the side of: “why is this even the logic path the system thinks makes sense for the user’s intent?” — like, how did we get from prompt to retrieval to that hallucination?

So I stopped treating retrieval as the answer. It’s just an echo. I started routing logic first — like a pre-retrieval dialectic, if you will. No index can help you if the question shouldn’t even be a question yet.

Your setup sounds tight though — we’re just solving different headaches. I’m more in the “why did the LLM go crazy” clinic. You’re in the “make the query land” ward.

Either way, I love that you built a graph audit log that hasn’t failed in two months. That's probably more production-ready than 90% of what people call “RAG solutions” now.

moomoo11•6mo ago
Thanks :)

And really cool stuff you’re doing too. Honestly I have not spent as much time as maybe I should really diving into all the LLM tooling and stuff like you have.

Good luck!!

TXTOS•6mo ago
hey — really appreciate that. honestly I’m still duct-taping this whole system together half the time, but glad it’s useful enough to sound like “tooling”

I think the whole LLM space is still missing a core idea: that logic routing before retrieval might be more important than retrieval itself. when the LLM “hallucinates,” it’s not always because it lacked facts — sometimes it just followed a bad question.

but yeah — if any part of this helps or sparks new stuff, then we’re already winning. appreciate the good vibes, and good luck on your own build too