
Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

https://arxiv.org/abs/2601.14340
1•PaulHoule•25s ago•0 comments

Show HN: AI Agent Tool That Keeps You in the Loop

https://github.com/dshearer/misatay
1•dshearer•1m ago•0 comments

Why Every R Package Wrapping External Tools Needs a Sitrep() Function

https://drmowinckels.io/blog/2026/sitrep-functions/
1•todsacerdoti•2m ago•0 comments

Achieving Ultra-Fast AI Chat Widgets

https://www.cjroth.com/blog/2026-02-06-chat-widgets
1•thoughtfulchris•3m ago•0 comments

Show HN: Runtime Fence – Kill switch for AI agents

https://github.com/RunTimeAdmin/ai-agent-killswitch
1•ccie14019•6m ago•1 comments

Researchers surprised by the brain benefits of cannabis usage in adults over 40

https://nypost.com/2026/02/07/health/cannabis-may-benefit-aging-brains-study-finds/
1•SirLJ•8m ago•0 comments

Peter Thiel warns the Antichrist, apocalypse linked to the 'end of modernity'

https://fortune.com/2026/02/04/peter-thiel-antichrist-greta-thunberg-end-of-modernity-billionaires/
1•randycupertino•8m ago•2 comments

USS Preble Used Helios Laser to Zap Four Drones in Expanding Testing

https://www.twz.com/sea/uss-preble-used-helios-laser-to-zap-four-drones-in-expanding-testing
2•breve•14m ago•0 comments

Show HN: Animated beach scene, made with CSS

https://ahmed-machine.github.io/beach-scene/
1•ahmedoo•14m ago•0 comments

An update on unredacting select Epstein files – DBC12.pdf liberated

https://neosmart.net/blog/efta00400459-has-been-cracked-dbc12-pdf-liberated/
1•ks2048•14m ago•0 comments

Was going to share my work

1•hiddenarchitect•18m ago•0 comments

Pitchfork: A devilishly good process manager for developers

https://pitchfork.jdx.dev/
1•ahamez•18m ago•0 comments

You Are Here

https://brooker.co.za/blog/2026/02/07/you-are-here.html
3•mltvc•22m ago•0 comments

Why social apps need to become proactive, not reactive

https://www.heyflare.app/blog/from-reactive-to-proactive-how-ai-agents-will-reshape-social-apps
1•JoanMDuarte•23m ago•1 comments

How patient are AI scrapers, anyway? – Random Thoughts

https://lars.ingebrigtsen.no/2026/02/07/how-patient-are-ai-scrapers-anyway/
1•samtrack2019•23m ago•0 comments

Vouch: A contributor trust management system

https://github.com/mitchellh/vouch
2•SchwKatze•23m ago•0 comments

I built a terminal monitoring app and custom firmware for a clock with Claude

https://duggan.ie/posts/i-built-a-terminal-monitoring-app-and-custom-firmware-for-a-desktop-clock...
1•duggan•24m ago•0 comments

Tiny C Compiler

https://bellard.org/tcc/
1•guerrilla•26m ago•0 comments

Y Combinator Founder Organizes 'March for Billionaires'

https://mlq.ai/news/ai-startup-founder-organizes-march-for-billionaires-protest-against-californi...
1•hidden80•26m ago•2 comments

Ask HN: Need feedback on the idea I'm working on

1•Yogender78•27m ago•0 comments

OpenClaw Addresses Security Risks

https://thebiggish.com/news/openclaw-s-security-flaws-expose-enterprise-risk-22-of-deployments-un...
2•vedantnair•27m ago•0 comments

Apple finalizes Gemini / Siri deal

https://www.engadget.com/ai/apple-reportedly-plans-to-reveal-its-gemini-powered-siri-in-february-...
1•vedantnair•28m ago•0 comments

Italy Railways Sabotaged

https://www.bbc.co.uk/news/articles/czr4rx04xjpo
6•vedantnair•28m ago•2 comments

Emacs-tramp-RPC: high-performance TRAMP back end using MsgPack-RPC

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•fanf2•30m ago•0 comments

Nintendo Wii Themed Portfolio

https://akiraux.vercel.app/
2•s4074433•34m ago•2 comments

"There must be something like the opposite of suicide"

https://post.substack.com/p/there-must-be-something-like-the
1•rbanffy•36m ago•0 comments

Ask HN: Why doesn't Netflix add a “Theater Mode” that recreates the worst parts?

2•amichail•37m ago•0 comments

Show HN: Engineering Perception with Combinatorial Memetics

1•alan_sass•43m ago•2 comments

Show HN: Steam Daily – A Wordle-like daily puzzle game for Steam fans

https://steamdaily.xyz
1•itshellboy•45m ago•0 comments

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
2•spenvo•45m ago•0 comments

AIs hallucinate. Do you ever double-check the output?

8•jackota•2w ago
Been building AI workflows, and they randomly hallucinate and do something stupid, so I end up manually checking everything anyway before approving the AI-generated content (messages, emails, invoices, etc.), which defeats the whole point.

Anyone else? How did you manage it?

Comments

AlexeyBrin•2w ago
You can't be 100% sure the AI won't hallucinate. If you don't want to check it manually, you can have a different AI check it and, if it finds something suspect, flag it for a human to verify. Even better, have two different AIs check the output and flag it if they disagree.
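A minimal sketch of that flagging rule, assuming each checker is just a callable that returns True when the draft looks clean (the model wiring is hypothetical, not any particular provider's API):

```python
def needs_human_review(draft, checkers):
    """Escalate to a human unless every independent checker passes the draft.

    `checkers` is an iterable of callables (draft -> bool, True = clean).
    If any checker flags the draft, or two checkers disagree, at least one
    verdict is False, so the draft is routed to a person.
    """
    verdicts = [check(draft) for check in checkers]
    return not all(verdicts)
```

In practice each callable would wrap a call to a different model or provider, so the checkers fail as independently as possible.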
Gioppix•2w ago
I also don't trust LLMs, but I still find automations useful. Even with human-in-the-loop they save a bunch of time. Clicking "Approve & Send" is much quicker than manually writing out the email, and I just rewrite the 5% that contains hallucinations.
andrei_says_•2w ago
Why not just write the 5% that contains meaningful communication?

Spare the recipients from reading generated filler / slop?

Giosco•1w ago
I meant 5% of the emails, not 5% of the email content. Agree with you that most AI-generated content is 100% slop; however, you can prompt-engineer until it produces meaningful messages.
Zigurd•2w ago
You have put your finger on why agent-assisted coding often doesn't suck, and other use cases of LLMs often do suck. Lint and the compiler get their licks in before you even smoke-test the code. There aren't two layers of deterministic, algorithmic checking for your emails or invoices.

So before anyone concludes that coding agents prove that AI can be useful, find some use cases with similar characteristics.

7777777phil•2w ago
I have been building research automation with LangGraph for the past 2 months. We always put a human-in-the-loop checkpoint after each critical step; it might be annoying now, but I think it will save us long-term.
Gioppix•1w ago
how have you implemented that?
7777777phil•1w ago
Check LangGraph HIL Doc: https://docs.langchain.com/oss/python/langchain/human-in-the...

the implementation we are building is open source: https://github.com/giatenica/gia-agentic-short-v2
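Framework aside, the "checkpoint after each critical step" pattern is small enough to sketch in plain Python (the names here are illustrative, not LangGraph's API):

```python
def run_with_checkpoints(steps, review, payload):
    """Run each critical step, pausing for human sign-off in between.

    `steps`  : list of (name, fn) pairs, each fn transforming the payload.
    `review` : callable (name, payload) -> approved (possibly edited)
               payload, or None to abort the workflow.
    """
    for name, step in steps:
        payload = step(payload)          # the AI does its work
        payload = review(name, payload)  # a human inspects the result
        if payload is None:
            raise RuntimeError(f"workflow aborted after step {name!r}")
    return payload
```

Real frameworks add persistence so a paused run can resume later, but the control flow is essentially this.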

codingdave•2w ago
Ever? More like always. Keeping humans in the loop is the current best practice. If you truly need to automate something that cannot afford a human checkpoint, find a deterministic solution for it, not LLMs.
Gioppix•1w ago
what's your workflow for the human review?
varshith17•2w ago
Build validation layers, not trust. For structured outputs (invoices, emails), use JSON schemas plus fact-checking prompts, where a second AI call verifies critical fields against source data before you see it. The real pattern: AI generates → automated validation catches type/format errors → a second LLM does adversarial review ("check for hallucinated numbers/dates") → you review only flagged items plus random samples. That turns "check everything" into "check exceptions" and cuts review time by roughly 80%.
casualscience•2w ago
Also lets 50% of errors through
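The deterministic layer of that generate → validate → flag flow can be made concrete. A sketch, with made-up field names and formats (the random sample is there precisely because, as the reply notes, automated layers let errors through):

```python
import random
import re


def validate_invoice(inv: dict, source: dict) -> list[str]:
    """Deterministic checks an LLM-drafted invoice must pass before review.

    Returns a list of problems; empty means it cleared the automated layer.
    """
    problems = []
    if not re.fullmatch(r"INV-\d{6}", inv.get("number", "")):
        problems.append("invoice number malformed")
    if inv.get("total") != sum(item["amount"] for item in inv.get("items", [])):
        problems.append("line items do not reconcile with total")
    if inv.get("customer") not in source["known_customers"]:
        problems.append("customer not found in source data")
    return problems


def review_queue(invoices, source, sample_rate=0.1, rng=random):
    """'Check exceptions, not everything': flagged items plus a random sample."""
    return [inv for inv in invoices
            if validate_invoice(inv, source) or rng.random() < sample_rate]
```

The adversarial second-LLM pass would sit between these checks and the human, narrowing the queue further.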
exabrial•2w ago
The new guys on my team do not check it. They already had problems checking their work, AI is just amplifying the actual human problem.
19arjun89•2w ago
At this point, we are not there yet in terms of letting AI make business-critical decisions based on its own outputs. It's meant to serve as a decision-support system rather than a decision maker.

To minimize hallucinations, yes, AI should be set up for deterministic behaviour where the use case calls for it (in recruiting, for example, it should produce the same evaluation for the same candidate every time). Secondly, having another AI check for hallucinations can be a good starting point; assigning scores and penalizing the first AI can also lead to more grounded responses.

aavci•2w ago
In my opinion, the way this will play out is with a significant amount of validation and human oversight to fully utilize these LLMs. As you mentioned, I recommend giving the AI room for error and improving the experience of manually checking everything. Maybe create a tool to facilitate manually checking the output?

This is a valuable read: https://www.ufried.com/blog/ironies_of_ai_1/

Gioppix•1w ago
what are your current go-to tools to speed up the review?
prepend•2w ago
Yes, of course I review everything.

I treat it like hiring a consultant. They do a lot of work, but I still review the output before making a decision or passing it on.

Sending something with errors to my boss or peers makes me look stupid. Saying it was caused by unrevised AI makes me look stupider.

Gioppix•1w ago
how did you implement human in the loop?
wormpilled•2w ago
> which defeats the whole point.

Not at all

jackfranklyn•2w ago
The validation layer point is key. Where things actually work is when you can define what 'correct' looks like - invoice numbers either exist or don't, amounts either reconcile against known data or they don't, email addresses either parse or fail.

The trap is when correctness is subjective. Tone, phrasing, whether something 'sounds right' - no automated check helps there, so you're back to reviewing everything.

For structured data like invoices, I've found pattern-matching against known values beats LLMs anyway. Less hallucination risk, faster, and when it fails at least it fails obviously rather than confidently wrong.
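"Fails obviously rather than confidently wrong" is the key property. A strict parser, for instance, raises on anything it cannot prove is an amount instead of coercing it the way an LLM extractor might (the currency format is chosen arbitrarily for illustration):

```python
import re

# Accepts e.g. "$1,234.56": comma-grouped dollars, exactly two cent digits.
AMOUNT = re.compile(r"\$(\d{1,3}(,\d{3})*)\.(\d{2})")


def parse_amount(text: str) -> int:
    """Return the amount in cents, or raise instead of guessing.

    A generative extractor would 'helpfully' coerce malformed input;
    this fails loudly, so bad data never reaches the invoice silently.
    """
    m = AMOUNT.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"unparseable amount: {text!r}")
    dollars = int(m.group(1).replace(",", ""))
    return dollars * 100 + int(m.group(3))
```

The failure mode is the design choice: a raised exception is visible in logs and tests, whereas a confidently wrong number is not.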

Xorakios•2w ago
FWIW, I utilize Perplexity a lot, and Gemini occasionally, for what we old geezers call spitballing.

Part of the reason I like Perplexity is because of the embedded references, and I always, always, double check the sources and holler at the Perp AI when it is clearly confabulating or misinterpreting. Still gives me insights and is useful, but trust-but-verify isn't just about arms control ;)

muzani•2w ago
At this point, everyone does. It's a habit like checking where the link goes before clicking it.