
AI hallucinates. Do you ever double-check the output?

6•jackota•1h ago
I've been building AI workflows, and they randomly hallucinate and do something stupid, so I end up manually checking everything anyway before approving the AI-generated content (messages, emails, invoices, etc.), which defeats the whole point.

Anyone else? How did you manage it?

Comments

AlexeyBrin•1h ago
You can't be 100% sure the AI won't hallucinate. If you don't want to check everything manually, you can have a different AI check the output and, if it finds something suspect, flag it for a human to verify. Even better, have two different AIs check the output and flag it if they disagree.
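A minimal sketch of the two-checker idea above, in Python. The checkers are hypothetical callables standing in for calls to two different models; nothing here is a real LLM API. The output is flagged for a human unless both checkers approve it, which covers both "one checker finds something suspect" and "the checkers disagree".

```python
from typing import Callable

# Hypothetical checker type: takes the generated text and returns True
# when the checker considers it trustworthy. In practice each checker
# would wrap a call to a different model (an assumption, not shown here).
Checker = Callable[[str], bool]


def needs_human_review(output: str, checker_a: Checker, checker_b: Checker) -> bool:
    """Flag the output unless both independent checkers approve it."""
    approved_a = checker_a(output)
    approved_b = checker_b(output)
    # Any disagreement, or agreement that the output is suspect, goes to a human.
    return not (approved_a and approved_b)


# Stand-in checkers for illustration only; real ones would query a model.
def has_no_placeholder(text: str) -> bool:
    return "[TODO]" not in text


def is_short_enough(text: str) -> bool:
    return len(text) <= 200


if __name__ == "__main__":
    draft = "Hi Sam, your invoice [TODO] is attached."
    print(needs_human_review(draft, has_no_placeholder, is_short_enough))
```

This only narrows what a human has to look at. Two checkers can still agree on a wrong answer, so the comment's first sentence still applies.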
Gioppix•1h ago
I also don't trust LLMs, but I still find automations useful. Even with human-in-the-loop they save a bunch of time. Clicking "Approve & Send" is much quicker than manually writing out the email, and I just rewrite the 5% that contains hallucinations.
Zigurd•1h ago
You have put your finger on why agent-assisted coding often doesn't suck, and other use cases of LLMs often do suck. Lint and the compiler get their licks in before you even smoke test the code. There aren't two layers of deterministic, algorithmic checking for your emails or invoices.

So before anyone concludes that coding agents prove that AI can be useful, find some use cases with similar characteristics.

7777777phil•1h ago
I have been building research automation with LangGraph for the past 2 months. We always put a human-in-the-loop checkpoint after each critical step. It might be annoying now, but I think it will save us long-term.
codingdave•51m ago
Ever? More like always. Keeping humans in the loop is the current best practice. If you truly need to automate something that cannot afford a human checkpoint, find a deterministic solution for it, not LLMs.
varshith17•14m ago
Build validation layers, not trust. For structured outputs (invoices, emails), use JSON schemas + fact-checking prompts where a second AI call verifies critical fields against source data before you see it. Real pattern: AI generates → automated validation catches type/format errors → second LLM does adversarial review ("check for hallucinated numbers/dates") → you review only flagged items + random samples. Turns "check everything" into "check exceptions," cuts review time 80%.
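The pipeline described above could be sketched roughly as follows, in Python with only the standard library. Everything named here is an assumption for illustration: the invoice fields, the `format_errors` check (a hand-rolled stand-in for a JSON schema), and `mismatched_fields`, which is a deterministic stand-in for the second LLM's adversarial review against source data. Only flagged items plus a random sample reach the human.

```python
import random
from datetime import date
from typing import Callable, Optional

# Hypothetical invoice shape; a real pipeline would likely use a JSON schema.
REQUIRED_FIELDS = {"invoice_id": str, "amount": float, "due_date": str}


def format_errors(doc: dict) -> list[str]:
    """Deterministic type/format validation of the generated document."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in doc:
            errors.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            errors.append(f"wrong type for {name}")
    if isinstance(doc.get("due_date"), str):
        try:
            date.fromisoformat(doc["due_date"])
        except ValueError:
            errors.append("due_date is not an ISO date")
    return errors


def mismatched_fields(doc: dict, source: dict) -> list[str]:
    """Stand-in for the second LLM call: compare critical fields to source data."""
    return [key for key in ("amount", "due_date") if doc.get(key) != source.get(key)]


def select_for_review(
    docs: list[dict],
    sources: list[dict],
    reviewer: Callable[[dict, dict], list[str]] = mismatched_fields,
    sample_rate: float = 0.1,
    rng: Optional[random.Random] = None,
) -> list[tuple[int, list[str]]]:
    """Return (index, reasons) for every document a human should look at."""
    rng = rng or random.Random()
    queue = []
    for index, (doc, source) in enumerate(zip(docs, sources)):
        problems = format_errors(doc) + reviewer(doc, source)
        if problems:
            queue.append((index, problems))
        elif rng.random() < sample_rate:
            # Spot-check a fraction of items that passed every automated check.
            queue.append((index, ["random sample"]))
    return queue
```

The random sample matters because both automated layers can miss the same error; sampling clean items is the only way to notice that the checks themselves have drifted.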
exabrial•7m ago
The new guys on my team do not check it. They already had problems checking their own work; AI is just amplifying the actual human problem.
