LLM Hallucinations in Practical Code Generation

https://dl.acm.org/doi/10.1145/3728894

44•appwiz•2d ago

Comments

th0ma5•3h ago

I thought this was a very good read about many of the issues that arise when there is no ground truth to reason against. It is interesting how many different ways people have developed to work around missing information, and the marginal improvements these make in some benchmarks.

imiric•2h ago

This is great. We need more research into solving this fundamental problem, yet AI companies prefer to chase benchmarks and pump out value-added products.

The RAG-based mitigation is interesting, but quite limited, as mentioned. It only works if the user can provide ground truth data, which is relatively straightforward for code generation but much more difficult for most other factual information. We can't directly rely on data from the web, since the sources need to be carefully reviewed by a human first, and that is a labor-intensive task requiring human domain experts.

So this approach seems like a band-aid, and wouldn't be generally applicable. I'm not in the AI industry, but from the perspective of a user it seems that the hallucination problem requires a much more foundational solution.

dheera•2h ago

On the other hand, I think it's cute when LLMs hallucinate a Python library that doesn't exist, because it probably means it's worth bringing into existence.
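
For anyone who wants to catch this case before running generated code, here is a minimal sketch, not anything from the paper. It parses the generated source and reports top-level imports that cannot be found in the local environment. The function name `unresolved_imports` and the sample module name are made up for illustration, and a missing module may simply be uninstalled rather than hallucinated.

```python
import ast
import importlib.util


def unresolved_imports(source: str) -> list[str]:
    """Return top-level module names imported by `source` that are not found locally."""
    tree = ast.parse(source)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> check the top-level package "a"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Skip relative imports (level > 0); they depend on the package layout
            names.add(node.module.split(".")[0])
    # find_spec returns None when no importable module matches the name
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


# Hypothetical generated snippet; "totally_made_up_lib_zz9" is an invented name
generated = "import json\nimport totally_made_up_lib_zz9\nfrom os import path\n"
print(unresolved_imports(generated))
```

This only checks the local environment. It says nothing about whether a package of that name exists on a package index, or whether the functions called on a real module exist.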

nerdjon•1h ago

My favorite is trying to use it to generate an IAM policy: the keys are just hallucinated based on expectations of what they would be called, and they are either wrong or flat out don't exist if you are dealing with more advanced conditions.
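
One way to guard against this, assuming "keys" here means condition keys, is to compare the keys in a generated policy against a list you trust. The sketch below is only an illustration: `KNOWN_CONDITION_KEYS` is a tiny hand-written allowlist, `unknown_condition_keys` is a hypothetical helper, and `s3:ObjectOwnerTeam` is an invented key standing in for a hallucinated one.

```python
import json

# Assumption: a small hand-maintained allowlist for illustration only.
# A real check would load the keys documented for the services in use.
KNOWN_CONDITION_KEYS = {
    "aws:SourceIp",
    "aws:SecureTransport",
    "aws:MultiFactorAuthPresent",
    "s3:prefix",
}


def unknown_condition_keys(policy_json: str, known=KNOWN_CONDITION_KEYS) -> list[str]:
    """Return condition keys used in the policy that are not in the allowlist."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list
        statements = [statements]
    found = set()
    for stmt in statements:
        # Condition maps operator -> {condition key -> value(s)}
        for operator_block in stmt.get("Condition", {}).values():
            found.update(operator_block.keys())
    return sorted(key for key in found if key not in known)


generated_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {
            "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
            "StringEquals": {"s3:ObjectOwnerTeam": "platform"},  # invented key
        },
    }],
})
print(unknown_condition_keys(generated_policy))
```

The check is only as good as the allowlist, so a flagged key means "review this", not "this is definitely wrong".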

QEMU: Define policy forbidding use of AI code generators

https://github.com/qemu/qemu/commit/3d40db0efc22520fa6c399cf73960dced423b048

142•todsacerdoti•1h ago•77 comments

A new pyramid-like shape always lands the same side up

https://www.quantamagazine.org/a-new-pyramid-like-shape-always-lands-the-same-side-up-20250625/

241•robinhouston•5h ago•64 comments

The Hollow Men of Hims

https://www.alexkesin.com/p/the-hollow-men-of-hims

79•quadrin•2h ago•42 comments

-2000 Lines of code

https://www.folklore.org/Negative_2000_Lines_Of_Code.html

179•xeonmc•5h ago•65 comments

Gemini CLI

https://blog.google/technology/developers/introducing-gemini-cli-open-source-ai-agent/

926•sync•12h ago•525 comments

A new PNG spec

https://www.programmax.net/articles/png-is-back/

463•bluedel•1d ago•456 comments

What Problems to Solve (1966)

http://genius.cat-v.org/richard-feynman/writtings/letters/problems

297•jxmorris12•8h ago•37 comments

OpenAI charges by the minute, so speed up your audio

https://george.mand.is/2025/06/openai-charges-by-the-minute-so-make-the-minutes-shorter/

431•georgemandis•11h ago•128 comments

The Offline Club

https://www.theoffline-club.com

77•esher•5h ago•33 comments

Getting ready to issue IP address certificates

https://community.letsencrypt.org/t/getting-ready-to-issue-ip-address-certificates/238777

208•Bogdanp•8h ago•118 comments

Build and Host AI-Powered Apps with Claude – No Deployment Needed

https://www.anthropic.com/news/claude-powered-artifacts

165•davidbarker•8h ago•66 comments

Writing a basic Linux device driver when you know nothing about Linux drivers

https://crescentro.se/posts/writing-drivers/

156•sbt567•3d ago•13 comments

Better Auth, by a self-taught Ethiopian dev, raises $5M from Peak XV, YC

https://techcrunch.com/2025/06/25/this-self-taught-ethiopian-dev-built-an-authentication-tool-and-got-into-yc/

53•bundie•7h ago•30 comments

Libxml2's "no security embargoes" policy

https://lwn.net/SubscriberLink/1025971/73f269ad3695186d/

121•jwilk•5h ago•80 comments

Ambient Garden

https://ambient.garden

23•fipar•2d ago•2 comments

Earth's largest camera: 3B pixel images

https://www.nytimes.com/interactive/2025/06/19/science/rubin-observatory-camera.html

7•wglb•3d ago•3 comments

LM Studio is now an MCP Host

https://lmstudio.ai/blog/lmstudio-v0.3.17

145•yags•7h ago•57 comments

Deep Research as a Swim Coach

https://suthakamal.substack.com/p/swimming-with-an-ai-coach

21•suthakamal•2d ago•3 comments

Iroh: A library to establish direct connection between peers

https://github.com/n0-computer/iroh

139•gasull•8h ago•37 comments

IBM's Dmitry Krotov wants to crack the 'physics' of memory

https://research.ibm.com/blog/dmitry-krotov-ai-physics

14•bookofjoe•2h ago•1 comment

America’s incarceration rate is in decline

https://www.theatlantic.com/ideas/archive/2025/06/prisoner-populations-are-plummeting/683310/

80•paulpauper•8h ago•161 comments

CUDA Ray Tracing 2x Faster Than RTX: My CUDA Ray Tracing Journey

https://karimsayedre.github.io/RTIOW.html

24•ibobev•3h ago•2 comments

FurtherAI (YC W24) Is Hiring for Software and AI Roles

https://www.ycombinator.com/companies/furtherai/jobs

1•sgondala_ycapp•8h ago

Web Embeddable Common Lisp

https://turtleware.eu/static/paste/wecl-test-gl/main.html

98•todsacerdoti•9h ago•33 comments

Building a Monostable Tetrahedron

https://arxiv.org/abs/2506.19244

31•robinhouston•5h ago•2 comments

Interstellar Flight: Perspectives and Patience

https://www.centauri-dreams.org/2025/06/25/interstellar-flight-perspectives-and-patience/

54•JPLeRouzic•8h ago•87 comments

Games run faster on SteamOS than Windows 11, Ars testing finds

https://arstechnica.com/gaming/2025/06/games-run-faster-on-steamos-than-windows-11-ars-testing-finds/

185•_JamesA_•5h ago•56 comments

Bot or human? Creating an invisible Turing test for the internet

https://research.roundtable.ai/proof-of-human/

90•timshell•10h ago•120 comments

Is Lovable getting monetization wrong?

https://getlago.substack.com/p/lovable-makes-60m-in-6-monthsbut

100•FinnLobsien•11h ago•62 comments