frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

GPT needs a truth-first toggle for technical workflows

1•PAdvisory•1y ago
I use GPT-4 extensively for technical work: coding, debugging, modeling complex project logic. The biggest issue isn’t hallucination—it’s that the model prioritizes being helpful and polite over being accurate.

The default behavior feels like this:

Safety

Helpfulness

Tone

Truth

Consistency

In a development workflow, this is backwards. I’ve lost entire days chasing errors caused by GPT confidently guessing things it wasn’t sure about—folder structures, method syntax, async behaviors—just to “sound helpful.”

What’s needed is a toggle (UI or API) that:

Forces “I don’t know” when certainty is missing

Prevents speculative completions

Prioritizes truth over style, when safety isn’t at risk

Keeps all safety filters and tone alignment intact for other use cases

This wouldn’t affect casual users or conversational queries. It would let developers explicitly choose a mode where accuracy is more important than fluency.

This request has also been shared through OpenAI's support channels. Posting here to see if others have run into the same limitation or worked around it in a more reliable way than I have found

Comments

duxup•1y ago
I’ve found this with many LLMs they want to give an answer, even if wrong.

Gemini on the Google search page constantly answers questions yes or no… and then the evidence it gives indicates the opposite of the answer.

I think the core issue is that in the end LLMs are just word math and they don’t “know” if they don’t “know”…. they just string words together and hope for the best.

PAdvisory•1y ago
I went into it pretty in depth after breaking a few with severe constraints, what it seems to come down to is how the platforms themselves prioritize functions, MOST put "helpfulness" and "efficiency" ABOVE truth, which then leads the LLM to make a lot of "guesses" and "predictions". At their core pretty much ALL LLM's are made to "predict" the information in answers, but they CAN actually avoid that and remain consistent when heavily constrained. The issue is that it isn't at the core level, so we have to CONSTANTLY retrain it over and over I find
Ace__•12mo ago
I have made something that addresses this. Not ready to share it yet, but soon-ish. At the moment it only works on GPT model 4o. I tried local Q4 KM's models, on LM Studio, but complete no go.

Publishing's Latest Piracy Problem: Audiobooks on YouTube

https://www.nytimes.com/2026/05/21/books/audiobook-piracy-youtube.html
1•thm•1m ago•0 comments

Judge considers ordering Meta to revamp its apps

https://www.politico.com/news/2026/05/22/meta-judge-trial-public-nuisance-facebook-00934485
1•1vuio0pswjnm7•1m ago•0 comments

Crypto industry braces for quantum computing threat

https://www.ft.com/content/99c1c1e7-1a1c-479c-9fc8-e21aea5c3f0e
2•thm•3m ago•0 comments

Waymo suspends all freeway rides over safety

https://www.latimes.com/business/story/2026-05-22/waymo-suspends-all-freeway-rides-over-safety
1•1vuio0pswjnm7•4m ago•0 comments

Starbucks scraps AI inventory tool after nine months

https://qz.com/starbucks-scraps-ai-inventory-tool-nomadgo-052226
1•thunderbong•4m ago•0 comments

In India, You Can Get Milk Delivered Faster Than It Takes to Make Coffee

https://www.wsj.com/business/logistics/in-india-you-can-get-milk-delivered-faster-than-it-takes-t...
1•JumpCrisscross•5m ago•0 comments

Google appeals search monopoly ruling, says it won business 'fair and square'

https://www.theverge.com/policy/936175/google-search-monopoly-ruling-appeal
1•thm•5m ago•0 comments

Cannes Film Cost $500k to Make. $400k Was AI Compute Costs

https://www.wsj.com/cio-journal/this-cannes-film-cost-500-000-to-make-400-000-was-ai-compute-cost...
2•JumpCrisscross•7m ago•0 comments

How to Be a Real Elite Programmer

https://skorks.com/2010/05/how-to-be-a-real-elite-programmer-and-make-sure-everybody-knows-it/
2•mahirsaid•10m ago•0 comments

The Fonts of the U.S. Federal Courts

https://daringfireball.net/2026/05/the_fonts_of_the_us_federal_courts
1•Tomte•12m ago•0 comments

Microsoft's new multi-model agentic security system tops leading benchmark

https://www.microsoft.com/en-us/security/blog/2026/05/12/defense-at-ai-speed-microsofts-new-multi...
2•uniclaude•19m ago•0 comments

Lisa's Copy (and Cut, and Paste)

https://unsung.aresluna.org/lisas-copy-and-cut-and-paste/
1•zdw•23m ago•0 comments

Is U.S. AI Adoption Plateauing? A Comprehensive Analysis

https://medium.com/@markchen69/is-u-s-ai-adoption-plateauing-a-comprehensive-analysis-cf5c1beef8cf
1•mgh2•26m ago•0 comments

Introducing BDD (2006)

https://dannorth.net/blog/introducing-bdd/
1•locknitpicker•27m ago•0 comments

Computing the billionth prime in 1s with LLVM IR

https://github.com/SheafificationOfG/QueenJewels
1•Murfalo•29m ago•1 comments

Is AI Becoming Too Smart for Its Own Good? [audio]

https://rss.com/podcasts/nuclecast-podcast/2811653/
1•apolloartemis•30m ago•1 comments

Gelatine Sculpt: Can This "Gelatin Trick" Transform Your Fat?

https://finance.yahoo.com/sectors/healthcare/articles/gelatine-sculpt-exploding-2026-viral-142500...
2•sarkpauz•32m ago•1 comments

Huawei's new stacking tech for high-capacity SSDs

https://www.blocksandfiles.com/flash/2026/05/21/huaweis-new-stacking-tech-for-high-capacity-ssds/...
1•yogthos•34m ago•0 comments

Engineers vs. Psychiatrists (C. P. Snow)

https://unintendedconsequenc.es/engineers-vs-psychiatrists-c-p-snow/
3•paulorlando•38m ago•0 comments

The Mythical App Store Reviewer Month

https://lapcatsoftware.com/articles/2026/5/4.html
1•zdw•42m ago•0 comments

Ask HN: How to increase depth instead of breadth as 10 yoe as swe?

3•Cheesebh•44m ago•0 comments

Tell HN: Stop building software for people, build it for agents instead

1•keepamovin•45m ago•4 comments

Supply Chain Attack Targets Laravel-Lang Packages with Credential Stealer

https://www.aikido.dev/blog/supply-chain-attack-targets-laravel-lang-packages-with-credential-ste...
2•nullbio•50m ago•1 comments

Deepsec: The security harness for finding vulnerabilities in your codebase

https://vercel.com/blog/introducing-deepsec-find-and-fix-vulnerabilities-in-your-code-base
1•882542F3884314B•55m ago•0 comments

Perplexity Bumblebee: Read-Only Tool for Dev Supply Chain Checks on macOS/Linux

https://github.com/perplexityai/bumblebee
2•882542F3884314B•58m ago•0 comments

Show HN: ThinkLLM, A knowledge graph of AI models (HTTPS://thinkllm.dev)

https://thinkllm.dev
1•gkanellopoulos•59m ago•0 comments

All Model Labs Are Now Agent Labs

https://www.latent.space/p/ainews-all-model-labs-are-now-agent
1•swyx•1h ago•0 comments

Zero-dependency CLI that converts LinkedIn exports into Markdown for LLMs

https://linkedin2md.daza.ar/
1•juanmanueldaza•1h ago•0 comments

Show HN: Waiting for AI Grand Prix racing SIM? Me too So I made one

https://github.com/elodin-sys/ai-grand-prix
7•danAtElodin•1h ago•0 comments

The SpaceX IPO filing is filled with AI bets, Starship dreams

https://techcrunch.com/2026/05/20/the-spacex-ipo-filing-ai-bets-starship-dreams-elon-musk/
1•dotcoma•1h ago•0 comments