frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

GPT needs a truth-first toggle for technical workflows

1•PAdvisory•6mo ago
I use GPT-4 extensively for technical work: coding, debugging, modeling complex project logic. The biggest issue isn’t hallucination—it’s that the model prioritizes being helpful and polite over being accurate.

The default behavior feels like this:

Safety

Helpfulness

Tone

Truth

Consistency

In a development workflow, this is backwards. I’ve lost entire days chasing errors caused by GPT confidently guessing things it wasn’t sure about—folder structures, method syntax, async behaviors—just to “sound helpful.”

What’s needed is a toggle (UI or API) that:

Forces “I don’t know” when certainty is missing

Prevents speculative completions

Prioritizes truth over style, when safety isn’t at risk

Keeps all safety filters and tone alignment intact for other use cases

This wouldn’t affect casual users or conversational queries. It would let developers explicitly choose a mode where accuracy is more important than fluency.

This request has also been shared through OpenAI's support channels. Posting here to see if others have run into the same limitation or worked around it in a more reliable way than I have found

Comments

duxup•6mo ago
I’ve found this with many LLMs they want to give an answer, even if wrong.

Gemini on the Google search page constantly answers questions yes or no… and then the evidence it gives indicates the opposite of the answer.

I think the core issue is that in the end LLMs are just word math and they don’t “know” if they don’t “know”…. they just string words together and hope for the best.

PAdvisory•6mo ago
I went into it pretty in depth after breaking a few with severe constraints, what it seems to come down to is how the platforms themselves prioritize functions, MOST put "helpfulness" and "efficiency" ABOVE truth, which then leads the LLM to make a lot of "guesses" and "predictions". At their core pretty much ALL LLM's are made to "predict" the information in answers, but they CAN actually avoid that and remain consistent when heavily constrained. The issue is that it isn't at the core level, so we have to CONSTANTLY retrain it over and over I find
Ace__•6mo ago
I have made something that addresses this. Not ready to share it yet, but soon-ish. At the moment it only works on GPT model 4o. I tried local Q4 KM's models, on LM Studio, but complete no go.

Package Manager Design Tradeoffs

https://nesbitt.io/2025/12/05/package-manager-tradeoffs.html
1•todsacerdoti•47s ago•0 comments

Is Pixelfed sawing off the branch that the Fediverse is sitting on?

https://ploum.net/2025-12-04-pixelfed-against-fediverse.html
1•8organicbits•1m ago•0 comments

Grok 4.20 tops alpha arena trading benchmark

https://nof1.ai/leaderboard
1•knuppar•2m ago•0 comments

Show HN: I built a local-first URL redirector to stop doomscrolling

https://github.com/jordanblakey/url-redirector
1•jordan_blakey•2m ago•1 comments

Girls and boys solve math problems differently – with similar short-term results

https://theconversation.com/girls-and-boys-solve-math-problems-differently-with-similar-short-ter...
1•nradov•10m ago•1 comments

Flow Control: a programmer's text editor

https://flow-control.dev/
1•signa11•17m ago•0 comments

Coverd – gambling with your credit card purchases

https://www.coverd.us/about
1•smsm42•18m ago•0 comments

AI is mastering language. Should we trust what it says? (2022)

https://www.nytimes.com/2022/04/15/magazine/ai-language.html
2•maxutility•19m ago•0 comments

Pdsink: USB Power Delivery Sink library for embedded devices

https://github.com/pdsink/pdsink
1•zdw•27m ago•0 comments

Scientists Link Popular Sugar Substitute (Sorbitol) to Liver Disease

https://scitechdaily.com/scientists-link-popular-sugar-substitute-to-liver-disease/
2•mraniki•31m ago•0 comments

'The time has come to declare war on AI'

https://www.sfgate.com/tech/article/time-to-declare-war-ai-21221535.php
5•MilnerRoute•34m ago•0 comments

Trying VLLM Ideas on Apple Silicon with MLX (WIP)

https://github.com/waybarrios/vllm-mlx
1•waybarrios•38m ago•1 comments

AI Structural Redesign Proven on Gemini/Copilot

https://imgur.com/a/A8x18kc
1•korea_koh•38m ago•1 comments

Show HN: GitHired – Find Your Next 10x Engineer

https://www.githired.tech
4•raghavbansal11•51m ago•11 comments

Why Does A.I. Write Like That?

https://www.nytimes.com/2025/12/03/magazine/chatbot-writing-style.html
1•gmays•52m ago•0 comments

Rio de Janeiro's talipot palm trees bloom for the first and only time

https://en.jardineriaon.com/The-talipot-palm-trees-of-Rio-de-Janeiro-bloom-for-the-first-and-only...
2•1659447091•58m ago•0 comments

Show HN: TestPlanit – an open-source test case management system built for QA

https://demo.testplanit.com/en-US/signin
1•therealbrad•59m ago•0 comments

Bad Boy for Life: Sean Combs' History of Violence

https://www.rollingstone.com/music/music-features/diddy-friends-bad-boy-artists-abuse-violence-12...
1•handfuloflight•59m ago•0 comments

Platonic space: where cognitive and morphological patterns come from

https://thoughtforms.life/platonic-space-where-cognitive-and-morphological-patterns-come-from-bes...
1•andsoitis•1h ago•0 comments

Ask HN: What do you usually do while waiting for AI responses?

2•alfred_chang•1h ago•4 comments

Control-Alt-Delete of a Life – By Steven K Roberts

https://nomadicresearchlabs.substack.com/p/control-alt-delete-of-a-life
1•pkaeding•1h ago•0 comments

Japan protests after Chinese military aircraft locks its radar on Japanese jets

https://apnews.com/article/japan-china-military-fighter-jets-pacific-25017ddbec3afd6bf9e6da4b8516...
3•c420•1h ago•0 comments

Discovering the Indieweb with Calm Tech

https://alexsci.com/blog/calm-tech-discover/
4•todsacerdoti•1h ago•0 comments

Deconstructing Dollar Dynamics: A State-Dependent,Non-Linear Integrated Equation

https://zenodo.org/records/17790383
1•truongthanminh•1h ago•1 comments

Apple Is Experiencing Its Biggest Leadership Exodus

https://fortune.com/2025/12/05/apple-executive-leadership-exodus-biggest-shakeup-since-steve-jobs...
4•ent101•1h ago•0 comments

Anatomy of a Domain Risk Engine: Regex vs. LLMs

https://www.urlert.com/blog/anatomy-domain-risk-engine-regex-llm
1•tomerhe•1h ago•1 comments

SC sheriff's office quoted me $9k for a simple Flock records request

https://columbiamuckraker.substack.com/p/sc-sheriffs-office-quoted-me-9000
4•sc_muckraker•1h ago•2 comments

Z2 – Lithographically fabricated IC in a garage fab

https://sam.zeloof.xyz/second-ic/
35•embedding-shape•1h ago•3 comments

AstoCAD – Polished, paid "soft-fork" of FreeCAD with upstream contributions

https://www.astocad.com/
1•embedding-shape•1h ago•0 comments

Why Does Have I Been Pwned Contain "Fake" Email Addresses?

https://www.troyhunt.com/why-does-have-i-been-pwned-contain-fake-email-addresses/
3•LorenDB•2h ago•0 comments