frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

GPT needs a truth-first toggle for technical workflows

1•PAdvisory•6mo ago
I use GPT-4 extensively for technical work: coding, debugging, modeling complex project logic. The biggest issue isn’t hallucination—it’s that the model prioritizes being helpful and polite over being accurate.

The default behavior feels like this:

Safety

Helpfulness

Tone

Truth

Consistency

In a development workflow, this is backwards. I’ve lost entire days chasing errors caused by GPT confidently guessing things it wasn’t sure about—folder structures, method syntax, async behaviors—just to “sound helpful.”

What’s needed is a toggle (UI or API) that:

Forces “I don’t know” when certainty is missing

Prevents speculative completions

Prioritizes truth over style, when safety isn’t at risk

Keeps all safety filters and tone alignment intact for other use cases

This wouldn’t affect casual users or conversational queries. It would let developers explicitly choose a mode where accuracy is more important than fluency.

This request has also been shared through OpenAI's support channels. Posting here to see if others have run into the same limitation or worked around it in a more reliable way than I have found

Comments

duxup•6mo ago
I’ve found this with many LLMs they want to give an answer, even if wrong.

Gemini on the Google search page constantly answers questions yes or no… and then the evidence it gives indicates the opposite of the answer.

I think the core issue is that in the end LLMs are just word math and they don’t “know” if they don’t “know”…. they just string words together and hope for the best.

PAdvisory•6mo ago
I went into it pretty in depth after breaking a few with severe constraints, what it seems to come down to is how the platforms themselves prioritize functions, MOST put "helpfulness" and "efficiency" ABOVE truth, which then leads the LLM to make a lot of "guesses" and "predictions". At their core pretty much ALL LLM's are made to "predict" the information in answers, but they CAN actually avoid that and remain consistent when heavily constrained. The issue is that it isn't at the core level, so we have to CONSTANTLY retrain it over and over I find
Ace__•6mo ago
I have made something that addresses this. Not ready to share it yet, but soon-ish. At the moment it only works on GPT model 4o. I tried local Q4 KM's models, on LM Studio, but complete no go.

Decenterlized Login System for a Media Server Project?

1•vxsz•2m ago•0 comments

The Era of Visual Studio Code

https://blog.robenkleene.com/2020/09/21/the-era-of-visual-studio-code/
1•handfuloflight•2m ago•0 comments

Workday project at Washington University hits $266M

https://www.theregister.com/2025/12/12/washington_university_workday_costs_revealed/
2•sebastian_z•11m ago•0 comments

Myna v2.0: contextual variants, more weights (and even supports APL)

https://github.com/sayyadirfanali/Myna/releases/tag/v2.0.0
2•todsacerdoti•13m ago•0 comments

Safari 26.2 Release Notes

https://developer.apple.com/documentation/safari-release-notes/safari-26_2-release-notes
1•ksec•13m ago•0 comments

Chrome Extension Manager

https://chromewebstore.google.com/detail/extension-manager-extensi/jafcieombbedhpdkjlhcggagepcgaihp
1•kaporalix•14m ago•1 comments

China's trade surplus tops record US$1T, defying trade war uncertainty

https://www.scmp.com/economy/economic-indicators/article/3335551/chinas-exports-rebound-november-...
4•ksec•15m ago•0 comments

Purdue University Approves New AI Requirement for All Undergrads

https://www.forbes.com/sites/michaeltnietzel/2025/12/13/purdue-university-approves-new-ai-require...
2•rmason•16m ago•0 comments

The rise and fall of Subway: The $11 billion empire that crumbled

https://www.youtube.com/watch?v=LeF6DG7xfpA
1•paulpauper•16m ago•1 comments

Human Agency: Protect your documents with hidden Anti AI directives

https://www.human-agency.xyz/
1•merinid•17m ago•1 comments

Sovereignty by Disruption: The Rise of Corporate Quasi-States

https://www.vinniefalco.com/p/synthetic-agency-displacement-disorder
2•hakilebara•17m ago•1 comments

PaperTrails: Your Personal Research Library

https://www.papertrailshq.com/
1•mhb•19m ago•1 comments

Ancient lake that vanished 100k years ago returns to California

https://www.dailymail.co.uk/sciencetech/article-15380617/lake-vanished-RETURNS-California-record-...
2•Bender•22m ago•0 comments

Want to sway an election? Here’s how much fake online accounts cost

https://www.science.org/content/article/want-sway-election-here-s-how-much-fake-online-accounts-cost
6•rbanffy•22m ago•0 comments

Reproducibility Test-Time Training on Nearest Neighbors for LLMs

https://arxiv.org/abs/2511.16691
1•PaulHoule•28m ago•0 comments

Australian teens were kicked off social media this week. Some are back already

https://www.cnn.com/2025/12/12/australia/australia-social-media-kids-intl-hnk-dst
3•Bender•30m ago•0 comments

RAMageddon is finally coming for your smartphones and laptops

https://www.tomsguide.com/computing/laptops/worsening-ram-crisis-starting-to-impact-smartphones-a...
3•elorant•31m ago•0 comments

More atmospheric rivers coming for flooded Washington and the West Coast

https://www.cnn.com/2025/12/12/weather/washington-west-coast-flooding-atmospheric-rivers-climate
2•Bender•32m ago•0 comments

Before You Cite That Study

https://eleganthack.com/before-you-cite-that-study/
2•adrianhoward•32m ago•1 comments

What happens when the coding becomes the least interesting part of the work

https://obie.medium.com/what-happens-when-the-coding-becomes-the-least-interesting-part-of-the-wo...
1•enraged_camel•35m ago•0 comments

A brief natural history of misinformation

https://royalsocietypublishing.org/rsif/article/22/233/20250161/364004/A-brief-natural-history-of...
1•geox•36m ago•0 comments

Enabling small language models to solve complex reasoning tasks

https://news.mit.edu/2025/enabling-small-language-models-solve-complex-reasoning-tasks-1212
2•LiveTheDream•37m ago•0 comments

Benchmarking LLMs on whether they can play FizzBuzz

https://github.com/venkatasg/fizzbuzz-llm
1•_venkatasg•38m ago•0 comments

Why Twilio Segment Moved from Microservices Back to a Monolith

https://www.twilio.com/en-us/blog/developers/best-practices/goodbye-microservices
2•birdculture•40m ago•0 comments

Faster Double-to-String Conversion

https://vitaut.net/posts/2025/faster-dtoa/
1•todsacerdoti•45m ago•0 comments

JD Vance: "You might try hiring Americans."

https://twitter.com/JDVance/status/1999880386898252030
8•SilverElfin•48m ago•16 comments

I Fed 24 Years of My Blog Posts to a Markov Model

https://susam.net/fed-24-years-of-posts-to-markov-model.html
6•zdw•50m ago•1 comments

The Rise of Computer Games, Part I: Adventure

https://technicshistory.com/2025/12/13/the-rise-of-computer-games-part-i-adventure/
1•cfmcdonald•51m ago•0 comments

Texas Space Boom Requires Lots of Lawyers in Boost for Firms

https://news.bloomberglaw.com/business-and-practice/texas-space-boom-requires-lots-of-lawyers-in-...
2•mooreds•56m ago•0 comments

Microsoft Excel Conquered Corporate America

https://www.bloomberg.com/news/articles/2025-12-04/how-microsoft-excel-is-navigating-ai-new-compe...
2•mooreds•56m ago•1 comments