frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

GPT needs a truth-first toggle for technical workflows

1•PAdvisory•8mo ago
I use GPT-4 extensively for technical work: coding, debugging, modeling complex project logic. The biggest issue isn’t hallucination—it’s that the model prioritizes being helpful and polite over being accurate.

The default behavior feels like this:

Safety

Helpfulness

Tone

Truth

Consistency

In a development workflow, this is backwards. I’ve lost entire days chasing errors caused by GPT confidently guessing things it wasn’t sure about—folder structures, method syntax, async behaviors—just to “sound helpful.”

What’s needed is a toggle (UI or API) that:

Forces “I don’t know” when certainty is missing

Prevents speculative completions

Prioritizes truth over style, when safety isn’t at risk

Keeps all safety filters and tone alignment intact for other use cases

This wouldn’t affect casual users or conversational queries. It would let developers explicitly choose a mode where accuracy is more important than fluency.

This request has also been shared through OpenAI's support channels. Posting here to see if others have run into the same limitation or worked around it in a more reliable way than I have found

Comments

duxup•8mo ago
I’ve found this with many LLMs they want to give an answer, even if wrong.

Gemini on the Google search page constantly answers questions yes or no… and then the evidence it gives indicates the opposite of the answer.

I think the core issue is that in the end LLMs are just word math and they don’t “know” if they don’t “know”…. they just string words together and hope for the best.

PAdvisory•8mo ago
I went into it pretty in depth after breaking a few with severe constraints, what it seems to come down to is how the platforms themselves prioritize functions, MOST put "helpfulness" and "efficiency" ABOVE truth, which then leads the LLM to make a lot of "guesses" and "predictions". At their core pretty much ALL LLM's are made to "predict" the information in answers, but they CAN actually avoid that and remain consistent when heavily constrained. The issue is that it isn't at the core level, so we have to CONSTANTLY retrain it over and over I find
Ace__•8mo ago
I have made something that addresses this. Not ready to share it yet, but soon-ish. At the moment it only works on GPT model 4o. I tried local Q4 KM's models, on LM Studio, but complete no go.

Nineteen Septillion Addresses – Setting Up an ASN, Obtaining IP Addresses

https://alastairbarber.com/Setting-Up-ASN-IPv6-Routing-BIRD-Teltonika-Router-Wireguard/
1•zhouzhao•2m ago•0 comments

I'm 17, built shipfree because every SaaS boilerplate I tried was trash

https://shipfree.revoks.dev
1•rutagandasalim•3m ago•1 comments

Stargate Community

https://openai.com/index/stargate-community/
1•tosh•3m ago•0 comments

Merge-pdf.app – A free, privacy-first PDF Merging tool

https://ryansouthgate.com/merge-pdfs/
1•ry8806•5m ago•1 comments

Evaluating and creating dithering algorithms for epaper laptop

https://peterme.net/building-an-epaper-laptop-dithering.html
1•PMunch•5m ago•0 comments

External AI Representations and Evidentiary Reconstructability

https://www.aivojournal.org/external-ai-representations-and-evidentiary-reconstructability/
1•businessmate•7m ago•1 comments

Mark Carney's Full Speech at the World Economic Forum

https://www.youtube.com/watch?v=btqHDhO4h10
2•doener•9m ago•0 comments

Ask HN: How does PagerDuty's site still not have a dark mode?

1•pants2•11m ago•0 comments

Design System Maturity Model

https://infa.ai/learn/maturity
1•handfuloflight•12m ago•0 comments

Remove repo selector from charts, usage, and PRs pages

1•nishiohiroshi•19m ago•0 comments

I built a tool that forces 5 AI to debate and cross-check facts before answering

https://github.com/KeaBase/kea-research
2•Stanislaw_•21m ago•0 comments

My 2025 Bug Bounty Stories

https://joshua.hu/2025-bug-bounty-stories-fail
2•karel-3d•24m ago•1 comments

BullSheet – My "Local" Quantitative Finance Engine

https://bayramovanar.substack.com/p/why-i-built-bullsheet-part-1
1•Bayramovanar•26m ago•1 comments

A 1990s CMS That Still Ships: Exponential CMS Reaches PHP 8.5

https://vincentopar.com/
1•Vincent_Opar•27m ago•0 comments

DOGE staffers at Social Security agency may have violated Hatch Act, DOJ says

https://abcnews.go.com/US/2-doge-staffers-social-security-agency-violated-hatch/story?id=129393252
1•vaxman•27m ago•1 comments

Are 'toxic' personality traits useful test cases for AI or behavioral models?

https://github.com/FlDanyT/ai-celebrity-models
1•yakalmar2048•30m ago•1 comments

LiveContainer: Run iOS apps without installing them

https://github.com/LiveContainer/LiveContainer
2•handfuloflight•30m ago•0 comments

DragonSweeper: A minesweeper game that requires observation

https://dragonsweeper.org
1•wslh•31m ago•0 comments

WebRTC VPN Tunnel

https://github.com/Manav1011/webrtc-vpn
1•walterbell•32m ago•0 comments

DiffRatio – A One-Step Diffusion Model with SOTA quality and 50% less memory

https://www.arxiv.org/pdf/2502.08005
2•LoMoGan•34m ago•1 comments

The Issue with Special Issues: When Guest Editors Publish in Support of Self

https://arxiv.org/abs/2601.07563
1•wslh•34m ago•0 comments

Amazon Joins the Big-Box League with Its Largest-Ever Store

https://www.wsj.com/business/retail/amazon-orland-park-illinois-opening-13362c97
1•divbzero•39m ago•0 comments

When I Talk to AI About My Feelings, I Don't Want a Therapy Ad

https://www.theverge.com/news/864103/mixed-messaging
1•thor1122•40m ago•0 comments

Green vs. Blue

https://greenvblue.npeercy.com/
1•greenwallnorway•45m ago•0 comments

Sony to Transfer Home Entertainment Operations to Tcl-Led Joint Venture

https://xthe.com/news/sony-tv-business-tcl/
1•Sandhyaseo•46m ago•1 comments

Negotiating Relationships with ChatGPT

https://arxiv.org/abs/2601.13188
2•7777777phil•46m ago•0 comments

Why Submit to AI in Production: Speaking as a Tool for Better Work

https://www.r-bloggers.com/2026/01/why-submit-to-ai-in-production-speaking-as-a-tool-for-better-w...
1•7777777phil•49m ago•0 comments

Crates.io: Development Update

https://blog.rust-lang.org/2026/01/21/crates-io-development-update/
5•quapster•50m ago•0 comments

AT&T Archives: The Unix Operating System (1972) [video]

https://www.youtube.com/watch?v=tc4ROCJYbm0
1•vismit2000•51m ago•0 comments

Agentic RAG for Dummies

https://github.com/GiovanniPasq/agentic-rag-for-dummies
2•thunderbong•52m ago•0 comments