The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
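
To make the setup in the summary a bit more concrete, here is a rough Python sketch of what "A* traces plus a checker" could look like. It is only an illustration under my own assumptions, not the paper's code: the grid maze, the `("create" | "close", cell, cost, heuristic)` trace format, and the function names `astar_with_trace`, `is_valid_path`, and `is_valid_trace` are all hypothetical. The point is that the final path and the intermediate trace can be validated separately, so a correct answer can sit next to a trace that fails the check.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace format is an assumption made for
    this sketch: a list of (op, cell, cost, heuristic) tuples.
    """
    def h(cell):
        # Manhattan distance, admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry, already expanded
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
                trace.append(("create", nxt, new_cost, h(nxt)))
    return None, trace


def is_valid_path(grid, path, start, goal):
    """Check the final answer only (assumes all cells are in bounds)."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == 1:
            return False
    return True


def is_valid_trace(trace):
    """Simplified trace check (an assumption, not the paper's validator):
    a cell may only be closed after being created with the same cost."""
    created = {}
    for op, cell, cost, _ in trace:
        if op == "create":
            created[cell] = cost
        elif op == "close" and created.get(cell) != cost:
            return False
    return True


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
scrambled = trace[::-1]  # stand-in for a corrupted trace

print(is_valid_path(grid, path, (0, 0), (2, 0)))  # answer check
print(is_valid_trace(trace))                      # genuine trace check
print(is_valid_trace(scrambled))                  # corrupted trace check
```

Reversing the trace is just a deterministic stand-in for a corrupted one. In this toy setup the path stays valid while the scrambled trace fails the check, which mirrors the "right answer, wrong steps" mismatch the summary describes.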
