The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
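
For anyone who wants to poke at the idea, here is a minimal sketch of the kind of setup the summary describes: A* search produces both a final answer (the plan) and a step-by-step trace, and the two can be checked separately. The grid world, the `("create" | "close", cell, cost)` trace format, and all function names are my own assumptions for illustration and are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall, 0 = free). Returns (plan, trace)."""
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    trace = []  # hypothetical trace format: (operation, cell, cost so far)

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            new_g = g + 1
            if new_g < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
                trace.append(("create", (nr, nc), new_g))
    return None, trace


def plan_is_valid(grid, plan, start, goal):
    """Check the final answer only: a connected, wall-free path from start to goal."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if not (0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])):
            return False
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == 1:
            return False
    return True


def trace_is_valid(trace, start):
    """Check the intermediate steps only: a cell must be created before it is closed."""
    created = {start}
    for op, cell, _ in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
plan, trace = astar_with_trace(grid, (0, 0), (2, 0))
scrambled = trace[::-1]  # same tokens, meaningless order

print(plan_is_valid(grid, plan, (0, 0), (2, 0)))  # True
print(trace_is_valid(trace, (0, 0)))              # True
print(trace_is_valid(scrambled, (0, 0)))          # False
```

The point of the two separate checkers is that a correct plan can sit next to an invalid trace, which is the mismatch the summary says the authors measured. This toy says nothing about what a trained model would do with scrambled traces.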
