The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
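
To make the summary's setup concrete, here is a minimal sketch of what "check both the final answers and the reasoning steps" could look like. It assumes a small grid maze and a made-up create/close trace format; the function names, the trace layout, and the example maze are all hypothetical and not taken from the paper. The point it illustrates is that the plan and the trace are validated by separate checks, so one can pass while the other fails.

```python
import heapq


def is_free(grid, cell):
    """True if cell is inside the grid and not a wall (1 = wall)."""
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid. Returns (plan, trace).

    trace is a list of (op, cell, g, h) tuples with op in
    {"create", "close"}: a hypothetical stand-in for the
    intermediate tokens a model would be trained on.
    """
    def h(cell):  # Manhattan distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent, best_g, closed = {start: None}, {start: 0}, set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not is_free(grid, nxt):
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt], parent[nxt] = ng, cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def plan_is_valid(grid, start, goal, plan):
    """Final-answer check: a walkable path from start to goal."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    steps_ok = all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
                   for a, b in zip(plan, plan[1:]))
    return steps_ok and all(is_free(grid, cell) for cell in plan)


def trace_is_valid(grid, start, trace):
    """Reasoning check: each step must be a legal search operation."""
    created, closed = set(), set()
    for op, cell, _g, _h in trace:
        if op == "create":
            next_to_closed = any(
                abs(cell[0] - p[0]) + abs(cell[1] - p[1]) == 1 for p in closed)
            if not is_free(grid, cell) or (cell != start and not next_to_closed):
                return False
            created.add(cell)
        elif op == "close":
            if cell not in created:
                return False
            closed.add(cell)
        else:
            return False
    return True


# Hypothetical 3x3 maze: the only route goes around the wall row.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
start, goal = (0, 0), (2, 0)

plan, trace = astar_with_trace(grid, start, goal)
scrambled = trace[::-1]  # same steps, nonsensical order

print(plan_is_valid(grid, start, goal, plan))
print(trace_is_valid(grid, start, trace))
print(trace_is_valid(grid, start, scrambled))
```

Pairing the same `plan` with `scrambled` instead of `trace` gives a correct answer attached to an invalid trace, which is the kind of mismatch the summary says the authors measure.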
