The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
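
For anyone who wants a concrete picture of what a "step-by-step reasoning path based on A* search" could look like, here is a rough sketch. It assumes a tiny grid maze and a made-up trace format (`create`/`close` steps); the function names, the maze, and the trace format are all hypothetical and not taken from the paper. The second helper just shuffles the trace as a simple stand-in for "incorrect" steps; the paper's actual procedure may well differ.

```python
import heapq
import random


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is a list of hypothetical
    ("create" | "close", cell, cost_so_far) steps recorded during search.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    closed = set()
    trace = []

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry, already expanded with a better cost
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            ng = g + 1
            if (nr, nc) not in best_cost or ng < best_cost[(nr, nc)]:
                best_cost[(nr, nc)] = ng
                came_from[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng))
    return None, trace


def shuffled_trace(trace, seed=0):
    """Illustrative 'incorrect' trace: same steps, scrambled order."""
    noisy = list(trace)
    random.Random(seed).shuffle(noisy)
    return noisy


# One training example pairs a trace with the final answer (the path).
maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(maze, (0, 0), (2, 0))
clean_example = {"trace": trace, "answer": path}
noisy_example = {"trace": shuffled_trace(trace), "answer": path}
```

In this toy setup both examples carry the same correct answer; only the intermediate steps differ, which is the kind of comparison the summary above describes.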
