The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
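For anyone who wants to poke at the setup described above, here is a minimal sketch of what "reasoning trace plus final answer" could look like for A* on a tiny grid, with the two checked independently. The create/close trace records, the grid, and all function names are assumptions made up for illustration; they are not the paper's actual trace format or code.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall). Returns (plan, trace)."""
    def h(p):  # Manhattan distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent, best_g, closed = {start: None}, {start: 0}, set()
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale queue entry
        closed.add(node)
        trace.append(("close", node, g, h(node)))
        if node == goal:
            plan = []
            while node is not None:
                plan.append(node)
                node = parent[node]
            return plan[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                parent[nxt] = node
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt, g + 1, h(nxt)))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Final-answer check: the path is connected and avoids walls."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == 1:
            return False
    return True

def trace_is_valid(trace):
    """Trace check (simplified): a node is created before it is closed,
    and closed at most once."""
    created, closed = set(), set()
    for op, node, _g, _h in trace:
        if op == "create":
            created.add(node)
        elif op == "close":
            if node not in created or node in closed:
                return False
            closed.add(node)
    return True

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
start, goal = (0, 0), (2, 0)
plan, trace = astar_with_trace(grid, start, goal)
corrupted = list(reversed(trace))  # same records, scrambled order

print(plan_is_valid(grid, start, goal, plan))  # True
print(trace_is_valid(trace))                   # True
print(trace_is_valid(corrupted))               # False, yet the plan is unchanged
```

The point of keeping the two validators separate is that a correct plan can sit next to an invalid trace. As summarized above, that is the situation the paper reports for trained models.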
