The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•7mo ago

Comments

tocs3•7mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
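For anyone wondering what a "reasoning path based on A* search" might look like in practice, here is a minimal sketch of A* on a small grid that records each step as it goes. This is my own illustration, not the paper's code: the grid, the `astar_with_trace` name, and the `("create", ...)` / `("close", ...)` trace entries are all assumptions made for the example. The point is only that such a trace can be checked mechanically against the solver, which is what makes it possible to compare the final answer with the intermediate steps.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace format is
    a hypothetical illustration, not the paper's actual token scheme.
    """
    def h(cell):
        # Manhattan distance: admissible on a 4-connected unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = []

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry, already expanded
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue  # off the grid
            if grid[nr][nc] == "#" or nxt in closed:
                continue  # wall or already expanded
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
                trace.append(("create", nxt, new_g))
    return None, trace  # goal unreachable


grid = ["..#",
        "...",
        "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path)
for step in trace:
    print(step)
```

A model trained on (problem, trace, path) triples of this kind can produce a valid `path` while emitting a `trace` that no real A* run would generate, which is the mismatch the summary above describes.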
