The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
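To make the distinction between "correct final answer" and "correct reasoning steps" concrete, here is a minimal sketch of the kind of setup the summary describes: A* search that emits a step-by-step trace alongside its plan, plus two independent checkers. The grid world, the `("create" | "close", cell, g, h)` trace format, and all function names are assumptions made for illustration; the summary does not say what format the paper actually uses.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid ('#' = wall). Returns (plan, trace).

    The trace format ("create"/"close", cell, g, h) is hypothetical.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]
    g_cost, parent, closed = {start: 0}, {start: None}, set()
    trace = [("create", start, 0, h(start))]
    while open_heap:
        _f, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt], parent[nxt] = ng, cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng, h(nxt)))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Check only the final answer: a wall-free path of unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for r, c in plan:
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == "#":
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(plan, plan[1:]))

def trace_is_valid(trace):
    """Check only the intermediate steps: every closed node must have been
    created earlier with the same cost, and no node is closed twice."""
    created, closed = set(), set()
    for op, cell, g, _h in trace:
        if op == "create":
            created.add((cell, g))
        elif op == "close" and (cell, g) in created and cell not in closed:
            closed.add(cell)
        else:
            return False
    return True

grid = ["..#",
        "..#",
        "..."]
plan, trace = astar_with_trace(grid, (0, 0), (2, 2))
# Corrupt the trace by closing a node that was never created.
bad_trace = [("close", (9, 9), 0, 0)] + trace
```

The point of having two separate checkers is that they can disagree: `plan_is_valid` still accepts `plan` even when it is paired with `bad_trace`, which `trace_is_valid` rejects. That is the situation the summary describes, a right answer accompanied by steps that do not hold up.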
