Paper AI Tigers

https://www.gleech.org/paper

3•mefengl•2mo ago

Comments

emschwartz•2mo ago

Very interesting! I especially appreciated two things: the test of running models against the following year's version of the same benchmark, and the point that the per-token discount is negated when models need more tokens to get to the answer.

Generalization:

> Maybe Chinese models generalise to unseen tasks less well. (For instance, when tested on fresh data, 01’s Yi model fell 8pp (25%) on GSM - the biggest drop amongst all models.)

> We can get a dirty estimate of this by the “shrinkage gap”: look at how a model performs on next year’s iteration of some task, compared to this year’s. If it finished training in 2024, then it can’t have trained on the version released in 2025, so we get to see what they’re like on at least somewhat novel tasks. We’ll use two versions of the same benchmark to keep the difficulty roughly on par. Let’s try AIME:

> Almost all models get worse on this new benchmark, despite 2025 being the same difficulty as 2024 (for humans). But as I expected, Western models drop less: they lost 10% of their performance on the new data, while Chinese models dropped 21%. p = 0.09.

> Averaging across crappy models for the sake of a cultural generalisation doesn’t make sense. Luckily, rerunning the analysis with just the top models gives roughly the same result (9% gap instead of 11%).
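
For anyone who wants to play with the "shrinkage gap" idea, here is a minimal sketch of how I read it: the relative drop in score between this year's and next year's version of a benchmark, averaged per group. The function names and all the scores below are made up for illustration; they are not the article's data or code.

```python
from statistics import mean


def relative_drop(score_old: float, score_new: float) -> float:
    """Fraction of performance lost when moving to the newer benchmark version."""
    if score_old <= 0:
        raise ValueError("score_old must be positive")
    return (score_old - score_new) / score_old


def group_shrinkage(scores: dict[str, tuple[float, float]]) -> float:
    """Mean relative drop over a group of models, given (old, new) scores."""
    return mean(relative_drop(old, new) for old, new in scores.values())


# Hypothetical (old, new) accuracies, NOT taken from the article.
group_a = {"model_a1": (0.80, 0.72), "model_a2": (0.60, 0.54)}
group_b = {"model_b1": (0.80, 0.64), "model_b2": (0.50, 0.40)}

drop_a = group_shrinkage(group_a)
drop_b = group_shrinkage(group_b)
print(f"group A drop: {drop_a:.0%}, group B drop: {drop_b:.0%}")
print(f"gap in percentage points: {(drop_b - drop_a) * 100:.0f}")
```

Note that this distinguishes percentage points from percent: a model going from 32% to 24% falls 8pp, which is a 25% relative drop, matching how the Yi figure is quoted.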

Cost-effectiveness:

> Distinguish intelligence (max performance), intelligence per token (efficiency), and intelligence per dollar (cost-effectiveness).

> The 5x discounts I quoted are per-token, not per-success. If you had to use 6x more tokens to get the same quality, then there would be no real discount. And indeed DeepSeek and Qwen (see also anecdote here about Kimi, uncontested) are very hungry.
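
The per-token versus per-success arithmetic is easy to check. A rough sketch, assuming cost per success is simply price per token times tokens used, divided by success rate; the prices and token counts are placeholders I picked to match the 5x and 6x figures in the quote, not real pricing.

```python
def cost_per_success(price_per_token: float,
                     tokens_per_attempt: float,
                     success_rate: float = 1.0) -> float:
    """Expected spend to get one successful answer."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return price_per_token * tokens_per_attempt / success_rate


# Placeholder numbers: the cheaper model is 5x cheaper per token
# but uses 6x as many tokens for the same quality.
baseline = cost_per_success(price_per_token=5.0, tokens_per_attempt=1_000)
discounted = cost_per_success(price_per_token=1.0, tokens_per_attempt=6_000)

ratio = discounted / baseline
print(f"cheaper-per-token model costs {ratio:.1f}x the baseline per success")
```

Under these assumptions the break-even point is a token multiplier equal to the price discount (5x here), so at 6x the "discounted" model actually comes out about 20% more expensive per success.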