GPT 5.5 aces 20x20 multiplication that o3 couldn't handle

https://twitter.com/cozyblaze265065/status/2057739317649588558

1•marojejian•53m ago

Comments

marojejian•53m ago

Tremendous progress in a year.

While these foundation models aren't trying to be calculators, this kind of test previously provided a decent benchmark on their ability to scale composing iterative reasoning steps, and showed they were not that good at it.

At this point I'm tempted to conclude they are pretty good at it, since I don't see how such long calculations could really be considered "in distribution" from training or "memorized," except in the sense the model learned the algorithm correctly.

I still have doubts about how good the present architecture & training are at learning to "generalize" effectively, e.g. see ARC3.

But you can go a very long way by memorizing everything, being able to compose steps well, being able to try many times, and being able to verify as well as a human, even if you aren't so efficient in your "fluid intelligence."

The fraction of human cognition operating today that can be handled with that current approach seems pretty large.
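To make "composing iterative reasoning steps" concrete, here is a minimal sketch of schoolbook long multiplication on digit strings. It assumes "20x20" means two 20-digit operands, which the linked post may or may not intend, and the function name `schoolbook_multiply` is only illustrative. It says nothing about how any model computes internally; it only shows how many small dependent steps the task involves (400 single-digit products plus carries for 20-digit inputs).

```python
def schoolbook_multiply(a: str, b: str) -> str:
    """Multiply two non-negative decimal strings one digit at a time."""
    # Store digits least-significant first, so index i stands for 10**i.
    x = [int(ch) for ch in reversed(a)]
    y = [int(ch) for ch in reversed(b)]
    result = [0] * (len(x) + len(y))

    for i, dx in enumerate(x):
        carry = 0
        for j, dy in enumerate(y):
            # One elementary step: a single-digit product plus the
            # digit already in this column plus the running carry.
            total = result[i + j] + dx * dy + carry
            result[i + j] = total % 10
            carry = total // 10
        # The leftover carry goes into the next, still empty, column.
        result[i + len(y)] += carry

    digits = "".join(str(d) for d in reversed(result)).lstrip("0")
    return digits or "0"


if __name__ == "__main__":
    a = "12345678901234567890"  # arbitrary 20-digit operands
    b = "98765432109876543210"
    product = schoolbook_multiply(a, b)
    # Cross-check against Python's built-in arbitrary-precision integers.
    print(product == str(int(a) * int(b)))
```

Every step is trivial on its own; a single wrong carry anywhere corrupts the final answer, which is why long multiplication works as a test of step composition.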

Bender•49m ago

Forgive my ignorance here, but why does a large language model need to perform math at all, rather than detecting math and handing it off to something optimized and trusted for everything from the most basic to the most advanced math, something that, say, mathematicians, CPAs and other professionals who depend on math would trust? Perhaps even create a short-lived ephemeral link to the parsed input, interpretation and output of the math program, showing its work as proof that could be pasted into engineering and legal documents. Is this like code golf?
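For what it's worth, the handoff described above can be sketched in a few lines. This is a hypothetical illustration, not how any particular product routes math: the names `evaluate_exact` and `handoff` are made up, and it only supports integers with `+`, `-`, `*`, `/`. It parses the expression with Python's standard `ast` module instead of `eval`, computes exactly with `fractions.Fraction`, and returns the input, the parsed interpretation and the result as a record of the work.

```python
import ast
import operator
from fractions import Fraction

# Only these binary operators are accepted; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST) -> Fraction:
    """Evaluate a restricted arithmetic syntax tree exactly."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def evaluate_exact(expression: str) -> Fraction:
    """Parse and evaluate integer arithmetic without rounding."""
    return _eval(ast.parse(expression, mode="eval").body)


def handoff(expression: str) -> dict:
    """Return input, interpretation and result (hypothetical record format)."""
    tree = ast.parse(expression, mode="eval")
    return {
        "input": expression,
        "parsed": ast.dump(tree.body),
        "result": str(_eval(tree.body)),
    }


if __name__ == "__main__":
    print(handoff("12345678901234567890 * 98765432109876543210")["result"])
```

The hard part in practice is the "detecting math" step and deciding what to pass to the tool, which this sketch does not attempt.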

YouTube to begin automatically labeling AI videos

https://arstechnica.com/google/2026/05/youtube-to-begin-automatically-labeling-ai-videos/

2•tartoran•2m ago•0 comments

Roku's First Major Home Screen Revamp in Years Will Open Up More Ad Opportunities

https://www.bloomberg.com/news/articles/2026-05-27/roku-s-first-major-home-screen-revamp-in-years...

2•1vuio0pswjnm7•3m ago•0 comments

Ask HN: Do coding agents need cross-tool org knowledge? Or, just good to have?

2•srbsa•4m ago•0 comments

Valve raises Steam Deck prices by more than $200

https://www.theverge.com/games/938340/valve-steam-deck-price-increase

4•droidjj•6m ago•0 comments

Musk says US Military suicide drones used Starlink in violation of SpaceX rules

https://arstechnica.com/tech-policy/2026/05/musk-says-us-military-suicide-drones-used-starlink-in...

2•ForHackernews•6m ago•0 comments

SIRT6 overexpression counteracts chromatin aging in the male murine liver

https://www.nature.com/articles/s41467-026-73115-y

2•wslh•6m ago•0 comments

LoongForge-A high-performance training framework for LLM, VLM, DIT, VLA models

https://github.com/baidu-baige/LoongForge

2•mindzzz•6m ago•0 comments

Iran's Internet is partially restored, Cloudflare Radar data shows

https://blog.cloudflare.com/iran-internet-partially-restored-may-2026/

2•jgrahamc•7m ago•0 comments

Building a Fast Lock-Free Queue in Modern C++ from Scratch

https://jaysmito.dev/blog/blog/04-fast-lockfree-queues/

1•ibobev•8m ago•0 comments

A Year Late, Claude Beats Pokémon

https://www.lesswrong.com/posts/sehJYg5Yny9fvpbpt/a-year-late-claude-finally-beats-pokemon

1•szatkus•10m ago•0 comments

The rise and fall of the only female Yakuza

https://www.theguardian.com/news/2026/may/21/the-devils-child-the-rise-and-fall-of-the-only-femal...

1•NaOH•12m ago•0 comments

Why frontier biology labs need Lisp-like infrastructure

https://www.countifybio.com/

1•mfisc_019•13m ago•0 comments

Some of Texas's oldest barbecue joints close as meat prices skyrocket

https://www.washingtonpost.com/nation/2026/05/25/some-texass-oldest-barbecue-joints-close-meat-pr...

2•paulpauper•13m ago•0 comments

Steam Deck OLED is back in stock, with a price increase for both models

https://store.steampowered.com/news/group/45479024/view/672869045073085538

3•no_news_is•14m ago•1 comments

Agents Thinking Fast and Slow: A Talker-Reasoner Architecture

https://arxiv.org/abs/2410.08328

3•jalcazar•15m ago•0 comments

The AI tech job slaughter gets real

https://www.computerworld.com/article/4175956/the-ai-tech-job-slaughter-gets-real.html

1•CrankyBear•15m ago•0 comments

Remarks on the Disproof of the Unit Distance Conjecture [pdf]

https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29ad73/unit-distance-remarks.pdf

1•digital55•17m ago•0 comments

Datasette: An open source multi-tool for exploring and publishing data

https://datasette.io/

2•Olshansky•18m ago•0 comments

AgentSafeLabs – Launched Open-source Security framework for AI agents

https://github.com/AgentSafeLabs/safelabs-eval

1•waqarjaved•18m ago•0 comments

QuestDB 9.4.0

https://github.com/questdb/questdb/releases/tag/9.4.0

1•tosh•20m ago•0 comments

RTMH: Pope Leo's Magnifica Humanitas on AI

https://thezvi.substack.com/p/rtmh-pope-leos-magnifica-humanitas

2•paulpauper•20m ago•0 comments

Use AI This Election

https://www.astralcodexten.com/p/use-ai-this-election

3•paulpauper•20m ago•0 comments

Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

https://arxiv.org/abs/2605.21779

3•root-parent•22m ago•0 comments

Verilog: Back to the building blocks' building blocks

https://www.cs.cornell.edu/~asampson/blog/buildingblocks.html

1•fanf2•22m ago•0 comments

One repo clone, shared forever

https://falconer.com/notes/persistent-repos-s3-files/

2•aryamanagraw•23m ago•0 comments

Benford's Law

https://en.wikipedia.org/wiki/Benford%27s_law

3•jonbaer•27m ago•0 comments

SimCity 3k in 4k

https://www.thran.uk/writ/hdid/2025/12/simcity-3k-in-4k.html

30•speckx•28m ago•1 comments

To Land a Job in AI, Try Reading Kant

https://www.wired.com/story/to-land-a-job-in-ai-try-reading-kant/

2•CharlesW•29m ago•0 comments

Repoprompt is going Open Source

https://repoprompt.com/blog/repo-prompt-next-chapter

1•mirzap•29m ago•0 comments

One Million Beings

https://sub.davidoreilly.com/p/one-million-beings

2•m3at•29m ago•1 comments