Show HN: TetrisBench – AI vs. AI vs. Human Tetris using realtime code generation

https://tetrisbench.com/
1•ykhli•1h ago
Hi HN,

I built TetrisBench, a benchmark that tests LLMs on real-time code generation and reasoning through Tetris.

Live: https://tetrisbench.com/

*How it works:*

Each model starts with an initial optimization function for evaluating Tetris moves.

As the game progresses, the model sees the current board state and updates its algorithm—adapting its strategy based on how the game is evolving.

The model continuously refines its optimizer:

- Board getting too high? Prioritize clearing lines.
- Hole forming? Adjust penalties.
- Safe stack? Build for a Tetris.
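To make the three adjustments above concrete, here is a minimal Python sketch of what one refinement step might look like. It is not TetrisBench's actual code: the `Weights` fields, the 60% height threshold, and the multipliers are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Weights:
    # Hypothetical heuristic weights; the real benchmark may use different features.
    lines_cleared: float = 1.0
    holes: float = -4.0
    max_height: float = -0.5
    tetris_bonus: float = 0.0


def column_heights(board):
    """Height of each column. board is a list of rows (top row first), 1 = filled."""
    rows = len(board)
    heights = []
    for col in range(len(board[0])):
        height = 0
        for row in range(rows):
            if board[row][col]:
                height = rows - row
                break
        heights.append(height)
    return heights


def count_holes(board):
    """Count empty cells that have a filled cell somewhere above them."""
    holes = 0
    for col in range(len(board[0])):
        seen_block = False
        for row in range(len(board)):
            if board[row][col]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def adapt_weights(board, weights):
    """Return adjusted weights based on the current board (assumed rules)."""
    if max(column_heights(board)) > 0.6 * len(board):
        # Board getting too high: prioritize clearing lines.
        return replace(weights,
                       lines_cleared=weights.lines_cleared * 2,
                       max_height=weights.max_height * 2)
    if count_holes(board) > 0:
        # Hole forming: penalize holes more heavily.
        return replace(weights, holes=weights.holes * 1.5)
    # Safe stack: reward setting up a four-line clear.
    return replace(weights, tetris_bonus=2.0)
```

In the benchmark the model rewrites the code itself rather than tuning fixed fields, so this shows only the shape of the decision, not the mechanism.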

The model generates updated code, executes it to score all placements, and picks the best move.
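The "score all placements, pick the best" step can be sketched as follows. This is an illustrative Python example under stated assumptions, not the benchmark's implementation: the board is a list of rows (top first), a piece is passed in as a list of pre-rotated shapes, pieces drop straight down, and the weight names (`lines`, `holes`, `height`) are hypothetical.

```python
def fits(board, shape, top, left):
    """True if shape can occupy the board with its top-left corner at (top, left)."""
    for r, shape_row in enumerate(shape):
        for c, cell in enumerate(shape_row):
            if cell:
                row, col = top + r, left + c
                if row >= len(board) or board[row][col]:
                    return False
    return True


def drop(board, shape, left):
    """Drop shape straight down at column left; return (new_board, lines_cleared) or None."""
    if not fits(board, shape, 0, left):
        return None
    top = 0
    while fits(board, shape, top + 1, left):
        top += 1
    new_board = [row[:] for row in board]
    for r, shape_row in enumerate(shape):
        for c, cell in enumerate(shape_row):
            if cell:
                new_board[top + r][left + c] = 1
    remaining = [row for row in new_board if not all(row)]
    cleared = len(new_board) - len(remaining)
    width = len(board[0])
    return [[0] * width for _ in range(cleared)] + remaining, cleared


def evaluate(board, cleared, weights):
    """Weighted sum of lines cleared, holes, and total column height."""
    rows, holes, total_height = len(board), 0, 0
    for col in range(len(board[0])):
        seen_block = False
        for row in range(rows):
            if board[row][col]:
                if not seen_block:
                    total_height += rows - row
                    seen_block = True
            elif seen_block:
                holes += 1
    return (weights["lines"] * cleared
            + weights["holes"] * holes
            + weights["height"] * total_height)


def best_move(board, rotations, weights):
    """Score every (rotation, column) placement; return (score, rotation_index, left)."""
    best = None
    width = len(board[0])
    for rot_index, shape in enumerate(rotations):
        for left in range(width - len(shape[0]) + 1):
            result = drop(board, shape, left)
            if result is None:
                continue
            new_board, cleared = result
            score = evaluate(new_board, cleared, weights)
            if best is None or score > best[0]:
                best = (score, rot_index, left)
    return best
```

For example, on a 4x4 board whose bottom row is `[1, 1, 0, 0]`, a two-cell horizontal piece scores highest at column 2, where it completes the row. In the setup described above, the `evaluate` function is the part the model would regenerate as the game evolves.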

*Current standings:*

| Model | Win Rate |
|-------|----------|
| Opus 4.5 | 68% |
| GPT-5.2 | 63% |
| Grok 4.1 | 22% |

(181 games so far, running more)

*Try it yourself:*

You can also play against any model directly. See if you can beat Opus at Tetris; only one human has so far.

*All trajectories are logged.* Every game saves board states, the code each model generated, and placement decisions. Happy to share the dataset.

Exploring Crystal macros: Building file-based routing for Kemal

https://krthr.co/exploring-crystal-macros-building-file-based-routing-for-kemal/
1•krthr•1m ago•0 comments

Ask HN: How locked down are your work machines?

1•donatj•2m ago•0 comments

KTree – Kubernetes Browser

https://github.com/amartiniuc/ktree/blob/main/README.md
2•amartiniuc•3m ago•1 comment

Using the BusyBox trick to turn AI prompts into "native" executables

https://tgalal.com/blog/genai-prompts-as-native-programs
1•tgalal•4m ago•0 comments

A 23-year-old's $1.5B AI hedge fund shows how prophecy turns profits

https://fortune.com/2025/10/09/as-ai-bubble-warnings-mount-a-23-year-olds-1-5-billion-hedge-fund-...
1•paulpauper•4m ago•0 comments

Quadratic Bezier – Distance 2D

https://www.shadertoy.com/view/MlKcDD
1•coffeeaddict1•4m ago•0 comments

funding.json

https://fundingjson.org/
1•jruohonen•5m ago•0 comments

Diplomacy by WhatsApp

https://www.newcartographies.com/p/diplomacy-by-whatsapp
1•treadump•6m ago•0 comments

Tesla cuts 1,700 jobs at Gigafactory Berlin despite denying it

https://electrek.co/2026/01/21/tesla-quietly-cuts-1700-jobs-at-gigafactory-berlin-despite-denying...
4•toomuchtodo•6m ago•0 comments

What AI Accountability Looks Like (I Built It)

https://forgeforward.substack.com/p/what-ai-accountability-actually-looks
1•forgeforward•6m ago•1 comment

Gayvn

https://www.facebook.com/events/virgin-hotels-las-vegas/live-2026-gayvn-awards-livestream-full-sh...
1•notgoodme•8m ago•0 comments

Show HN: CausaNova – Deterministic runtime for LLM constraints via Ontology

https://petzi2311.github.io/
1•CausaNova•8m ago•1 comment

Howard Lutnick: Why the Trump administration is going to Davos

https://www.ft.com/content/a675b8af-46b7-4f93-a616-41f0a002c22e
1•macleginn•10m ago•0 comments

Drug Laws Have Prevented Scientists from Studying Mushrooms

https://thesporereport.com/?p=606
2•speckx•10m ago•0 comments

Bitwarden launches enhanced premium plan

https://bitwarden.com/blog/bitwarden-launches-enhanced-premium-plan/
1•brycewray•11m ago•0 comments

Devin Review: AI to Stop Slop

https://cognition.ai/blog/devin-review#the-birth-and-stagnation-of-code-review
1•swyx•11m ago•0 comments

Microsoft CEO warns AI must 'do something useful' or lose 'social permission'

https://www.pcgamer.com/software/ai/microsoft-ceo-warns-that-we-must-do-something-useful-with-ai-...
2•akyuu•12m ago•0 comments

DOGE Employees Shared Social Security Data, Court Filing Shows

https://www.nytimes.com/2026/01/20/us/politics/doge-employees-social-security-data.html
3•pseudolus•13m ago•1 comment

Verizon carriers start switching to 365-day device unlock policy, up from 60 days

https://9to5google.com/2026/01/20/verizon-device-unlock-policy-365-day/
1•thunderbong•13m ago•0 comments

More diversity means better science, says Nature journal chief

https://www.thetimes.com/uk/science/article/dei-diversity-better-science-nature-journal-boss-tgb7...
1•binning•14m ago•0 comments

How the NHS became the battleground in the trans debate facing workplaces

https://www.bbc.co.uk/news/articles/c7v0l25mr2ro
2•binning•18m ago•0 comments

Power, Consumption and Gender: An analysis of Barbara Kruger's political art

https://feminisminindia.com/2026/01/14/power-consumption-and-gender-an-analysis-of-barbara-kruger...
1•binning•19m ago•0 comments

Every big lab is putting resources in building world models

https://ankitmaloo.com/world-models/
1•ankit219•19m ago•0 comments

Show HN: Remember Me – O(1) Client-Side Memory (40x cheaper than Vector DBs)

https://github.com/merchantmoh-debug/Remember-Me-AI
1•MohskiBroskiAI•19m ago•0 comments

Manipulating blood CO₂ levels may help clear toxic proteins from the brain

https://medicalxpress.com/news/2026-01-blood-co8322-toxic-proteins-brain.html
1•bikenaga•19m ago•1 comment

480k-Year-Old Elephant Bone Tool Is the Oldest Ever Found Outside Africa

https://www.iflscience.com/this-480000-year-old-elephant-bone-tool-is-the-oldest-ever-found-outsi...
1•geox•22m ago•0 comments

How are you automating your coding work?

8•manthangupta109•23m ago•2 comments

Tracking Kernel Development with Korgalore

https://people.kernel.org/monsieuricon/tracking-kernel-development-with-korgalore
1•atomlib•23m ago•0 comments

Data Modeling: Living notes on levels, techniques, and patterns

https://www.ssp.sh/brain/data-modeling/
2•articsputnik•23m ago•0 comments

Doctors declare effects of child phone use a public health emergency

https://www.thetimes.com/uk/politics/article/phone-impact-on-children-is-public-health-emergency-...
2•chrisjj•24m ago•2 comments