We put the two models in a 100-game tournament. Before each of its moves, we gave the smaller model a few examples of winning moves from past games.
The results were clear. Without the examples, the smaller model struggled against GPT-4.1. With them, its effectiveness increased by nearly 200%, and it won consistently.
It's a simple demonstration, but it shows that a smaller, faster model with good, timely examples can outperform a more capable base model.
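For the curious, here's a minimal sketch of the prompting loop, assuming an OpenAI-style chat completions API. The model name, the game-state strings, and the winning_examples format are placeholders, not the repo's actual code (see the repo for that):

    import random
    from openai import OpenAI

    client = OpenAI()

    def build_prompt(position, winning_examples, k=3):
        """Prepend k winning moves from past games to the current position."""
        shots = random.sample(winning_examples, min(k, len(winning_examples)))
        examples = "\n\n".join(
            f"Position:\n{ex['position']}\nWinning move: {ex['move']}"
            for ex in shots
        )
        return (
            "Here are winning moves from past games:\n\n"
            f"{examples}\n\n"
            f"Current position:\n{position}\n"
            "Reply with your move only."
        )

    def pick_move(position, winning_examples):
        # Inject the examples right before the model chooses its move.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder for "the smaller model"
            messages=[{"role": "user",
                       "content": build_prompt(position, winning_examples)}],
        )
        return resp.choices[0].message.content.strip()

The idea is just that winning_examples accumulates from earlier games, so each prompt carries fresh, relevant examples at the moment the model picks its move.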
The full write-up and code are in the repo.
totisjosema•6h ago
We have a short video walkthrough of the setup here: https://www.youtube.com/watch?v=z1MhXgmHbwk