
Show HN: FLE v0.3 – Claude Code Plays Factorio

https://jackhopkins.github.io/factorio-learning-environment/versions/0.3.0.html
75•noddybear•4mo ago
We're excited to release v0.3.0 of the Factorio Learning Environment (FLE), an open-source environment for evaluating AI agents on long-horizon planning, spatial reasoning, and automation tasks.

== What is FLE? ==

FLE uses the game Factorio to test whether AI can handle complex, open-ended engineering challenges. Agents write Python code to build automated factories, progressing from simple resource extraction (~30 units/min) to sophisticated production chains (millions of units/sec).

== What's new in 0.3.0 ==

- Headless scaling: No longer needs the game client, enabling massive parallelization!

- OpenAI Gym compatibility: Standard interface for RL research

- Claude Code integration: We're livestreaming Claude playing Factorio on Twitch (http://twitch.tv/playsfactorio)

- Better tooling and SDK: 1-line CLI commands to run evaluations (with W&B logging)
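For readers unfamiliar with the Gym-style interface mentioned above, here is a minimal sketch of the reset/step loop that such environments conventionally expose. It is an illustration only: `ToyFactoryEnv`, its `craft` action, and `run_episode` are hypothetical names invented for this example and are not FLE's actual API. The five-value return from `step` is assumed to follow the Gymnasium convention.

```python
from dataclasses import dataclass


@dataclass
class ToyFactoryEnv:
    """Hypothetical stand-in environment; NOT part of FLE."""
    target: int = 10      # units to produce before the episode terminates
    max_steps: int = 20   # step budget before the episode is truncated
    produced: int = 0
    steps: int = 0

    def reset(self):
        # Start a fresh episode and return (observation, info).
        self.produced = 0
        self.steps = 0
        return {"produced": self.produced}, {}

    def step(self, action: str):
        # Apply one action and return
        # (observation, reward, terminated, truncated, info).
        self.steps += 1
        reward = 0.0
        if action == "craft":
            self.produced += 1
            reward = 1.0
        terminated = self.produced >= self.target
        truncated = self.steps >= self.max_steps and not terminated
        return {"produced": self.produced}, reward, terminated, truncated, {}


def run_episode(env, policy):
    # Generic agent loop: works with any environment exposing reset/step.
    obs, info = env.reset()
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total, obs


if __name__ == "__main__":
    total, final_obs = run_episode(ToyFactoryEnv(), lambda obs: "craft")
    print(total, final_obs)
```

The point of a standard interface is that `run_episode` never needs to know what the environment simulates, so the same loop can drive different tasks or agents.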

== Key findings ==

We evaluated frontier models (Claude Opus 4.1, GPT-5, Gemini 2.5 Pro, Grok 4) on 24 production automation tasks of increasing complexity.

Even the best models struggle:

- Most models still rely on semi-manual strategies rather than true automation

- Agents rarely define helper functions or abstractions, limiting their ability to scale

- Error recovery remains difficult – agents often get stuck in repetitive failure loops

The performance gap between models on FLE correlates more closely with real-world task benchmarks (like GDPVal) than with traditional coding/reasoning evals.

== Why this matters ==

Unlike benchmarks based on exams that saturate quickly, Factorio's exponential complexity scaling means there's effectively no performance ceiling. The skills needed - system debugging, constraint satisfaction, logistics optimization - transfer directly to real challenges.

== Try it yourself ==

>>> uv add factorio-learning-environment

>>> uv add "factorio-learning-environment[eval]"

>>> fle cluster start

>>> fle eval --config configs/gym_run_config.json

We're looking for researchers, engineers, and modders interested in pushing the boundaries of agent capabilities. Join our Discord if you want to contribute. We look forward to meeting you and seeing what you can build!

-- FLE Team

Comments

bottydim•4mo ago
haha, I am sure somewhere, some PhD student told their supervisor: “No, seriously, I have to play 600 hours of Factorio… for science.”
georgeh4cks•4mo ago
Loving the ‘Claude plays’ integration. Great work
noddybear•4mo ago
Thank you!
dang•4mo ago
Related. Others?

Multi-Agent Coordination in Factorio: FLE v0.2.0 - https://news.ycombinator.com/item?id=43926829 - May 2025 (5 comments)

Show HN: Factorio Learning Environment – Agents Build Factories - https://news.ycombinator.com/item?id=43331582 - March 2025 (209 comments)

noddybear•4mo ago
This is our earlier work. Since May we've made it really easy for the community to build their own agents to play the game: you can now hook up your terminal to get Claude Code to play the game.
dang•4mo ago
That's great!

(just for clarity: links to past threads in no way imply that the new post isn't welcome! They're just because some readers enjoy poking back through past related discussions as well)

typpilol•4mo ago
Is there going to be some kind of plugin support for other games?

I'd love to see Claude play Age of Empires.

Claude plays Command and Conquer.

I already know there's a huge AI StarCraft 2 scene, but I don't think those are LLM AIs.

noddybear•4mo ago
I am really keen on plugging into Age of Empires 2 - although practically I think we need a couple of years of improvements before LLMs would be smart/fast enough to react to the game in realtime. Currently they can't react fast enough - although specially trained networks could be viable.
yeasku•4mo ago
OpenAI tried to create a Dota 2 AI with reinforcement learning. Some of its best people worked on that.

They had to dumb down the game and keep the bot playing on the same patch. Even then it could not win against a professional team.

Crespyl•4mo ago
I'm pretty sure that AI did take at least a few games off of the pros. IIRC the professional team only had one win, the last match.

I do agree that the game was terribly dumbed down to make it tractable. I keep hoping they'll revisit Dota 2 to see if they can find meaningful improvements and tackle the full game.

typpilol•4mo ago
The last time they deployed it... It beat the current world champions
Crespyl•4mo ago
Yes, the OpenAI Five bots won a best of three in their custom format, back in 2019. The bots won the first two games, then a third game was played which the humans won, which is the point I was trying to make (I'm not the GP).

Unless you know of another time the bots were deployed formally against a pro team more recently, which I'd love to hear about.

[0] https://web.archive.org/web/20190413210513/https://venturebe...

yeasku•4mo ago
Are biters and cliffs disabled?
noddybear•4mo ago
Biters are disabled, but cliffs are not
kyars•4mo ago
Live-stream is epic
dnlkwk•4mo ago
This is dope. When is it appropriate to start enabling multiple agents for one player to see if they can collaborate and divide up roles?
bigjobby•4mo ago
Class. This is what Claude was designed for