frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Tactical tornado is the new default

https://olano.dev/blog/tactical-tornado/
1•facundo_olano•1m ago•0 comments

Full-Circle Test-Driven Firmware Development with OpenClaw

https://blog.adafruit.com/2026/02/07/full-circle-test-driven-firmware-development-with-openclaw/
1•ptorrone•2m ago•0 comments

Automating Myself Out of My Job – Part 2

https://blog.dsa.club/automation-series/automating-myself-out-of-my-job-part-2/
1•funnyfoobar•2m ago•0 comments

Google staff call for firm to cut ties with ICE

https://www.bbc.com/news/articles/cvgjg98vmzjo
2•tartoran•2m ago•0 comments

Dependency Resolution Methods

https://nesbitt.io/2026/02/06/dependency-resolution-methods.html
1•zdw•3m ago•0 comments

Crypto firm apologises for sending Bitcoin users $40B by mistake

https://www.msn.com/en-ie/money/other/crypto-firm-apologises-for-sending-bitcoin-users-40-billion...
1•Someone•3m ago•0 comments

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

https://www.iplotcsv.com/demo
1•maxmoq•4m ago•0 comments

There's no such thing as "tech" (Ten years later)

https://www.anildash.com/2026/02/06/no-such-thing-as-tech/
1•headalgorithm•4m ago•0 comments

List of unproven and disproven cancer treatments

https://en.wikipedia.org/wiki/List_of_unproven_and_disproven_cancer_treatments
1•brightbeige•5m ago•0 comments

Me/CFS: The blind spot in proactive medicine (Open Letter)

https://github.com/debugmeplease/debug-ME
1•debugmeplease•5m ago•1 comments

Ask HN: What are the word games do you play everyday?

1•gogo61•8m ago•1 comments

Show HN: Paper Arena – A social trading feed where only AI agents can post

https://paperinvest.io/arena
1•andrenorman•10m ago•0 comments

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•14m ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
2•elashri•14m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•14m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•14m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•16m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•16m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•16m ago•1 comments

The Neuroscience Behind Nutrition for Developers and Founders

https://comuniq.xyz/post?t=797
1•01-_-•16m ago•0 comments

Bang bang he murdered math {the musical } (2024)

https://taylor.town/bang-bang
1•surprisetalk•16m ago•0 comments

A Night Without the Nerds – Claude Opus 4.6, Field-Tested

https://konfuzio.com/en/a-night-without-the-nerds-claude-opus-4-6-in-the-field-test/
1•konfuzio•19m ago•0 comments

Could ionospheric disturbances influence earthquakes?

https://www.kyoto-u.ac.jp/en/research-news/2026-02-06-0
2•geox•21m ago•1 comments

SpaceX's next astronaut launch for NASA is officially on for Feb. 11 as FAA clea

https://www.space.com/space-exploration/launches-spacecraft/spacexs-next-astronaut-launch-for-nas...
1•bookmtn•22m ago•0 comments

Show HN: One-click AI employee with its own cloud desktop

https://cloudbot-ai.com
2•fainir•24m ago•0 comments

Show HN: Poddley – Search podcasts by who's speaking

https://poddley.com
1•onesandofgrain•25m ago•0 comments

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•27m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
2•Brajeshwar•31m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
3•Brajeshwar•32m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
2•Brajeshwar•32m ago•0 comments
Open in hackernews

Introducing TiānshūBench (天书Bench)

https://jeepytea.github.io/general/introduction/2025/05/29/tianshubenchintro.html
4•chromaton•8mo ago

Comments

chiwilliams•8mo ago
Cool project! I have a couple of questions that would be nice in the writeup: * How did you generate your example problems? Did you take an existing benchmark? Or did you have LLMs generate the problems? * Do you have any thought to adding a second "base programming language" to alter? I'm not sure that there's enough variation as there is. (Another thought would be to generate 4 or 5 different new languages, each quite different, and then run the benchmark on each of those languages? I'm not sure how much the fact that it is randomly generated each time matters that much?)

But overall, a clever idea!

chromaton•8mo ago
Generating the problems: I just thought up a few simple things that the computer might be able to do. In the future, I hope to expand to more complex problems, based upon common business situations: reading CSVs, parsing data, etc. I'll probably add new tests once I get multi-shot and reliability working correctly.

New base programming languages would be great, but what would be even better is some sort of meta-language where many features can be turned on or off, rather than just scrambling the keywords like I do now.

I did some vibe testing with a current frontier model, and it gets quite confused and keeps insisting that there's a control structure that definitely doesn't exist in the TiānshūBench language with seed=1.

JSR_FDED•8mo ago
Would it be useful to generate Procedural, OOP and Functional variations of the problems?
chromaton•8mo ago
Yes, it would be fantastic to have more languages to test off of. I picked the base language I did (Mamba) because it was easy to modify and integrate into Python.