
Show HN: Trained an LLM to predict "What will Trump do?"

https://huggingface.co/LightningRodLabs/Trump-Forecaster
9•bturtel•1h ago
Hey HN! I RL-tuned an open-source LLM (gpt-oss-120b — 120B MoE, but only 5.1B active params) to predict "What will Trump do?" in any situation, trained on nothing but public news collected automatically from search queries. The trained model beats GPT-5, and both dataset and trained model are open sourced.

Data generation: Generated 2,108 binary forecasting questions from just a search query and a date range using the Lightning Rod SDK (https://github.com/lightning-rod-labs/lightningrod-python-sd...). Questions are generated from historic news articles — like "Will Trump impose 25% tariffs on Mexico by March 1?" — and resolved by checking what actually happened after the deadline. No human annotation — the whole pipeline is automated.

Training: GRPO with Brier score as the reward signal. LoRA rank 32, 50 training steps.
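To make the reward concrete, here is a minimal sketch of a Brier-based reward and a group-relative advantage of the kind GRPO uses. The post does not say how the reward was scaled or normalized, so the function names, the sign convention (negative Brier score, so that higher is better), and the `eps` constant are all assumptions for illustration, not the author's actual training code.

```python
from typing import List


def brier_reward(prob: float, outcome: int) -> float:
    """Negative squared error for one binary forecast (assumed reward shape).

    0.0 is the best possible reward, -1.0 the worst.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return -((prob - outcome) ** 2)


def group_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of sampled answers to the same question.

    Each reward is centered on the group mean and divided by the group
    standard deviation; eps (an assumed constant) avoids division by zero.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]


# Example: four sampled forecasts for a question that resolved "yes" (1).
sampled_probs = [0.9, 0.6, 0.5, 0.2]
rewards = [brier_reward(p, 1) for p in sampled_probs]
advantages = group_advantages(rewards)
```

In this sketch the forecast closest to the resolved outcome gets the largest advantage, so the policy is nudged toward probabilities that score well under the Brier rule rather than toward hard yes/no answers.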

Results: Slight accuracy edge over GPT-5 (Brier 0.194 vs 0.200), but big gains in calibration — the RL-tuned model produces much better probabilities (ECE 0.079 vs 0.091).
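For readers unfamiliar with the two metrics, the following is a rough sketch of how a Brier score and an expected calibration error (ECE) can be computed for binary forecasts. The post does not state which ECE variant or how many bins were used, so the equal-width binning and `n_bins=10` here are assumptions.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """ECE with equal-width probability bins (assumed variant).

    For each bin, compare the mean forecast probability with the observed
    frequency of "yes" outcomes, weighted by the bin's share of all forecasts.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, o))

    total = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_prob = sum(p for p, _ in b) / len(b)
        observed = sum(o for _, o in b) / len(b)
        ece += (len(b) / total) * abs(mean_prob - observed)
    return ece
```

Both metrics are lower-is-better: Brier score mixes accuracy and calibration, while ECE isolates how closely stated probabilities match observed frequencies.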

Dataset: https://huggingface.co/datasets/LightningRodLabs/WWTD-2025

This is a fully automated way to spin up domain expert LLMs from public web data with just a few search queries, no labeling/annotation required.

I’d love any feedback, or suggestions for what domain expert to train next!

Comments

sleno•1h ago
interesting...what were some examples of things trump did that your model got right and gpt-5 got wrong?
bturtel•57m ago
Great question! It's probabilistic, so it's not really "right vs wrong" on any single question, but who better estimated the likelihood. One big difference shows up when there's no useful context: we ran the same eval WITHOUT including any up-to-date context with the questions. In this case, GPT-5 stays overconfident and its BSS (Brier Skill Score) drops to -11.3% (vs -4.3% for ours), which is worse than just guessing the base rate. So one advantage of the RL training is simply learning to know what you don't know and to identify when there's real signal.
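The BSS in the reply above is presumably the Brier Skill Score, which compares a forecaster's Brier score with that of a reference forecast. A minimal sketch follows, assuming the reference is a constant forecast equal to the base rate of the evaluation set itself; the thread does not say how the base rate was obtained, so that choice is an assumption.

```python
from typing import Sequence


def brier_skill_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """BSS = 1 - BS_model / BS_reference.

    The reference always predicts the base rate of the outcomes (assumed
    here to be computed from the same evaluation set). A positive value
    beats the base rate, zero matches it, and a negative value is worse.
    """
    n = len(outcomes)
    base_rate = sum(outcomes) / n
    bs_model = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    bs_ref = sum((base_rate - o) ** 2 for o in outcomes) / n
    if bs_ref == 0:
        raise ValueError("all outcomes are identical; BSS is undefined")
    return 1.0 - bs_model / bs_ref


outcomes = [1, 0, 1, 0]
hedged = brier_skill_score([0.5, 0.5, 0.5, 0.5], outcomes)         # matches base rate
overconfident = brier_skill_score([0.0, 1.0, 1.0, 0.0], outcomes)  # confident, half wrong
```

In this toy example the overconfident forecaster scores below zero even though it gets half the questions exactly right, which illustrates why a model that hedges toward the base rate when it lacks signal can end up with a better BSS.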

There is no AI in accountability

https://5blockchains.com/posts/accountability/
1•betareducer•2m ago•0 comments

Google vs. SerpApi: We're Filing a Motion to Dismiss

https://serpapi.com/blog/google-v-serpapi-motion-to-dismiss-why-were-in-the-right/
1•paigealyse•3m ago•0 comments

Google vs. SerpApi: We're Filing a Motion to Dismiss

https://serpapi.com/blog/google-v-serpapi-motion-to-dismiss-why-were-in-the-right/
1•SerpApi•3m ago•0 comments

Predator spyware exploits SpringBoard to block iOS recording

https://appleinsider.com/articles/26/02/19/iphone-camera-microphone-dot-can-be-suppressed-if-your...
2•bookofjoe•4m ago•0 comments

Show HN: Running Debian on the OpenWrt One

https://github.com/sjoerdsimons/openwrt-one-debian
4•mfilion•5m ago•0 comments

Performance of Deep Material Networks for Multiscale Material Modeling

https://arxiv.org/abs/2602.07192
1•PaulHoule•7m ago•0 comments

Show HN: Skills – Making AI coding tools aware of government standards

https://anneschuth.nl/en/2026/02/20/skills/
2•aschuth•9m ago•1 comments

Show HN: Segspec (CLI) K8s NetworkPolicies from App Configs (Go)

https://github.com/dormstern/segspec
1•dormstern•9m ago•1 comments

German Grooms, Irish Brides: How Immigrant Communities Married into Each Other

https://www.points-of-entry.com/p/marriage-and-the-melting-pot-part-2
1•CGMthrowaway•10m ago•0 comments

Programming Is Forgetting: Toward a New Hacker Ethic (2016)

http://opentranscripts.org/transcript/programming-forgetting-new-hacker-ethic/
1•laurex•11m ago•0 comments

Michael Abrash's Zen of Assembly Language (1990)

https://github.com/jagregory/abrash-zen-of-asm
1•tosh•11m ago•0 comments

Wikipedia bans Archive.today after site executed DDoS and altered web captures

https://arstechnica.com/tech-policy/2026/02/wikipedia-bans-archive-today-after-site-executed-ddos...
3•nobody9999•11m ago•0 comments

Show HN: LLMWise – Compare, Blend, and Judge LLM Outputs from One API

https://llmwise.ai/
1•dm118•12m ago•0 comments

Do We Need a Programming Language Built Just for AI Agents?

https://app.writtte.com/read/43JMxH9
1•lasgawe•12m ago•0 comments

From Software Guilds to Software Factories

https://wjgilmore.com/articles/goodbye-software-guilds-hello-software-factories
1•wjgilmore•14m ago•0 comments

Five Memorable Books About Programming

https://prog21.dadgum.com/19.html
1•tosh•15m ago•0 comments

Warden

https://github.com/getsentry/warden
1•jshchnz•15m ago•0 comments

Show HN: Together, multiplayer drawing chat room

https://together.tldraw.com/
2•steveruizok•16m ago•1 comments

ClawDuck

https://www.clawduck.com/
1•vandanaTalentR•16m ago•0 comments

Cloudflare Outage

https://downdetector.com/status/cloudflare/
5•xmprt•17m ago•1 comments

A collection of scripts to modernize CLI file management

https://github.com/terpinedream/Bashd
2•terpinedream•18m ago•0 comments

Show HN: An offline-first ski analysis app

1•skicoachapp•18m ago•1 comments

The Most Important Decisions Are Non-Technical

https://prog21.dadgum.com/137.html
1•tosh•18m ago•1 comments

Wisdom of the Crowd: How Network Topology Distorts Collective Perception

https://arxiv.org/abs/2602.17146
1•Anon84•19m ago•0 comments

7-Eleven bets on Australian stores to show it can grow globally

https://www.japantimes.co.jp/business/2026/02/19/companies/seven-eleven-australia/
1•mikhael•20m ago•0 comments

Show HN: Locational Variable Theory – An informational framework for physics

https://github.com/TobeyStar/LVT-Theoretical-Physics-Information-Space
1•TobeyStar•20m ago•1 comments

Show HN: Vibe coded iOS workout app with Apple Watch support

https://apps.apple.com/us/app/fitwit-ai-personal-trainer/id6757002413
3•avsavani•23m ago•0 comments

Stateful Agents and Basic Memory

https://www.danielcorin.com/posts/2026/stateful-agents/
2•danielcorin•23m ago•0 comments

Show HN: SQL Query Optimizer

https://github.com/SubhanHakverdiyev/OptimizeQL/blob/main/README.md
1•hura17•24m ago•0 comments

Ask HN: What is the current adoption scenario for background coding agents?

1•daemon_9009•24m ago•1 comments