Fire-juggling unicyclist caught performing on crossing

https://news.sky.com/story/fire-juggling-unicyclist-caught-performing-on-crossing-13504459

1•austinallegro•16s ago•0 comments

Restoring a lost 1981 Unix roguelike (protoHack) and preserving Hack 1.0.3

https://github.com/Critlist/protoHack

1•Critlist•1m ago•0 comments

GPS and Time Dilation – Special and General Relativity

https://philosophersview.com/gps-and-time-dilation/

1•mistyvales•5m ago•0 comments

Show HN: Witnessd – Prove human authorship via hardware-bound jitter seals

https://github.com/writerslogic/witnessd

1•davidcondrey•5m ago•1 comment

Show HN: I built a clawdbot that texts like your crush

https://14.israelfirew.co

1•IsruAlpha•7m ago•0 comments

Scientists reverse Alzheimer's in mice and restore memory (2025)

https://www.sciencedaily.com/releases/2025/12/251224032354.htm

1•walterbell•10m ago•0 comments

Compiling Prolog to Forth [pdf]

https://vfxforth.com/flag/jfar/vol4/no4/article4.pdf

1•todsacerdoti•11m ago•0 comments

Show HN: Cymatica – an experimental, meditative audiovisual app

https://apps.apple.com/us/app/cymatica-sounds-visualizer/id6748863721

1•_august•13m ago•0 comments

GitBlack: Tracing America's Foundation

https://gitblack.vercel.app/

2•martialg•13m ago•0 comments

Horizon-LM: A RAM-Centric Architecture for LLM Training

https://arxiv.org/abs/2602.04816

1•chrsw•13m ago•0 comments

We just ordered shawarma and fries from Cursor [video]

https://www.youtube.com/shorts/WALQOiugbWc

1•jeffreyjin•14m ago•1 comment

Correctio

https://rhetoric.byu.edu/Figures/C/correctio.htm

1•grantpitt•14m ago•0 comments

Trying to make an Automated Ecologist: A first pass through the Biotime dataset

https://chillphysicsenjoyer.substack.com/p/trying-to-make-an-automated-ecologist

1•crescit_eundo•18m ago•0 comments

Watch Ukraine's Minigun-Firing, Drone-Hunting Turboprop in Action

https://www.twz.com/air/watch-ukraines-minigun-firing-drone-hunting-turboprop-in-action

1•breve•19m ago•0 comments

Free Trial: AI Interviewer

https://ai-interviewer.nuvoice.ai/

1•sijain2•19m ago•0 comments

FDA intends to take action against non-FDA-approved GLP-1 drugs

https://www.fda.gov/news-events/press-announcements/fda-intends-take-action-against-non-fda-appro...

19•randycupertino•20m ago•5 comments

Supernote e-ink devices for writing like paper

https://supernote.eu/choose-your-product/

3•janandonly•23m ago•0 comments

We are QA Engineers now

https://serce.me/posts/2026-02-05-we-are-qa-engineers-now

1•SerCe•23m ago•0 comments

Show HN: Measuring how AI agent teams improve issue resolution on SWE-Verified

https://arxiv.org/abs/2602.01465

2•NBenkovich•23m ago•0 comments

Adversarial Reasoning: Multiagent World Models for Closing the Simulation Gap

https://www.latent.space/p/adversarial-reasoning

1•swyx•24m ago•0 comments

Show HN: Poddley.com – Follow people, not podcasts

https://poddley.com/guests/ana-kasparian/episodes

1•onesandofgrain•32m ago•0 comments

Layoffs Surge 118% in January – The Highest Since 2009

https://www.cnbc.com/2026/02/05/layoff-and-hiring-announcements-hit-their-worst-january-levels-si...

11•karakoram•32m ago•0 comments

Papyrus 114: Homer's Iliad

https://p114.homemade.systems/

1•mwenge•32m ago•1 comment

DicePit – Real-time multiplayer Knucklebones in the browser

https://dicepit.pages.dev/

1•r1z4•32m ago•1 comment

Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

https://arxiv.org/abs/2601.14340

2•PaulHoule•34m ago•0 comments

Show HN: AI Agent Tool That Keeps You in the Loop

https://github.com/dshearer/misatay

2•dshearer•35m ago•0 comments

Why Every R Package Wrapping External Tools Needs a Sitrep() Function

https://drmowinckels.io/blog/2026/sitrep-functions/

1•todsacerdoti•35m ago•0 comments

Achieving Ultra-Fast AI Chat Widgets

https://www.cjroth.com/blog/2026-02-06-chat-widgets

2•thoughtfulchris•37m ago•0 comments

Show HN: Runtime Fence – Kill switch for AI agents

https://github.com/RunTimeAdmin/ai-agent-killswitch

1•ccie14019•40m ago•1 comment

Researchers surprised by the brain benefits of cannabis usage in adults over 40

https://nypost.com/2026/02/07/health/cannabis-may-benefit-aging-brains-study-finds/

2•SirLJ•41m ago•0 comments


Benchmark Scores Aren't Enough: A/B Testing AI in Production

https://blog.growthbook.io/how-to-a-b-test-ai-a-practical-guide/

2•royalfig•9mo ago

Comments

royalfig•9mo ago

Goodhart's Law states, "When a measure becomes a target, it ceases to be a good measure."

This applies to our favorite LLMs, too. As they are optimized to score high on benchmarks, how do we know those gains also carry over to real-world performance, such as accuracy, latency, or user engagement?

A/B testing AI models gives you real feedback on how your LLM and its configuration are performing.
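To make the idea concrete, here is a minimal sketch of what an A/B test between two model configurations might look like. It is not taken from the linked article: the function names, the variant labels `model_a` and `model_b`, and the choice of a binary engagement metric compared with a two-proportion z-test are all assumptions made for illustration.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str,
                   variants=("model_a", "model_b")) -> str:
    """Deterministically bucket a user into a variant (hypothetical helper).

    Hashing the experiment name together with the user ID keeps the
    assignment stable across requests without storing any state.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z_test(success_a: int, total_a: int,
                          success_b: int, total_b: int):
    """Compare two success rates, e.g. thumbs-up rate per model variant.

    Returns (z, two_sided_p_value) using the pooled standard error.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both groups need at least one observation")
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("no variance in outcomes; test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

In use, each request would call `assign_variant(user_id, "llm-config-test")` to pick which model serves the user, log whether the response met the chosen success criterion, and then pass the per-variant counts to `two_proportion_z_test`. Continuous metrics such as latency would need a different test, and a real setup would also fix the sample size in advance.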