
Show HN: Novum – Automated ML Research Pipeline with Anti-Fabrication Guards

https://github.com/euanai/novum
1•euanai•1h ago

Comments

euanai•1h ago
Hi HN! I'm the author.

Novum is a Claude Code extension that runs an autonomous ML research loop with mechanical guardrails designed to reduce result fabrication.

The key idea is that instead of relying on prompts like "don't hallucinate", the system enforces constraints mechanically (e.g., preventing edits to protected result files and enforcing phase gates in the research pipeline).
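A minimal sketch of what "preventing edits to protected result files" could look like as a mechanical check, written in Python for illustration only. The glob patterns and the `may_write` helper are assumptions made up for this example; the thread does not say which paths Novum protects or how its guard is implemented.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical patterns: the thread does not list the protected paths.
PROTECTED_GLOBS = ("results/*", "state/champion.json")


def may_write(path: str) -> bool:
    """Return False when a write targets a protected path.

    The path is normalized first so that "./results/x" or
    "notes/../results/x" cannot slip past the pattern check.
    """
    normalized = posixpath.normpath(path)
    return not any(fnmatchcase(normalized, glob) for glob in PROTECTED_GLOBS)


if __name__ == "__main__":
    for candidate in ("notes/plan.md", "results/run1/metrics.json"):
        verdict = "allow" if may_write(candidate) else "deny"
        print(f"{verdict}: {candidate}")
```

The point of a check like this is that the decision is made by code outside the model's control, so a prompt cannot talk its way around it.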

In a recent test run, a single /research command ran autonomously for about 30 hours: 10 hypotheses tested, 4 iteration cycles, and one champion solution selected.

Happy to answer questions or hear feedback on the guard design and research workflow.

isaackeitor•1h ago
Two things I'm curious about:

- How strict are the phase gates? Like, is it a hard checklist or can the system be more lenient depending on the task?
- When picking the champion solution out of 10 hypotheses, what's actually being measured?

euanai•56m ago
Great questions!

Phase gates are hard. A PreToolUse hook (phase-gate-guard.js) checks prerequisites before allowing state.json updates, and if something's missing, the write gets denied. For example, Phase 1→2 won't pass without literature-review.md (>2000 words), ≥10 papers in metadata, and a references.bib. Phase 6→7 needs a completed tournament with a champion. No exceptions: the agent just can't advance. There are some softer warnings too, but the main gates are hard blocks.
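As a rough illustration of the Phase 1→2 gate described above, here is a Python sketch of the prerequisite check. It is not the actual phase-gate-guard.js, whose contents are not shown in this thread. The `papers.json` file name, its list layout, and the `phase_1_to_2_blockers` function are assumptions for the example; only the three prerequisites and their thresholds come from the comment.

```python
import json
from pathlib import Path

MIN_REVIEW_WORDS = 2000  # from the thread: the review must exceed 2000 words
MIN_PAPERS = 10          # from the thread: at least 10 papers in metadata


def phase_1_to_2_blockers(workspace: Path) -> list[str]:
    """Return unmet prerequisites; an empty list means the gate passes."""
    blockers = []

    review = workspace / "literature-review.md"
    if not review.is_file():
        blockers.append("literature-review.md is missing")
    elif len(review.read_text(encoding="utf-8").split()) <= MIN_REVIEW_WORDS:
        blockers.append("literature-review.md must exceed 2000 words")

    # Hypothetical metadata file: a JSON list with one entry per paper.
    metadata = workspace / "papers.json"
    papers = []
    if metadata.is_file():
        papers = json.loads(metadata.read_text(encoding="utf-8"))
    if len(papers) < MIN_PAPERS:
        blockers.append(f"need at least {MIN_PAPERS} papers, found {len(papers)}")

    if not (workspace / "references.bib").is_file():
        blockers.append("references.bib is missing")

    return blockers
```

A hook built on a check like this would deny the state.json write whenever the returned list is non-empty, and could report the list back to the agent as the reason.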

For champion selection, it's Successive Halving. All hypotheses compete in Round 1 (15% of GPU budget), the top half survive to Round 2 (30%), and the champion gets Round 3 (55%). Each round eliminates the bottom half by score. The score is a weighted mix of metric improvement, mechanism signal quality, compute efficiency, and novelty. The weights shift depending on the venue target (oral cares more about novelty, poster cares more about raw metric gains).
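The tournament described above can be sketched in Python as follows. The 15/30/55 budget split and the four score components come from the comment; the weight values, the `evaluate` callback, and the rounding rule for odd survivor counts are assumptions, since the thread gives none of them. In particular, this sketch halves the field after Rounds 1 and 2 and picks the top scorer after Round 3, which may differ from how Novum handles the final round.

```python
import math
from typing import Callable, Dict, Sequence

BUDGET_SHARES = (0.15, 0.30, 0.55)  # per-round GPU budget split from the thread

# Hypothetical weights: the thread only says oral favors novelty
# and poster favors raw metric gains.
WEIGHTS = {
    "oral": {"metric": 0.30, "mechanism": 0.20, "efficiency": 0.10, "novelty": 0.40},
    "poster": {"metric": 0.50, "mechanism": 0.20, "efficiency": 0.15, "novelty": 0.15},
}


def weighted_score(components: Dict[str, float], venue: str) -> float:
    """Combine the four score components using the venue's weights."""
    weights = WEIGHTS[venue]
    return sum(weights[name] * components[name] for name in weights)


def successive_halving(
    hypotheses: Sequence[str],
    evaluate: Callable[[str, float], Dict[str, float]],
    total_budget: float,
    venue: str,
) -> str:
    """Run three rounds, dropping the bottom half each time; return the champion."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    survivors = list(hypotheses)
    for round_index, share in enumerate(BUDGET_SHARES):
        # Split this round's budget evenly across the remaining hypotheses.
        per_hypothesis = total_budget * share / len(survivors)
        ranked = sorted(
            survivors,
            key=lambda h: weighted_score(evaluate(h, per_hypothesis), venue),
            reverse=True,
        )
        is_last = round_index == len(BUDGET_SHARES) - 1
        keep = 1 if is_last else math.ceil(len(ranked) / 2)
        survivors = ranked[:keep]
    return survivors[0]
```

Here `evaluate(hypothesis, budget)` stands in for actually training and measuring a hypothesis under a given budget, returning a dict with the keys `metric`, `mechanism`, `efficiency`, and `novelty`. With 10 hypotheses this sketch keeps 5 after Round 1 and 3 after Round 2.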