frontpage.
Made with ♥ by @iamnishanth

TSArena – Independent blind pairwise testing for AI safety

https://tsarena.ai/
1•solosevn•1h ago

Comments

solosevn•1h ago
TSArena is an independent platform for blind pairwise evaluation of AI safety behavior. Users are shown two anonymous model responses to the same safety-relevant prompt — jailbreaks, harm refusal, manipulation, medical misinfo, and more — then vote on which model handled it better. No cherry-picking, no corporate benchmarks. 500 battles live across 12 safety categories with models from OpenAI, Anthropic, Google, Meta, Mistral, and others. Built because safety evals shouldn't be graded by the companies building the models.

Mercury 2: The First Diffusion Model That 'Thinks' [video]

https://www.youtube.com/watch?v=Bqdf6Um_8OE
1•matthewsinclair•1m ago•0 comments

Code World Models for Parameter Control in Evolutionary Algorithms

https://www.alphaxiv.org/abs/2602.22260
1•camilochs•1m ago•0 comments

ProofGateway – Collect and publish customer testimonials in minutes

2•elufadeju•8m ago•0 comments

JSON-up: Stop scattering "if" checks for old JSON formats across your codebase

https://github.com/Nano-Collective/json-up
2•mrspence•8m ago•1 comments

TV's TV (1987) & TV Games Encyclopedia (1988)

https://blog.gingerbeardman.com/2026/03/01/tvs-tv-1987-and-tv-games-encyclopedia-1988/
2•msephton•17m ago•0 comments

Nvidia and Global Telecom Leaders Commit to Build 6G on AI-Native Platforms

https://nvidianews.nvidia.com/news/nvidia-and-global-telecom-leaders-commit-to-build-6g-on-open-a...
3•zinekeller•22m ago•0 comments

Vinext Explained: Rebuilding Next.js with AI in One Week (4x Faster Builds) [video]

https://www.youtube.com/watch?v=AF3Rr4MENCo
2•emot•23m ago•0 comments

AI agent with 2 deps that uses Shannon Entropy to decide when to act vs. ask

https://github.com/borhen68/picoagents
2•borhensaidi•27m ago•2 comments

Online course about buying hotels

https://www.myfirsthotel.com/
2•bhagyash•27m ago•1 comments

Ask HN: How will most Anthropic customers respond to the threats by the govt?

2•Poomba•31m ago•2 comments

For Sale: The Last Honda V10 Ayrton Senna Ever Raced (2025)

https://silodrome.com/last-honda-v10-ayrton-senna-raced/
3•naves•33m ago•0 comments

Editor at 184-year-old Cleveland Plain Dealer pushes to let AI draft news articles

https://www.washingtonpost.com/technology/2026/03/01/ai-journalism-writing-cleveland-plain-dealer/
2•bookofjoe•36m ago•1 comments

An Interview with the AI They Called a National Security Threat

https://www.woodrow.fyi/p/a-letter-from-inside-the-machine
3•heywoods•40m ago•0 comments

Researchers Deanonymize Reddit and Hacker News Users at Scale

https://threatroad.substack.com/p/researchers-deanonymize-reddit-and
6•hk_flying_gear•42m ago•1 comments

California wants heat pumps. High power bills might get in the way

https://www.latimes.com/california/story/2026-03-01/california-wants-millions-of-heat-pumps-high-...
3•dangle1•42m ago•0 comments

Claude Prompt to Find Inefficiencies in LLM Usage

https://www.maniac.ai/slm-audit
2•dhruv_m•43m ago•1 comments

The Two Kinds of Error

https://evanhahn.com/the-two-kinds-of-error/
2•zdw•44m ago•0 comments

Show HN: Tired of making accounts to split a pizza bill, I built Dividdy

https://dividdy.com/en
2•jezzlucena•44m ago•0 comments

Thaura

https://thaura.ai
3•abdelhousni•45m ago•0 comments

The Agentic Dispatch: The Last Edition

https://the-agentic-dispatch.com/the-last-edition/
3•greensleeves123•47m ago•1 comments

Show HN: Logira – eBPF runtime auditing for AI agent runs

https://github.com/melonattacker/logira
2•melonattacker•48m ago•0 comments

Show HN: Tech Digest – Top Products from PH/HN

https://techdigest.live/
2•vaibhav0806•53m ago•0 comments

Podcast Listenership Outranks Talk Radio for the First Time

https://www.cnet.com/tech/services-and-software/podcasts-officially-outrank-talk-radio-for-the-fi...
3•geox•53m ago•0 comments

Show HN: Gala – Sealed types, pattern matching, and monads for Go

https://github.com/martianoff/gala
2•mmcodes•55m ago•2 comments

1978: Could You Survive Without Modern Technology? [video]

https://www.youtube.com/watch?v=WXZpjZidCNk
3•sys_64738•56m ago•0 comments

FCaptcha – A modern CAPTCHA system designed to detect everything

https://github.com/WebDecoy/FCaptcha
2•cport1•57m ago•0 comments

Right-sizes LLM models to your system's RAM, CPU, and GPU

https://github.com/AlexsJones/llmfit
4•bilsbie•58m ago•0 comments

Tell HN: Discover using old phone numbers from data broker for SMS 2FA

2•throwawaycDpvY•1h ago•0 comments

Show HN: I built speedmux, a libghostty-powered terminal multiplexer

https://github.com/webforspeed/speedmux
2•n89nanda•1h ago•1 comments

TeX Live 2026 is released

https://tug.org/pipermail/tex-live/2026-March/052232.html
5•gucci-on-fleek•1h ago•2 comments