frontpage.

Evals in 2025: benchmarks to build models people can use

https://github.com/huggingface/evaluation-guidebook/blob/main/yearly_dives/2025-evaluations-for-useful-models.md
20•jxmorris12•2d ago

Comments

aplassard•2h ago
I think cost should also be a direct consideration. Model performance varies wildly on benchmarks when given a budget. https://substack.com/@andrewplassard/note/p-173487568?r=2fqo...
elemeno•1h ago
I’ve been building a tool to help with this - Safety Evals In-a-Box [https://github.com/elemeno/seibox]. It’s a work in progress and not quite ready for public release, but it’s a multi-model eval runner (primarily for safety-oriented evals, but no reason why it can’t run other types as well!) and includes cost and latency in its reporting.

Cormac McCarthy's tips on how to write a science paper (2019) [pdf]

https://gwern.net/doc/science/2019-savage.pdf
113•surprisetalk•4h ago•37 comments

Designing NotebookLM

https://jasonspielman.com/notebooklm
12•vinhnx•1h ago•0 comments

Seattle Ultrasonics: Ultrasonic Chef's Knife

https://seattleultrasonics.com/
31•hemloc_io•2h ago•17 comments

Scream cipher

https://sethmlarson.dev/scream-cipher
204•alexmolas•2d ago•81 comments

Living microbial cement supercapacitors with reactivatable energy storage

https://www.cell.com/cell-reports-physical-science/fulltext/S2666-3864(25)00409-6
62•PaulHoule•4h ago•35 comments

The LLM Lobotomy

https://learn.microsoft.com/en-us/answers/questions/5561465/the-llm-lobotomy
20•sgt3v•21m ago•0 comments

Images over DNS

https://dgl.cx/2025/09/images-over-dns
111•dgl•6h ago•31 comments

MapSCII – World Map in Terminal

https://github.com/rastapasta/mapscii
89•_august•2d ago•14 comments

Are touchscreens in cars dangerous?

https://www.economist.com/science-and-technology/2025/09/19/are-touchscreens-in-cars-dangerous
78•Brajeshwar•2h ago•68 comments

Less is safer: How Obsidian reduces the risk of supply chain attacks

https://obsidian.md/blog/less-is-safer/
469•saeedesmaili•20h ago•220 comments

Bezier Curve as Easing Function in C++

https://asawicki.info/news_1790_bezier_curve_as_easing_function_in_c
30•ibobev•5h ago•4 comments

Claude Can (Sometimes) Prove It

https://www.galois.com/articles/claude-can-sometimes-prove-it
156•lairv•3d ago•45 comments

Escapee pregnancy test frogs colonised Wales for 50 years

https://www.bbc.com/news/uk-wales-44886585
80•Luc•4d ago•31 comments

Is Zig's New Writer Unsafe?

https://www.openmymind.net/Is-Zigs-New-Io-Unsafe/
105•ibobev•4h ago•88 comments

Show HN: Math2Tex – Convert handwritten math and complex notes to LaTeX text

23•leoyixing•3d ago•11 comments

Scientists find that ice generates electricity when bent

https://phys.org/news/2025-09-scientists-ice-generates-electricity-bent.html
61•isaacfrond•3d ago•18 comments

If all the world were a monorepo

https://jtibs.substack.com/p/if-all-the-world-were-a-monorepo
233•sebg•4d ago•63 comments

Evals in 2025: benchmarks to build models people can use

https://github.com/huggingface/evaluation-guidebook/blob/main/yearly_dives/2025-evaluations-for-u...
20•jxmorris12•2d ago•2 comments

The Gentrification of Videogame History

https://felipepepe.medium.com/the-gentrification-of-video-game-history-dfe11f1e08ae
3•akkartik•2d ago•0 comments

The best YouTube downloaders, and how Google silenced the press

https://windowsread.me/p/best-youtube-downloaders
490•Leftium•1d ago•214 comments

PyPI Blog: Token Exfiltration Campaign via GitHub Actions Workflows

https://blog.pypi.org/posts/2025-09-16-github-actions-token-exfiltration/
59•miketheman•3d ago•17 comments

Ants that seem to defy biology – They lay eggs that hatch into another species

https://www.smithsonianmag.com/smart-news/these-ant-queens-seem-to-defy-biology-they-lay-eggs-tha...
452•sampo•1d ago•146 comments

Visa holders on vacation have 15 hours to return to US or pay $100k fee

https://timesofindia.indiatimes.com/technology/tech-news/microsoft-has-a-24-hour-deadline-warning...
167•irthomasthomas•5h ago•221 comments

Show HN: FocusStream – Focused, distraction-free YouTube for learners

https://focusstream.media
70•pariharAshwin•11h ago•39 comments

LLM-Deflate: Extracting LLMs into Datasets

https://www.scalarlm.com/blog/llm-deflate-extracting-llms-into-datasets/
50•gdiamos•11h ago•25 comments

Show HN: Zedis – A Redis clone I'm writing in Zig

https://github.com/barddoo/zedis
143•barddoo•20h ago•93 comments

Show HN: WeUseElixir - Elixir project directory

https://weuseelixir.com/
198•taddgiles•22h ago•50 comments

IG Nobel Prize Winners 2025

https://improbable.com/ig/winners/
127•JeremyTheo•7h ago•35 comments

Internet Archive's big battle with music publishers ends in settlement

https://arstechnica.com/tech-policy/2025/09/internet-archives-big-battle-with-music-publishers-en...
350•coloneltcb•4d ago•148 comments

What Makes System Calls Expensive: A Linux Internals Deep Dive

https://blog.codingconfessions.com/p/what-makes-system-calls-expensive
45•rbanffy•4h ago•4 comments