frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
1•Brajeshwar•4m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
2•Brajeshwar•4m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
1•Brajeshwar•4m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•7m ago•0 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
1•righthand•10m ago•0 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•11m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•11m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
2•vinhnx•12m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•17m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•21m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•26m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•27m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•28m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
5•okaywriting•34m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
2•todsacerdoti•37m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•38m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•39m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•40m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•40m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•41m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
3•pseudolus•41m ago•1 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•45m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•45m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•46m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•46m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•55m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•55m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
2•surprisetalk•57m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•57m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
2•surprisetalk•57m ago•0 comments
Open in hackernews

Show HN: Atlas: Independent Evals and Benchmarking for Generative AI Models

https://app.layerlens.ai/
1•Arch223•8mo ago
We are LayerLens, a project focused on building better resources for independent, transparent evals for frontier AI models. Atlas is a community resource intended to provide insights about the performance of the top foundational models through independent evals on benchmarks such as MATH, HumanEval, and MMLU. LayerLens is a team of engineers and data scientists who have been constantly frustrated by the lack of independent verification for LLM performance. Most benchmarks come from the model creators themselves, and for developers, building an independent evaluation pipeline is often more trouble than it is worth. Open-source leaderboards, while admirable, often do not provide enough transparency, and are often too scientific for the average user. While evals have historically been a tool to measure the proverbial progress toward AGI, they have become increasingly relevant for validating LLM performance. Large enterprise teams and independent hackers alike use evals as a way to select the right model for a particular use-case, all while depending on singular “accuracy” metrics. Atlas is an LLM analytics leaderboard that is both simple and highly detailed. You can view the top models, sorted by region, vendor type, or a particular use-case, via evaluation spaces. You can use the battleground to compare two models on an individual benchmark, getting prompt by prompt comparisons for each entry. For any individual evaluation run, you can get a clean summary of model performance on individual subsets. And finally, each model page has its own dedicated analytics and information section. This is only our first iteration of the product. We eventually want to release the same suite for custom models, agents, evals and more. We will be around to answer any questions on our product!