Ask HN: How do you find SOTA LLMs for a task?

1•throwaw12•6mo ago
There are thousands of models available on Hugging Face at the moment. But whenever I need a model for a specific task, I struggle to find the SOTA model. Can you recommend how to find it?

I am not an ML practitioner; I just need models for my work. For example, for coding I know we can use Claude/Gemini models, but sometimes I want to compare them to SOTA open source models. Every week something better comes out, and reading month-old articles or finding an LLM leaderboard for a specific task is sometimes difficult. I think some kind of model picker already exists, but I don't know where.

Comments

Oras•6mo ago
I usually check OpenRouter's usage rankings, which break this down by category: https://openrouter.ai/rankings

Scroll down to the categories and select one from the dropdown at the top right of the chart.

throwaw12•6mo ago
That's a nice addition to my tool set :) Thanks!

But it seems to mostly reflect proprietary models (because they are easier to serve).

incomingpain•6mo ago
For open source, you're not going to see usage stats reflected well online. OpenHands + Devstral doesn't touch the internet, so it won't show up in many stats.

You can look at benchmarks.

https://livebench.ai/#/?Agentic+Coding=a

Keep scrolling until you see something your size. DeepSeek R1 is nice, but 600B isn't running on my hardware. You'll also notice they aren't covering everything; the list is dominated by the SaaS options.

https://huggingface.co/models

This is sorted by trending by default, which tends to show interest but not necessarily the best model.

throwaw12•6mo ago
> DeepSeek R1 is nice, but 600B isn't running on my hardware.

Yeah, this is my concern as well. The top SOTA general-purpose models are usually good at many tasks, but I can't quickly test them locally on my machine. Especially when I see claims that a 32B model outperforms proprietary models in benchmarks, I really want to test it myself on my own tasks, but after a while these models drop out of the news/trends and become difficult to find.
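
To make the "something your size" point concrete, here is a rough back-of-the-envelope sketch for checking which models could fit in local memory. It assumes a weights-only estimate (parameter count times bytes per parameter) and ignores KV cache, activations, and runtime overhead, so real requirements will be higher. The model names, the 24 GB budget, and the helper functions are hypothetical, chosen only for illustration.

```python
# Rough weights-only memory estimate. This is an assumption-laden sketch:
# it ignores KV cache, activations, and framework overhead.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


def models_that_fit(candidates: dict, budget_gb: float, precision: str = "int4") -> list:
    """Return the names of candidates whose weights fit within budget_gb."""
    return [
        name
        for name, size_b in candidates.items()
        if weight_memory_gb(size_b, precision) <= budget_gb
    ]


# Hypothetical candidates: name -> parameter count in billions.
candidates = {"model-600b": 600, "model-32b": 32, "model-7b": 7}

if __name__ == "__main__":
    for name, size_b in candidates.items():
        print(name, weight_memory_gb(size_b, "int4"), "GB at int4")
    print("Fits in 24 GB:", models_that_fit(candidates, budget_gb=24))
```

Under these assumptions, a 600B model needs about 300 GB for weights even at 4-bit precision, while a 32B model needs about 16 GB, which is why the latter is a realistic candidate for local testing and the former is not.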