frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

NextMatch – 5-minute video speed dating to reduce ghosting

https://nextmatchdating.netlify.app/
1•Halinani8•5s ago•1 comments

Personalizing esketamine treatment in TRD and TRBD

https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1736114
1•PaulHoule•1m ago•0 comments

SpaceKit.xyz – a browser‑native VM for decentralized compute

https://spacekit.xyz
1•astorrivera•2m ago•1 comments

NotebookLM: The AI that only learns from you

https://byandrev.dev/en/blog/what-is-notebooklm
1•byandrev•2m ago•1 comments

Show HN: An open-source starter kit for developing with Postgres and ClickHouse

https://github.com/ClickHouse/postgres-clickhouse-stack
1•saisrirampur•3m ago•0 comments

Game Boy Advance d-pad capacitor measurements

https://gekkio.fi/blog/2026/game-boy-advance-d-pad-capacitor-measurements/
1•todsacerdoti•3m ago•0 comments

South Korean crypto firm accidentally sends $44B in bitcoins to users

https://www.reuters.com/world/asia-pacific/crypto-firm-accidentally-sends-44-billion-bitcoins-use...
1•layer8•4m ago•0 comments

Apache Poison Fountain

https://gist.github.com/jwakely/a511a5cab5eb36d088ecd1659fcee1d5
1•atomic128•5m ago•1 comments

Web.whatsapp.com appears to be having issues syncing and sending messages

http://web.whatsapp.com
1•sabujp•6m ago•2 comments

Google in Your Terminal

https://gogcli.sh/
1•johlo•7m ago•0 comments

Shannon: Claude Code for Pen Testing: #1 on Github today

https://github.com/KeygraphHQ/shannon
1•hendler•8m ago•0 comments

Anthropic: Latest Claude model finds more than 500 vulnerabilities

https://www.scworld.com/news/anthropic-latest-claude-model-finds-more-than-500-vulnerabilities
2•Bender•12m ago•0 comments

Brooklyn cemetery plans human composting option, stirring interest and debate

https://www.cbsnews.com/newyork/news/brooklyn-green-wood-cemetery-human-composting/
1•geox•12m ago•0 comments

Why the 'Strivers' Are Right

https://greyenlightenment.com/2026/02/03/the-strivers-were-right-all-along/
1•paulpauper•14m ago•0 comments

Brain Dumps as a Literary Form

https://davegriffith.substack.com/p/brain-dumps-as-a-literary-form
1•gmays•14m ago•0 comments

Agentic Coding and the Problem of Oracles

https://epkconsulting.substack.com/p/agentic-coding-and-the-problem-of
1•qingsworkshop•15m ago•0 comments

Malicious packages for dYdX cryptocurrency exchange empties user wallets

https://arstechnica.com/security/2026/02/malicious-packages-for-dydx-cryptocurrency-exchange-empt...
1•Bender•15m ago•0 comments

Show HN: I built a <400ms latency voice agent that runs on a 4gb vram GTX 1650"

https://github.com/pheonix-delta/axiom-voice-agent
1•shubham-coder•15m ago•0 comments

Penisgate erupts at Olympics; scandal exposes risks of bulking your bulge

https://arstechnica.com/health/2026/02/penisgate-erupts-at-olympics-scandal-exposes-risks-of-bulk...
4•Bender•16m ago•0 comments

Arcan Explained: A browser for different webs

https://arcan-fe.com/2026/01/26/arcan-explained-a-browser-for-different-webs/
1•fanf2•17m ago•0 comments

What did we learn from the AI Village in 2025?

https://theaidigest.org/village/blog/what-we-learned-2025
1•mrkO99•18m ago•0 comments

An open replacement for the IBM 3174 Establishment Controller

https://github.com/lowobservable/oec
1•bri3d•20m ago•0 comments

The P in PGP isn't for pain: encrypting emails in the browser

https://ckardaris.github.io/blog/2026/02/07/encrypted-email.html
2•ckardaris•22m ago•0 comments

Show HN: Mirror Parliament where users vote on top of politicians and draft laws

https://github.com/fokdelafons/lustra
1•fokdelafons•23m ago•1 comments

Ask HN: Opus 4.6 ignoring instructions, how to use 4.5 in Claude Code instead?

1•Chance-Device•24m ago•0 comments

We Mourn Our Craft

https://nolanlawson.com/2026/02/07/we-mourn-our-craft/
1•ColinWright•27m ago•0 comments

Jim Fan calls pixels the ultimate motor controller

https://robotsandstartups.substack.com/p/humanoids-platform-urdf-kitchen-nvidias
1•robotlaunch•31m ago•0 comments

Exploring a Modern SMTPE 2110 Broadcast Truck with My Dad

https://www.jeffgeerling.com/blog/2026/exploring-a-modern-smpte-2110-broadcast-truck-with-my-dad/
1•HotGarbage•31m ago•0 comments

AI UX Playground: Real-world examples of AI interaction design

https://www.aiuxplayground.com/
1•javiercr•32m ago•0 comments

The Field Guide to Design Futures

https://designfutures.guide/
1•andyjohnson0•32m ago•0 comments
Open in hackernews

Ask HN: Agent evaluations, what is everything I should know?

3•akira_067•2mo ago
I'm currently building coding agents, and wondering what the standard is for creating and running evals for most people? I gather that the tasks and their definitions will be dramatically different across domains and instances, so I'm not hoping for a one size fits all. Just... what actually works for you in practice?

Comments

adastra22•2mo ago
The capabilities of the tool matter more. Claude Code, Codex, Cursor CLI all have different feature sets. This usually determines the choice more than base model capabilities.
esafak•2mo ago
They felt similar enough to me. One feature that did make a difference to me is when follow up prompts are acted on; sooner or later.
adastra22•2mo ago
If all you are doing is chatting with the model, they're effectively the same. If you are actually building agentic workflows, they differ immensely in supported features. Codex doesn't support subagent personalities, for example, and until this week Claude Code didn't support generating structured responses.
akira_067•2mo ago
You think so? What features are differentiated enough to warrant that?

Seems like they all have tools for reading files, editing, running shell commands.

Cursor has linting access, but this can be added to claude with hooks I think.

adastra22•2mo ago
Subagent personalities, skills, structured output generation, customizable context cleanup, etc.
akira_067•2mo ago
Most of those seem kind of useless... I probably do not know how to use them correctly, so, what would you recommend by way of subagents and skills?