
Show HN: 83 browser-use trajectories, visualized

https://trails-red.vercel.app/viewer
7•wayy•2w ago
Hey all, Justin here. I previously built Phind, the AI search engine for developers.

One of the biggest problems we had there was figuring out what went wrong with bad searches. We had tons of searches per day, but fewer than 1% of users gave any explicit feedback, so we were either manually digging through searches or making general system improvements and hoping they helped.

This problem gets harder with agents: traces are longer and more complex, so reviewing them takes more effort. I'm building a tool that analyzes LLM outputs directly, to help developers of LLM apps and agents understand where things are breaking and why.

I've put together a demo using browser-use agent traces (gpt-5): https://trails-red.vercel.app/viewer

It's early, but I have lots of ideas, such as live querying of past failures for currently running agents and preference models to expand sparse signal data.

Would love feedback on the demo. Also, if you're building agents and have 10k+ traces per day that you're not looking at but would like to, I'd love to talk.

Comments

Johnny_Bonk•2w ago
This is a cool project. I've also been trying to find some sort of leaderboard or benchmark to compare. I personally really like the Claude in Chrome agent, but unfortunately I don't think I can build it into projects yet.