fp.

Surprising to see so little traction on this; I hope this makes it to the second chance pool because it would be interesting to hear the LLM advocates' take on it.

Is the poor performance because the LLMs are not being used for iterative refinement?

andawaywego•2w ago

The main issues I could find from skimming are: conflating chat performance (sometimes very clearly tool-less) with agent performance; not allowing the AI to self-organize in a repo that stays consistent across videos; not providing any persisted feedback; and, to nitpick a little, providing the motivating material in a weird and inconsistent way, sometimes as links instead of downloaded files and such.

In some cases, even for the agent examples, I just have to assume that the AI encountered some issue applying tooling and was forced to run in text mode throughout? Unfortunately, there is so much missing context for the viewer about what the assignment, process, and expected and resulting output are that you can only really guess at what's going on from the behaviour that was most outwardly bewildering (to the OP).
