Ask HN: What explains the recent surge in LLM coding capabilities?

6•orange_puff•5h ago
It seems like we are in the midst of another AI hype cycle. Many people are calling the current coding models an "inflection point", where now the capabilities are so high that future model growth will be explosive. I have heard serious people, like economics writer Noah Smith, make this argument [0].

But it's not just the commentariat. I have seen very serious people in software engineering and tech talk about the ways in which their coding habits have changed drastically.

Benchmarks [1] alone don't seem to capture everything, although there have been jumps in the agentic sections, so maybe they actually do.

My question is: what explains these big jumps in capabilities that many serious people seem to be noticing all at once? Is it simply that we have thrown enough data and compute at the models, or are labs perhaps fine-tuning models to get really good at tool calls, which leads to this new, surprising behavior?

When I explain agents to people, I usually walk them through a manual task one might go through when debugging code. You copy some code into ChatGPT, it asks you for more context, you copy some more code in, it suggests an edit, you edit and run, there is an error, so you paste that in, and so on. An agent is just an LLM in that loop which can use tools to do those things automatically. It would not be shocking to me if taking a weaker model like Claude Opus 4.0 and making it 10x better at tool calls produced a much stronger and more impressive model. But is that all that is happening, or am I missing something big?
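For concreteness, the loop described above can be sketched roughly like this. It is a toy under stated assumptions: `run_agent`, `scripted_model`, and the two tools are hypothetical names made up for illustration, and the "model" is a hard-coded script standing in for a real LLM call. It is not how any particular lab or product implements agents.

```python
from typing import Callable, Dict, List, Tuple

# An action is (tool name, argument); the name "final" means "stop and answer".
Action = Tuple[str, str]


def run_agent(
    model: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 10,
) -> str:
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    transcript = [f"task: {task}"]
    for _ in range(max_steps):
        name, arg = model(transcript)
        if name == "final":
            return arg
        tool = tools.get(name)
        # A bad tool call is reported back to the model instead of crashing the loop.
        result = tool(arg) if tool else f"error: unknown tool {name!r}"
        transcript.append(f"{name}({arg!r}) -> {result}")
    return "gave up: step limit reached"


def scripted_model(transcript: List[str]) -> Action:
    """Hypothetical stand-in for an LLM: replays a fixed debugging session."""
    script = [
        ("read_file", "app.py"),
        ("run_tests", ""),
        ("final", "fixed the off-by-one in app.py"),
    ]
    step = len(transcript) - 1  # number of tool results seen so far
    return script[min(step, len(script) - 1)]


# Hypothetical tools that return canned strings instead of touching the disk.
tools = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda _: "1 failed: IndexError",
}

print(run_agent(scripted_model, tools, "make the tests pass"))
```

The only "agentic" part is the `for` loop: every copy-paste step from the manual workflow becomes a tool call whose output is appended to the transcript. In this framing, a model that picks the right tool and argument more often gets further before hitting the step limit, which is why better tool calling alone could plausibly look like a large capability jump.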

[0] https://substack.com/@noahpinion/p-187818379

[1] https://www.anthropic.com/news/claude-opus-4-6

Comments

coder4rover•4h ago
Quantum computing such that permutations of code to prompt is possible as it tries to answer to some kind of statistical probability solution.
