
Focus and Context and LLMs

https://taras.glek.net/posts/focus-and-context-and-llms/
42•tarasglek•10h ago

Comments

quantum_state•7h ago
Context is all you need :-)
tarasglek•7h ago
Indeed, that was my original working title
max2he•4h ago
bruh that's googles original working title
summarity•4h ago
I found the same in my personal work. I have o3 chats (as in OAI's Chat interface) that are so large they crash the site, yet o3 still responds without hallucination and can debug across 5k+ LOC. I've used it for DSP code, to debug a subtle error in an 800+ LOC Nim macro that sat in a 4k+ LOC module (it found the bug), to work on compute shaders for audio analysis, and to optimize graphics programs and other algorithms. Once I "vibe coded" (I hate that term) a fun demo using a color management lib I wrote, which encoded the tape state for a brainfuck interpreter in the deltaE differences between adjacent cells (sketched below). Replaying the same prompts in Claude chat and others doesn't even get close. It's spooky.

Yet when I use the Codex CLI, or agent mode in any IDE, it feels like o3 regresses to below GPT-3.5 performance. All recent agent-mode models seem completely overfitted to tool calling. The most laughable attempt is Mistral's devstral-small: allegedly the #1 agent model, but once you step outside the scenarios you'd encounter in SWE-bench & co, it completely falls apart.

I notice this at work as well: the more tools you give any model (reasoning or not), the more confused it gets. But the alternative is to stuff massive context into the prompts, and that has no ROI. There's a fine line to be walked here, but no one is even close to it yet.
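A minimal sketch of the deltaE tape-encoding idea described above, using plain Lab tuples and CIE76 deltaE (Euclidean distance in Lab space); the commenter's actual color-management library isn't shown here, so all names are invented for illustration, and real colors would additionally need gamut handling:

```python
import math

def encode_tape(values, start=(50.0, 0.0, 0.0)):
    """Encode 0-255 brainfuck tape values as a list of Lab colours where the
    CIE76 deltaE between adjacent cells equals the cell's value."""
    colours = [start]
    direction = 1
    for v in values:
        L, a, b = colours[-1]
        # Step along the b axis, alternating direction so the walk stays bounded.
        colours.append((L, a, b + direction * v))
        direction = -direction
    return colours

def decode_tape(colours):
    """Recover tape values as the CIE76 deltaE (Euclidean distance in Lab)
    between each pair of adjacent cells."""
    return [round(math.dist(c1, c2)) for c1, c2 in zip(colours, colours[1:])]

tape = [0, 1, 72, 255, 3]
assert decode_tape(encode_tape(tape)) == tape
```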

__mharrison__•4h ago
Building complex software is certainly possible with no coding and minimal prompting.

This YT video (from 2 days ago) demonstrates it: https://youtu.be/fQL1A4WkuJk?si=7alp3O7uCHY7JB16

The author builds a drawing app in an hour.

emorning3•4h ago
The article summed itself up as "Context is everything."

But the article itself also makes the point that a human assistant was necessary. That's gonna be my takeaway.

artembugara•3h ago
What are some startups that help precisely with "feeding the LLM the right context"?
Workaccount2•2h ago
I don't know why software engineers think that LLM coding ability is purpose-made for them to use, and that because it sort of sucks at it, it's therefore useless...

It's like listening to professional translators endlessly lament translation software and all its shortcomings and pitfalls, while totally missing that the software is primarily used by property managers wanting to ask the landscapers to cut the grass lower.

LLMs are excellent at writing code for people who have no idea what a programming language is, but a good idea of what computers can do when someone can speak this code language to them. I don't need an LLM to one-shot Excel.exe so I can track the number of members vs non-members who come to my community craft fair.
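To make the scale of that example concrete, the kind of script being described is only a few lines; a hypothetical version (the file name and column are made up for illustration) might look like:

```python
import csv
from collections import Counter

# Hypothetical attendance log: one row per visitor, with a "member" column
# containing "yes" or "no". The file name and column are invented examples.
counts = Counter()
with open("craft_fair_attendance.csv", newline="") as f:
    for row in csv.DictReader(f):
        counts["member" if row["member"].strip().lower() == "yes" else "non-member"] += 1

print(f"Members: {counts['member']}, non-members: {counts['non-member']}")
```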

jmward01•2h ago
This is definitely the right problem to focus on. I think the answer is a different LLM architecture with unlimited context. Transformers trained with causal masks over fixed-length blocks got us here, but they are now limiting us in major ways.
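For reference, the causal mask mentioned above is the standard decoder-training constraint that position i may attend only to positions ≤ i. A minimal numpy sketch of that masking, purely illustrative and not tied to any particular model:

```python
import numpy as np

def causal_mask(n):
    """Lower-triangular boolean mask: position i may attend only to positions <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

def masked_attention(scores):
    """Apply the causal mask to raw attention scores, then softmax each row."""
    n = scores.shape[0]
    scores = np.where(causal_mask(n), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

scores = np.random.randn(4, 4)
print(masked_attention(scores).round(2))  # row i has zero weight on columns > i
```
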
briian•1h ago
The funny thing about vibe coding is that god-tier vibe coders think they're in DGAF mode. But people who are actually in DGAF mode and just say "Make Instagram for me" think they're god tier.

But agreed, there needs to be a better way for these agents to figure out what context to select. It doesn't seem like this will be too hard a problem to solve, though?

tptacek•1h ago
This article is knocking down a very expansive claim that most serious (i.e. not vibe-coding) developers aren't making. The article's point is that LLM agents have not yet reached the point where they can finish a complicated job end-to-end, and that a completely hands-off project, where only the LLM generates any code, takes a lot of prompting effort to accomplish.

This seems true, right now!

But in building out stuff with LLMs, I don't expect (or want) them to do the job end-to-end. I have ~25 merged PRs in a project right now (out of ~40 PRs generated). For most of the merged PRs, I pulled them into Zed and cleaned something up. At around PR #10 I went in and significantly restructured the code.

The overall process has been much faster and more pleasant than writing from scratch, and, notably, did not involve me honing my LLM communications skills. The restructuring work I did was exactly the same kind of thing I do on all my projects; until you've got something working it's hard to see what the exact right shape is. I expect I'll do that 2-3 more times before the project is done.

I feel like Kenton Varda was trying to make a point in the way they drove their LLM agent; the point of that project was in part to record the 2025 experience of doing something complicated end-to-end with an agent. That took some doing. But you don't have to do that to get a lot of acceleration from LLMs.

jumploops•1h ago
Has the author tried Claude Code?

It’s the first useful “agent” (LLM in a loop + tools) that I’ve tried.

IME it is hard to explain why it’s better than e.g. Aider or Cursor, but once you try it you’ll migrate your workflow pretty quickly.

padolsey•49m ago
How much transparency does Claude Code give you into what it's doing? I like IDE-integrated agents because they show diffs and allow focused prompting for specific areas of concern, and I get to control what's in context at any given time in a longer thread. I haven't tried Claude's thing in a while, but from what I gather it's more of a "prompt and pray" kind of agent?
_neil•8m ago
My experience is that you can be very targeted in your prompting with Claude Code and it mostly gets good results. You can also ask it early on to create a branch and make logical commits as it works. This way, you can examine smaller code changes later in a PR (or git log).

Or if you want to work more manually, you could do the same but not allow full access to git commit. That way it will request access each time it’s ready to commit and you can take that time to review diffs.

apwell23•3m ago
> IME it is hard to explain why it’s better than e.g. Aider or Cursor

I have Cursor through work, but I'm tempted to shell out $100 because of this hype.

Is it better than using Claude models in Cursor?

troupo•1m ago
It can get surprisingly dumb surprisingly fast.

Today I spent easily half an hour trying to make it solve a layout issue it itself introduced when porting a component.

It was a complex port that it executed perfectly. And then it completely failed to even create a simple wrapper that fixed a flexbox issue.

BTW, Claude (Code and Cursor) is over-indexed on "let's randomly add and remove h-full/overflow-auto and pretend it works" ad infinitum.

Nginx Restic Back End

https://www.grepular.com/Nginx_Restic_Backend
20•mike-cardwell•3h ago•1 comment

Show HN: Let’s Bend – Open-Source Harmonica Bending Trainer

https://letsbend.de
29•egdels•3h ago•5 comments

How Compiler Explorer Works in 2025

https://xania.org/202506/how-compiler-explorer-works
18•vitaut•4d ago•0 comments

Gaussian Integration Is Cool

https://rohangautam.github.io/blog/chebyshev_gauss/
103•beansbeansbeans•10h ago•26 comments

Binfmtc – binfmt_misc C scripting interface

https://www.netfort.gr.jp/~dancer/software/binfmtc.html.en
60•todsacerdoti•6h ago•14 comments

Administering immunotherapy in the morning seems to matter. Why?

https://www.owlposting.com/p/the-time-of-day-that-immunotherapy
40•abhishaike•2h ago•45 comments

The last six months in LLMs, illustrated by pelicans on bicycles

https://simonwillison.net/2025/Jun/6/six-months-in-llms/
547•swyx•11h ago•159 comments

Joining Apple Computer (2018)

https://www.folklore.org/Joining_Apple_Computer.html
368•tosh•22h ago•98 comments

Cheap yet ultrapure titanium might enable widespread use in industry (2024)

https://phys.org/news/2024-06-cheap-ultrapure-titanium-metal-enable.amp
10•westurner•3d ago•1 comment

<Blink> and <Marquee> (2020)

https://danq.me/2020/11/11/blink-and-marquee/
170•ghssds•14h ago•143 comments

Efficient mRNA delivery to resting T cells to reverse HIV latency

https://www.nature.com/articles/s41467-025-60001-2
23•matthewmacleod•3d ago•5 comments

Bill Atkinson has died

https://daringfireball.net/linked/2025/06/07/bill-atkinson-rip
1518•romanhn•1d ago•257 comments

Self-Host and Tech Independence: The Joy of Building Your Own

https://www.ssp.sh/blog/self-host-self-independence/
366•articsputnik•1d ago•179 comments

Ask HN: How to learn CUDA to professional level

145•upmind•8h ago•55 comments

Convert photos to Atkinson dithering

https://gazs.github.io/canvas-atkinson-dither/
402•nvahalik•22h ago•46 comments

My experiment living in a tent in Hong Kong's jungle

https://corentin.trebaol.com/Blog/8.+The+Homelessness+Experiment
425•5mv2•1d ago•189 comments

Focus and Context and LLMs

https://taras.glek.net/posts/focus-and-context-and-llms/
42•tarasglek•10h ago•16 comments

Coventry Very Light Rail

https://www.coventry.gov.uk/coventry-light-rail
160•Kaibeezy•21h ago•216 comments

Field Notes from Shipping Real Code with Claude

https://diwank.space/field-notes-from-shipping-real-code-with-claude
205•diwank•1d ago•70 comments

Knowledge Management in the Age of AI

https://ericgardner.info/notes/knowledge-management-june-2025
98•katabasis•15h ago•60 comments

Launching the BeOS on Hitachi Flora Prius Systems (1999)

http://testou.free.fr/www.beatjapan.org/mirror/www.be.com/support/guides/hitachi_boot.html
4•doener•4h ago•1 comment

BorgBackup 2 has no server-side append-only anymore

https://github.com/borgbackup/borg/pull/8798
168•jaegerma•1d ago•100 comments

The printer that transcends dimensions and corrupts reality

https://ghuntley.com/ideas/
25•ghuntley•2h ago•7 comments

What was Radiant AI, anyway?

https://blog.paavo.me/radiant-ai/
203•paavohtl•1d ago•110 comments

Why We're Moving on from Nix

https://blog.railway.com/p/introducing-railpack
251•mooreds•1d ago•117 comments

Low-Level Optimization with Zig

https://alloc.dev/2025/06/07/zig_optimization
282•Retro_Dev•1d ago•175 comments

A tool for burning visible pictures on a compact disc surface (2022)

https://github.com/arduinocelentano/cdimage
182•carlesfe•1d ago•54 comments

Fray: A Controlled Concurrency Testing Framework for the JVM

https://github.com/cmu-pasta/fray
62•0x54MUR41•12h ago•3 comments

Discovering a JDK Race Condition, and Debugging It in 30 Minutes with Fray

https://aoli.al/blogs/jdk-bug/
130•aoli-al•1d ago•33 comments

The time bomb in the tax code that's fueling mass tech layoffs

https://qz.com/tech-layoffs-tax-code-trump-section-174-microsoft-meta-1851783502
1449•booleanbetrayal•4d ago•892 comments