
Two things LLM coding agents are still bad at

https://kix.dev/two-things-llm-coding-agents-are-still-bad-at/
60•kixpanganiban•2h ago

Comments

davydm•2h ago
Coding and...?
drdeca•1h ago
More granular: what things is it bad at that result in it being overall "bad at coding"? It isn't bad at every part.
Black616Angel•1h ago
Copy and pasting.

Oh, sorry. You already said that. :D

baq•1h ago
They're getting better at asking questions; I routinely see search calls against the codebase index. They just don't ask me questions.
IanCal•1h ago
Editing tools are easy to add; the hard part is picking which ones to give them, because with too many they struggle, and the tools use up a lot of context. Still, as costs come down, taking multiple steps to look for tools becomes cheaper too.

I'd like to see what happens with better refactoring tools; I'd make a bunch more mistakes if I were copying, retyping, or using awk myself. If they want to rename something, they should be able to use the same tooling the rest of us get.

Asking questions is a good point, but that's partly a matter of prompting, and I think the move to having more parallel work makes it less relevant. One of the reasons clarifying things more upfront is useful is that we take a lot of time and cost a lot of money to build things, so the economics favour getting it right the first time. As the time comes down and the cost drops to near zero, the balance changes.

There are also other approaches: clarify what you want and how to do it first, break that down into tasks, then let it run with those (spec kit). This is an interesting area.

ra•1h ago
IaC, and DSLs in general.
nikanj•1h ago
4/5 times when Claude is looking for a file, it starts by running bash(dir c:\test /b)

First it gets an error because bash doesn’t understand \

Then it gets an error because /b doesn’t work

And as LLMs don’t learn from their mistakes, it always spends at least half a dozen tries (e.g. bash(cmd.exe /c dir c:\test /b )) before it figures out how to list files

If it was an actual coworker, we’d send it off to HR

anonzzzies•53m ago
I have a list of those things in CLAUDE.md -> it seems to help (unless its context is full, but you really should never let it get close to full).
cheema33•24m ago
Most models struggle in a Windows environment. They are trained on a lot of Unixy commands and not as much on Windows and PowerShell commands. It was frustrating enough that I started using WSL for development when using Windows. That helped me significantly.

I am guessing this is because:

1. Most of the training material online references Unix commands.
2. Most Windows devs are used to GUIs for development, using Visual Studio etc. GUIs are not as easy to train on.

Side note: an interesting thing I have noticed in my own org is that devs with a Windows background strictly use GUIs for git. The rest are comfortable using git from the command line.

rconti•1h ago
Doing hard things that aren't greenfield? Basically any difficult and slightly obscure question I get stuck with and hope the collective wisdom of the internet can solve?
athrowaway3z•29m ago
You don't learn new languages/paradigms/frameworks by inserting them into an existing project.

LLMs are especially tricky because they do appear to work magic on a small greenfield, and the majority of people are doing clown-engineering.

But I think some people are underestimating what can be done in larger projects if you do everything right (eg docs, tests, comments, tools) and take time to plan.

zjaffee•1h ago
I saw this happen just yesterday: I was doing a decent-sized refactor of some configuration files, and it was deleting the comments on the line above the refactored code in some files but not others. Honestly, this was easy to fix, but it also would've been easier if it hadn't happened in the first place.
koliber•1h ago
Most developers are also bad at asking questions. They tend to assume too many things from the start.

In my 25 years of software development I could apply the second critique to over half of the developers I knew. That includes myself for about half of that career.

rkomorn•1h ago
But, just like lots of people expect/want self-driving to outperform humans even on edge cases in order to trust them, they also want "AI" to outperform humans in order to trust it.

So: "humans are bad at this too" doesn't have much weight (for people with that mindset).

It makes sense to me, at least.

darkwater•54m ago
If we had a knife that most of the time cuts a slice of bread like the bottom p50 of humans cutting a slice of bread with their hands, we wouldn't call the knife useful.

OK, this example is probably too extreme; replace the knife with an industrial machine that cuts bread, versus a human with a knife. Nobody would buy that machine either if it worked like that.

rkomorn•47m ago
I feel kind of attacked for my sub-p50 bread slicing skills, TBH. :(
Certhas•23m ago
I think this is still too extreme. A machine that cuts and preps food at the same level as a 25th percentile person _being paid to do so_, while also being significantly cheaper would presumably be highly relevant.
rkomorn•13m ago
Aw man. There are so many angles though.

Your p25 employee is probably much closer to your p95 employee than to the p50 "standard" human, so yeah, I think you have a point there.

But at least in food prep, p25 would already be pretty damn hard to achieve. That's a hell of a lot of autonomy and accuracy (at least in my restaurant kitchen experience which is admittedly just one year in "fine dining"-ish kitchens).

I'd say the p25 of software or SRE folks I've worked with is also a pretty high bar to hit, too, but maybe I've been lucky.

koliber•19m ago
Agreed in a general sense, but there's a bit more nuance.

If a knife slices bread like a normal human at p50, it's not a very good knife.

If a knife slices bread like a professional chef at p50, it's probably a very decent knife.

I don't know if LLMs are better at asking questions than a p50 developer. In my original comment I wanted to raise the question of whether the fact that LLMs are not good at asking questions makes them still worse than human devs.

The first LLM critique in the original article is that they can't copy and paste. I can't argue with that. My 12 year old copies-and-pastes better than top coding agents.

The second critique says they can't ask questions. Since many developers also are not good at this, how does the current state of the art LLM compare to a p50 developer in this regard?

AllegedAlec•59m ago
On a more important level, I've found that they still do really badly at even a mildly complex task without extreme babysitting.

I wanted it to refactor a parser in a small project (2.5K lines total) because it'd gotten a bit too interconnected. It made a plan, which looked reasonable, so I told it to do this in stages, with checkpoints. It said it'd done so. I asked it "so is the old architecture also removed?" "No, it has not been removed." "Is the new structure used in place of the old one?" "No, it has not." After it did so, 80% of the test suite failed because nothing it'd written was actually right.

I did this three times with increasingly more babysitting, but it failed at the abstract task of "refactor this" no matter what, with pretty much the same failure mode each time. I feel like I have to tell it exactly which changes X and Y to make to class Z, remove class A, etc., at which point I can't let it do stuff unsupervised, which is half of the reason for letting an LLM do this in the first place.

hu3•56m ago
Interesting. What model and tool was used?

I have seen similar failure modes in Cursor and VSCode Copilot (using gpt5) where I have to babysit relatively small refactors.

AllegedAlec•53m ago
Claude Code. Whichever model it started up with automatically last weekend; I didn't explicitly check.
rglynn•49m ago
This feels like a classic Sonnet issue. From my experience, Opus or GPT-5-high are less likely to do the "narrow instruction following without making sensible wider decisions based on context" than Sonnet.
habibur•33m ago
Might be related to what the article was talking about. AI can't cut-paste; it deletes the code and then regenerates it at another location instead of cutting and pasting.

Obviously the regenerated code drifts a little from the deleted original.

jeswin•5m ago
> I wanted it to refactor a parser in a small project

This expression tree parser (typescript to sql query builder - https://tinqerjs.org/) has zero lines of hand-written code. It was made with Codex + Claude over two weeks (part-time on the side). Having worked on ORMs previously, it would have taken me 4x-10x the time to get to the same state (which also has 100s of tests, with some repetitions). That's a massive saving in time.

I did not have to babysit the LLMs at all. So the answer is: I think it depends on what you use it for, and how you use it. Like every tool, it takes a really long time to find a process that works for you. In my conversations with other developers who use LLMs extensively, they all have their unique, custom processes. All of them, however, do focus on test suites, documentation, and method review processes.

hu3•58m ago
I have seen LLMs in VSCode Copilot ask to execute 'mv oldfile.py newfile.py'.

So there's hope.

But often they just delete and recreate the file, indeed.

schiho•58m ago
I just ran into this issue with Claude Sonnet 4.5. I asked it to copy/paste some constants, a bigger chunk of code, from one file to another; it instead "extracted" pieces and named them accordingly. As a last resort, after going back and forth, it agreed to do a file copy by running a system command. I was surprised that, of all the programming tasks, copy/paste felt challenging for the agent.
tjansen•51m ago
I guess the LLMs are trained to know what finished code looks like. They don't really know the operations a human would use to get there.
tjansen•54m ago
Agreed with the points in the article, but IMHO the No. 1 issue is that agents only see a fraction of the code repository. They don't know whether there is a helper function they could use, so they re-implement it. When contributing to UIs, they can't check the whole UI to identify common design patterns, so they reinvent them.

The most important task for the human using the agent is to provide the right context. "Look at this file for helper functions", "do it like that implementation", "read this doc to understand how to do it"... you can get very far with agents when you provide them with the right context.

(BTW, another issue is that they have problems navigating the directory structure in a large monorepo. When the agent needs to run commands like 'npm test' in a subdirectory, it almost never gets it right the first time.)

Vipsy•53m ago
Coding agents tend to assume that the development environment is static and predictable, but real codebases are full of subtle, moving parts - tooling versions, custom scripts, CI quirks, and non-standard file layouts.

Many agents break down not because the code is too complex, but because invisible, "boring" infrastructure details trip them up. Human developers subconsciously navigate these pitfalls using tribal memory and accumulated hacks, but agents bluff through them until confronted by an edge case. This is why even trivial tasks intermittently fail with automation agents: you're fighting not logic errors, but mismatches with the real, lived context. Upgrading this context-awareness would be a genuine step change.

pimeys•38m ago
Yep. One of the things I've found agents always have a lot of trouble with is anything related to OpenTelemetry. There's a thing you call that uses some global somewhere, there's a Docker container or two, and there are the timing issues. It takes multiple tries to get anything right. Of course this is hard for a human too if you haven't used otel before...
throw-10-8•52m ago
3. Saying no

LLMs will gladly go along with bad ideas that any reasonable dev would shoot down.

nxpnsv•36m ago
Agree, this is really bad.
throw-10-8•34m ago
It's a fundamental failing of trying to use a statistical approximation of human language to generate code.

You can't fix it.

pimeys•33m ago
I've found codex to be better here than Claude. It has stopped many times and said hey you might be wrong. Of course this changes with a larger context.

Claude is just chirping away "You're absolutely right", making me turn on caps lock when I talk to it, and it's not even noon yet.

throw-10-8•28m ago
I find the chirpy, affirmative tone of Claude to be rage-inducing.
pimeys•16m ago
This. The biggest reason I went with OpenAI this month...
giancarlostoro•50m ago
Point #2 cracks me up, because with JetBrains AI (no fault of JetBrains, mind you) I do see the model update the file, and sometimes I somehow wind up with a few build errors, or other times 90% of the file is now build errors. Hey, what? Did you not run some sort of what-if check?
the_mitsuhiko•49m ago
> LLMs don’t copy-paste (or cut and paste) code. For instance, when you ask them to refactor a big file into smaller ones, they’ll "remember" a block or slice of code, use a delete tool on the old file, and then a write tool to spit out the extracted code from memory. There are no real cut or paste tools. Every tweak is just them emitting write commands from memory. This feels weird because, as humans, we lean on copy-paste all the time.

There is not that much copy/paste happening as part of refactoring, so it leans on context recall instead. It's not entirely clear that providing an actual copy/paste command is particularly useful; at least in my testing it does not do much. More interesting are repetitive changes that clog up the context. Those you can improve on if you have `fastmod` or some similar tool available: you can instruct codex or claude to perform the edits with it.

> And it’s not just how they handle code movement -- their whole approach to problem-solving feels alien too.

It is, but if you go back and forth to work out a plan for how to solve the problem, then the approach greatly changes.

3abiton•46m ago
I think copy/paste can alleviate context explosion. Basically, the model can keep track of what the code block contains and access it at any time, without needing to "remember" it verbatim.
brianpan•26m ago
How is it not clear that it would be beneficial?

To use another example, with my IDE I can change a signature or rename something across multiple files basically instantly. But an LLM agent will take multiple minutes to do the same thing and doesn't get it right.

the_mitsuhiko•7m ago
> How is it not clear that it would be beneficial?

There is reinforcement learning on the Anthropic side for a text edit tool, which is built in a way that does not lend itself to copy/paste. If you use a model like the GPT series, there might not be reinforcement learning for text editing (I believe; I don't really know), but it operates on line-based replacements for the most part, and for it to understand what to manipulate it needs to know the content in the context. When you try to give it a copy/paste buffer, it does not fully comprehend what the change in the file looks like after the operation.

So it might be possible to do something with copy/paste, but I did not find it to be very obvious how you make that work with an agent, given that it needs to read the file into context anyways and its recall capabilities are surprisingly good.

> To use another example, with my IDE I can change a signature or rename something across multiple files basically instantly.

So yeah, that's the more interesting case, and there things like codemod/fastmod are very effective if you tell an agent to use them. They just don't reach for them on their own.
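
(codemod/fastmod are regex-based; for TypeScript specifically, a library like ts-morph can do semantic renames. Below is a minimal sketch of that kind of deterministic rename, my own example rather than anything from the thread, with made-up file and class names.)

    import { Project } from "ts-morph";

    // Load the project through its tsconfig so cross-file references resolve.
    const project = new Project({ tsConfigFilePath: "tsconfig.json" });

    // Rename the class everywhere it is referenced; nothing is regenerated from memory.
    const file = project.getSourceFileOrThrow("src/user.ts");
    file.getClassOrThrow("User").rename("Account");

    project.saveSync();

An agent that can be told to run a script like this gets IDE-grade renames without burning context on the edited text.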

sxp•45m ago
Another place where LLMs have a problem is when you ask them to do something that can't be done by duct-taping a bunch of Stack Overflow posts together. E.g., I've been vibe coding in TypeScript on Deno recently. For various reasons, I didn't want to use the standard Express + Node stack, which is what most LLMs seem to prefer for web apps. So I ran into issues with Replit and Gemini failing to handle the subtle differences between Node and Deno when it comes to serving HTTP requests.
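
(For readers who haven't hit this: the two runtimes expose HTTP serving quite differently, which is roughly the gap the generated code falls into. A minimal sketch of the two alternatives side by side, not the commenter's code; the handlers are placeholders.)

    // On Node, LLMs usually reach for Express:
    import express from "express";
    const app = express();
    app.get("/", (_req, res) => res.send("hello"));
    app.listen(8000);

    // On Deno, no Express is needed; the built-in server takes a handler
    // that returns a web-standard Response:
    Deno.serve({ port: 8000 }, (_req) => new Response("hello"));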

LLMs also have trouble figuring out that a task is impossible. I wanted boilerplate code that rendered a mesh in Three.js using GL_TRIANGLE_STRIP because I was writing a custom shader and needed to experiment with the math. But Three.js does support GL_TRIANGLE_STRIP rendering for architectural reasons. Grok, ChatGPT, and Gemini all hallucinated a GL_TRIANGLE_STRIP rendering API rather than telling me about this, and I had to Google the problem myself.

It feels like current Coding LLMs are good at replacing junior engineers when it comes to shallow but broad tasks like creating UIs, modifying examples available on the web, etc. But they fail at senior-level tasks like realizing that the requirements being asked of them aren't valid and doing something that no one has done in their corpus of training data.

athrowaway3z•42m ago
>But Three.js does support GL_TRIANGLE_STRIP rendering for architectural reasons.

Typo or trolling the next LLM to index HN comments?

ziotom78•43m ago
I fully resonate with point #2. A few days ago, I was stuck trying to implement some feature in a C++ library, so I used ChatGPT for brainstorming.

ChatGPT proposed a few ideas, all apparently reasonable, and then it advocated for one that was presented unambiguously as the "best". After a few iterations, I realized that its solution would have required a class hierarchy where the base class contained a templated virtual function, which is not allowed in C++. I pointed this out to ChatGPT and asked it to rethink the solution; it then immediately advocated for the other approach it had initially suggested.

freetonik•37m ago
I see a pattern in these discussions all the time: some people say how very, very good LLMs are, and others say how LLMs fail miserably. Almost always, the first group presents examples of simple CRUD apps and frontend "represent data using some JS framework" kinds of tasks, while the second group presents examples of non-trivial refactoring, stuff like parsers (in this thread), algorithms that can't be found on LeetCode, etc.

Tech Twitter keeps showing "one-shotting full-stack apps" or "games", and it's always something extremely banal. It's impressive that a computer can do it on its own, don't get me wrong, but it was already trivial for programmers, and now it is commoditized.

quietbritishjim•30m ago
Yesterday, I got Claude Code to make a script that tried out different point-clustering algorithms and visualised them. It made the odd mistake, which it then corrected with help, but broadly speaking it was amazing. It would've taken me at least a week to write by hand, maybe longer. It was writing the algorithms itself, definitely not just simple CRUD stuff.
freetonik•25m ago
I also got good results for "above CRUD" stuff occasionally. Sorry if I wasn't clear; I meant primarily to share an observation about the vastly different responses in discussions related to LLMs. I don't believe LLMs are completely useless for non-trivial stuff, nor do I believe that they won't get better. Even the two problems in the linked article: sure, those actions are inherently alien to the LLM's structure itself, but they can be solved with augmentation.
piva00•8m ago
In my experience it's been great to use LLMs for narrowly scoped tasks: things I know how I'd implement (or at least start implementing) but that would be tedious to do manually. Prompting with increasingly higher complexity works better than I expected for these narrow tasks.

Whenever I've attempted the whole "agentic coding" thing, giving it a complex task, breaking it down into sub-tasks, loading up context, reworking the plan file when something goes awry, trying again, etc., it hasn't a single fucking time done the thing it was supposed to do to completion. It requires a lot of manual reviewing, backtracking, and nudging; it becomes more exhausting than just doing most of the work myself and pushing the LLM to do the tedious parts.

It does sometimes work to use it for analysis, asking it to suggest changes with the reasoning but not implement them, since most times when I let it try to implement its broad suggestions it went haywire, requiring me to pull back and restart.

There's a fine line to walk, and I only see comments on the extremes online: it's either "I let 80 agents run and they build my whole company's code" or "they fail miserably on every task harder than CRUD". I tend not to believe either extreme, at least not for the kinds of projects I work on, which require more context than I could ever fit properly into these robots beforehand.

NitpickLawyer•20m ago
> almost always the first group presents examples of simple CRUD apps

How about a full programming language written by cc "in a loop" in ~3 months? With a compiler and stuff?

https://cursed-lang.org/

It might be a meme project, but it's still impressive as hell that we're here.

I learned about this from a YouTube content creator who took that repo, asked cc to "make it so that variables can be emojis", and cc did that $5 later. Pretty cool.

sidgtm•36m ago
As a UX designer, I see that they lack the ability to be opinionated about a design piece and instead go with the standard mental model. I got fed up with this and wrote some simple JavaScript to run a simple canvas on localhost to pass on more subjective feedback using a highlights-and-notes feature. I tried using Playwright first, but (a) it's token-heavy and (b) it's still for finding what's working or breaking rather than thinking deeply about the design.
rossant•35m ago
Recently, I asked Codex CLI to refactor some HTML files. It didn't literally copy and paste snippets here and there as I would have done myself; it rewrote them from memory, removing comments in the process. There was a section with 40 successive <a href...> links with complex URLs.

A few days later, just before deployment to production, I wanted to double check all 40 links. First one worked. Second one worked. Third one worked. Fourth one worked. So far so good. Then I tried the last four. Perfect.

Just to be sure, I proceeded with the fifth one. 404. Huh. Weird. The domain was correct though and the URL seemed reasonable.

I tried the other 31 links. ALL of them 404ed. I was totally confused. The domain was always correct. It seemed highly suspicious that all of those websites would have moved their internal URLs at the same time. I didn't even remember that this part of the code had gone through an LLM.

Fortunately, I could retrieve the old URLs from old git commits. I checked the URLs carefully. The LLM had HALLUCINATED most of the path part of the URLs, replacing things like domain.com/this-article-is-about-foobar-123456/ with domain.com/foobar-is-so-great-162543/...

These kinds of very subtle and silently introduced mistakes are quite dangerous. Be careful out there!

ivape•20m ago
You're just not using LLMs enough. You can never trust the LLM to generate a URL, and this was known over two years ago. It takes one token hallucination to fuck up a URL.

It's very good at giving a great fuzzy answer, not a precise one. You have to really use this thing all the time and pick up on stuff like that.

doikor•18m ago
I would generalise it: you can't trust LLMs to generate any kind of unique identifier. Sooner or later they will hallucinate a fake one.
grey-area•15m ago
Or just not bother. It sounds pretty useless if it flunks on basic tasks like this.

Perhaps you’ve been sold a lie?

ivape•9m ago
Well, you see, it hallucinates on long, precise strings, but if we account for that and focus on what it's powerful at, we can do something powerful. In this case, by the time it gets to outputting the URL, it has already determined the correct intent or next action (print out a URL). You use this intent to make a tool call that generates the URL.

You have to be able to see what this thing can actually do, as opposed to what it can’t.

worldsayshi•5m ago
This is of course bad, but: humans also make (different) mistakes all the time. We could account for the risk of mistakes being introduced and build more tools that validate things for us. In a way, LLMs encourage us to do this by adding other vectors of chaos into our work.

Like, why not have tools built into our environment that check that links are not broken? With the right architecture we could have validations for most common mistakes without the solution adding a bunch of overhead.
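
(A check like that is cheap to build. Here's a minimal sketch of a CI-time link checker as a Deno script; the input file name is hypothetical, and real servers sometimes reject HEAD requests, so treat it as a starting point rather than a finished tool.)

    // check-links.ts -- fail the build if any absolute link in index.html is broken.
    const html = await Deno.readTextFile("index.html"); // hypothetical input file
    const urls = [...html.matchAll(/href="(https?:\/\/[^"]+)"/g)].map((m) => m[1]);

    let failures = 0;
    for (const url of urls) {
      const res = await fetch(url, { method: "HEAD" }).catch(() => null);
      if (!res || res.status >= 400) {
        console.error(`BROKEN: ${url} (${res ? res.status : "network error"})`);
        failures++;
      }
    }
    Deno.exit(failures > 0 ? 1 : 0);

Run in CI with `deno run --allow-read --allow-net check-links.ts`, hallucinated URLs like the ones described above fail the build instead of reaching production.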

juped•34m ago
It's apparently lese-Copilot to suggest this these days, but you can find very good hypothesizing and problem solving if you talk conversationally to Claude or probably any of its friends that isn't the terminally personality-collapsed SlopGPT (with or without showing it code, or diagrams); it's actually what they're best at, and often they're even less likely than human interlocutors to just parrot some set phrase at you.

It's only when you take the tech out of the area it's good at and start trying to get it to "write code" or even worse "be an agent" that it starts cracking up and emitting garbage; this is only done because companies want to forcememe some kind of product besides "chatbot", whether or not it makes sense. It's a shame because it'll happily and effectively write the docs that don't exist but you wish did for more or less anything. (Writing code examples for docs is not a weak point at all.)

cat-whisperer•26m ago
The copy-paste thing is interesting because it hints at a deeper issue: LLMs don't have a concept of "identity" for code blocks—they just regenerate from learned patterns. I've noticed similar vibes when agents refactor—they'll confidently rewrite a chunk and introduce subtle bugs (formatting, whitespace, comments) that copy-paste would've preserved. The "no questions" problem feels more solvable with better prompting/tooling though, like explicitly rewarding clarification in RLHF.
stellalo•11m ago
I feel like it's the opposite: the copy-paste issue is solvable; you just need to equip the model with the right tools and make sure they are trained on tasks where that's unambiguously the right thing to do (for example, cases where copying code "by hand" would be extremely error-prone -> leads to lower reward on average).

On the other hand, teaching the model to be unsure and ask questions requires the training loop to break and bring human input in, which appears more difficult to scale.

nxpnsv•26m ago
Codex has got me a few times lately, doing what I asked but certainly not what I intended:

- "Get rid of these warnings ...": captures and silences the warnings instead of fixing them
- "Update this unit test to reflect the changes ...": changes the code so the outdated test passes
- "The argument passed is now wrong": catches the exception instead of fixing the argument

My advice is to prefer small changes and to read everything it does before accepting anything; often this means using the agent is actually slower than just coding...

pammf•22m ago
In Claude Code, it always shows the diff between current and proposed changes and I have to explicitly allow it to actually modify the code. Doesn’t that “fix” the copy-&-paste issue?
SafeDusk•17m ago
@kixpanganiban Do you think it would work if, for refactoring tasks, we took away OpenAI's `apply_patch` tool and just provided `cut` and `paste` for the first few steps?

I can run this experiment using the ToolKami[0] framework if there is enough interest or if someone can give some insights.

[0]: https://github.com/aperoc/toolkami
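
(A sketch of what such tools might look like, in case it helps scope the experiment. The names and signatures below are hypothetical, not ToolKami's or OpenAI's actual API; the point is that the moved text never round-trips through the model's context.)

    import { readFileSync, writeFileSync } from "node:fs";

    // Hypothetical agent-side clipboard keyed by buffer id.
    const clipboard = new Map<string, string>();

    // cut: remove an inclusive, 1-indexed line range from a file and stash it.
    function cut(path: string, startLine: number, endLine: number, bufferId: string): void {
      const lines = readFileSync(path, "utf8").split("\n");
      clipboard.set(bufferId, lines.slice(startLine - 1, endLine).join("\n"));
      lines.splice(startLine - 1, endLine - startLine + 1);
      writeFileSync(path, lines.join("\n"));
    }

    // paste: insert a stashed buffer after the given line of another (or the same) file.
    function paste(path: string, afterLine: number, bufferId: string): void {
      const buffer = clipboard.get(bufferId);
      if (buffer === undefined) throw new Error(`no such buffer: ${bufferId}`);
      const lines = readFileSync(path, "utf8").split("\n");
      lines.splice(afterLine, 0, buffer);
      writeFileSync(path, lines.join("\n"));
    }

Exposed as two tools, the model only ever sees line numbers and buffer ids, so the extracted code can't drift the way the article describes.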

bad_username•17m ago
LLMs are great at asking questions if you ask them to ask questions. Try it: "before writing the code, ask me about anything that is nuclear or ambiguous about the task".
d1sxeyes•13m ago
“If you think I’m asking you to split atoms, you’re probably wrong”.
senko•14m ago
I'd argue LLM coding agents are still bad at many more things. But to comment on the two problems raised in the post:

> LLMs don’t copy-paste (or cut and paste) code.

The article is confusing the architectural layers of AI coding agents. It's easy to add "cut/copy/paste" tools to the AI system if that shows improvement. This has nothing to do with the LLM; it's in the layer on top.

> Good human developers always pause to ask before making big changes or when they’re unsure. [LLMs] keep trying to make it work until they hit a wall -- and then they just keep banging their head against it.

Agreed - LLMs don't know how to backtrack. The recent (past year) improvements in thinking/reasoning do help in this regard (it's the whole "but wait..." RL training that exploded with OpenAI o1/o3 and DeepSeek R1, now done by everyone), but clearly there's still work to do.

clayliu•13m ago
“They’re still more like weird, overconfident interns.” Perfect summary. LLMs can emit code fast but they don’t really handle code like developers do — there’s no sense of spatial manipulation, no memory of where things live, no questions asked before moving stuff around. Until they can “copy-paste” both code and context with intent, they’ll stay great at producing snippets and terrible at collaborating.
