Ask HN: Best Architecture Patterns for Lightweight SWE Workflows?

1•ethanjscott•1h ago

I am working on a lightweight agentic workflow to accelerate software development on my own machine. I am executing AI-generated code locally on my machine in a sandboxed environment, and feeding the errors back into the LLM automatically.

Has anyone done anything similar on their own machine? I am interested to hear other thoughts.

General overview of the current workflow (without a lot of the finer details):

Call Gemini API to generate Python code for specific function/problem ->

Run AI-generated code in Docker container ->

Take any runtime/compilation errors and feed back to Gemini ->

Run hardcoded tests for functionality and send results back to Gemini ->

Repeat step 1 until max iterations are met (or testing passes)

I had a few general questions:

1) What patterns, antipatterns, or architectures do people find best for these workflows?

2) Is using Docker considered the easiest and safest way of quarantining AI-generated code?

3) I have read a couple posts online about “compressing” the context of previous changes through quick summaries. We can also use vectors for traversing documentation to speed up context retrieval for the AI. Does anyone have any general advice here on what works and what doesn’t?

4) Should I have an external Agent that reviews the errors of the previous iterations to see if Gemini is falling into a loop? Sometimes when I use LLMs for coding, I notice that they fall into loops (they will just alternate between two buggy solutions). Should we just break the iterations in this case and rely on human intervention?

I am also trying to utilize AutoML packages (like FLAML) with this workflow. I am implementing this workflow to perform “automatic” data analysis on datasets to get better predictions. Obviously, I understand that this will not perform as well as a professional data scientist. However, has anyone done anything similar and seen some positive results?

Comments

capestart•1h ago

I’ve found Docker works well with resource limits, a small critic agent helps break error loops, and summarizing diffs beats raw vector search for context; adding auto-tests can also help escape trivial bug-fix loops.

NitpickLawyer•1h ago

I found devcontainers to be the best of both worlds - gives you an environment where the agent can do whatever it needs and you get your IDE as well.

John Searle Has Died

New AI system could accelerate clinical research

Starving, sleeping, and yielding: understanding Go's scheduler

Changes to the Google Maps follow feature

Have Traditional Media Destroyed the Perception of Home?

I Made an Online Game Between Mario 64 and Crash Bandicoot on Real Hardware [video]

90%

Tell Me a Story

Cursor, Copilot, and Windsurf Handle the Same Coding Task

US States spend billions on data center incentives

Show HN: Nihondex Learn Japanese Fast

Show HN: Before vibe coding-vibe defien and design

The JavaScript Handbook (2025 edition)

Does AI Get Bored?

Rails 8 upgrade story: duplicate keys sneaking into our JSON responses

We tried Go's experimental Green Tea garbage collector: didn't help performance

Show HN: Wan 2.5 vs. Veo3 Who Deserves the AI Video Throne?

Show HN: Bibfixer – AI-powered BibTeX cleaner

Linux 6.17 release – Main changes, Arm, RISC-V, and MIPS architectures

Get Rid of Unnecessary Photos

DeepSeek-v3.2-Exp

Ony Ive's latest lightbulb moment? A sailing lantern

Show HN: PaywallPro – Analyze and optimize app subscription paywalls

Avalanche Studios NYC Retrospective – An Ambitious Company Ruined

Discovery of Unstable Singularities

DeepSeek-v3.2-Exp

IceWhale Introduces ZimaOS 1.5: A Simplified, Focused and Open NAS OS

Optimizing a 6502 image decoder, from 70 minutes to 1 minute

Parental Controls

The graduate 'jobpocalypse': Where have all the entry-level jobs gone? [video]

Ask HN: Best Architecture Patterns for Lightweight SWE Workflows?

Comments

John Searle Has Died

New AI system could accelerate clinical research

Starving, sleeping, and yielding: understanding Go's scheduler

Changes to the Google Maps follow feature

Have Traditional Media Destroyed the Perception of Home?

I Made an Online Game Between Mario 64 and Crash Bandicoot on Real Hardware [video]

90%

Tell Me a Story

Cursor, Copilot, and Windsurf Handle the Same Coding Task

US States spend billions on data center incentives

Show HN: Nihondex Learn Japanese Fast

Show HN: Before vibe coding-vibe defien and design

The JavaScript Handbook (2025 edition)

Does AI Get Bored?

Rails 8 upgrade story: duplicate keys sneaking into our JSON responses

We tried Go's experimental Green Tea garbage collector: didn't help performance

Show HN: Wan 2.5 vs. Veo3 Who Deserves the AI Video Throne?

Show HN: Bibfixer – AI-powered BibTeX cleaner

Linux 6.17 release – Main changes, Arm, RISC-V, and MIPS architectures

Get Rid of Unnecessary Photos

DeepSeek-v3.2-Exp

Ony Ive's latest lightbulb moment? A sailing lantern

Show HN: PaywallPro – Analyze and optimize app subscription paywalls

Avalanche Studios NYC Retrospective – An Ambitious Company Ruined

Discovery of Unstable Singularities

DeepSeek-v3.2-Exp

IceWhale Introduces ZimaOS 1.5: A Simplified, Focused and Open NAS OS

Optimizing a 6502 image decoder, from 70 minutes to 1 minute

Parental Controls

The graduate 'jobpocalypse': Where have all the entry-level jobs gone? [video]