
Open SWE: An open-source asynchronous coding agent

https://blog.langchain.com/introducing-open-swe-an-open-source-asynchronous-coding-agent/
111•palashshah•6mo ago
https://www.youtube.com/watch?v=TaYVvXbOs8c

https://github.com/langchain-ai/open-swe

Comments

dabockster•6mo ago
> We believe that all agents will look more like this in the future - long running, asynchronous, more autonomous. Specifically, we think that they will:

> Run asynchronously in the cloud

> cloud

Reality check:

https://huggingface.co/Menlo/Jan-nano-128k-gguf

That model will run, with decent conversation quality, at roughly the same memory footprint as a few Chrome tabs. It's only a matter of time until we get coding models that can do that, and then only a further matter of time until we see agentic capabilities at that memory footprint. I mean, I can already get agentic coding with one of the new Qwen3 models - super slowly, but it works in the first place. And the quality matches or even beats some of the cloud models and vibe coding apps.

And that model is just one example. Researchers all over the world are making new models almost daily that can run on an off-the-shelf gaming computer. If you have a modern Nvidia graphics card, you can run AI on your own computer totally offline. That's the reality.

koakuma-chan•6mo ago
Do you know what "MCP-based methodology" is? I am skeptical of a 4B model scoring twice as high as Gemini 2.5 Pro
dabockster•6mo ago
Yeah I know about Model Context Protocol. But it's still only a small part of the AI puzzle. I'm saying that we're at a point now where a whole AI stack can run, in some form, 100% on-device with okayish accuracy. When you think about that, and where we're headed, it makes the whole idea of cloud AI look like a dinosaur.
koakuma-chan•6mo ago
I mean, I am asking what "MCP-based methodology" is, because it doesn't make sense for a 4B model to outperform Gemini 2.5 Pro et al. by that much.
toshinoriyagi•6mo ago
I'm not too sure what "MCP-based methodology" is, but Jan-nano-128k is a small model specifically designed to be able to answer in-depth questions accurately via tool-use (researching in a provided document or searching the web).

It outperforms those other models, which are not using tools, thanks to the tool use and specificity.

Because it has only 4B parameters, I believe it is naturally terrible at other things: it's not designed for them and doesn't have enough parameters.

In hindsight, "MCP-based methodology" likely refers to its tool-use.

cbcoutinho•6mo ago
From the paper:

> Most language models face a fundamental tradeoff where powerful capabilities require substantial computational resources. We shatter this constraint with Jan-nano, a 4B parameter language model that redefines efficiency through radical specialization: instead of trying to know everything, it masters the art of finding anything instantly. Fine-tuned from Qwen3-4B using our novel multi-stage Reinforcement Learning with Verifiable Rewards (RLVR) system that completely eliminates reliance on next token prediction training (SFT), Jan-nano achieves 83.2% on SimpleQA benchmark with MCP integration while running on consumer hardware. With 128K context length, Jan-nano proves that intelligence isn't about scale, it's about strategy.

> For our MCP evaluation, we used mcp-server-serper which provides google search and scrape tools

https://arxiv.org/abs/2506.22760

Martinussen•6mo ago
Data storage has gotten cheaper and more efficient/manageable every year for decades, yet people seem content with having less storage than a mid-range desktop from a decade and a half ago, split between their phone and laptop, and leaving everything else to the "> cloud" - I wouldn't be so sure we're going to see people reach for technological independence this time either.
merelysounds•6mo ago
One factor here is people preferring portable devices. Note that portable SSDs are also popular.

Also, usage patterns can be different; with storage, if I use 90% of my local content only occasionally, I can archive that to the cloud and continue using the remaining local 10%.

prophesi•6mo ago
I'm also excited for local LLMs to be capable of assisting with nontrivial coding tasks, but we're far from reaching that point. VRAM remains a huge bottleneck for even a top-of-the-line gaming PC to run them. The best these days for agentic coding that get close to the vibe-check of frontier models seem to be Qwen3-Coder-480B-A35B-Instruct, DeepSeek-Coder-V2-236B, GLM 4.5, and GPT-OSS-120B. The last is the only one capable of fitting on a 64 to 96GB VRAM machine with quantization.

Of course, the line will always be pushed back as frontier models incrementally improve, but the quality is night and day between these open models consumers can feasibly run versus even the cheaper frontier models.

That said, I too have no interest in this if local models aren't supported and hope that's down the pipeline just so I can try tinkering with it. Though it looks like it utilizes multiple models for various tasks (planner, programmer, reviewer, router, and summarizer) so that only adds to the difficulty of the VRAM bottleneck if you'd like to load different models per task. So I think it makes sense for them to focus on just Claude for now to prove the concept.

edit: I personally use Qwen3 Coder 30B 4bit for both autocomplete and talking to an agent, and switch to a frontier model for the agent when Qwen3 starts running in circles.

diggan•6mo ago
> and GPT-OSS-120B. The latter being the only one capable of fitting on a 64 to 96GB VRAM machine with quantization.

Tiny correction: Even without quantization, you can run GPT-OSS-120B (with full context) on around 60GB VRAM :)

prophesi•6mo ago
Hm I don't think so. You might be thinking about the file size, which is ~64GB.

> Native MXFP4 quantization: The models are trained with native MXFP4 precision for the MoE layer, making gpt-oss-120b run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and the gpt-oss-20b model run within 16GB of memory.

If you _could_ fit it within ~60GB VRAM, the variability of the amount of VRAM required for certain context lengths and prompt sizes would OOM pretty quickly.

edit: Ah and MXFP4 in itself is a quantization, just supposedly closer to the original FP16 than the rest with a smaller VRAM requirement.

diggan•6mo ago
> Hm I don't think so. You might be thinking about the file size, which is ~64GB.

No, the number I gave above is literally the VRAM usage I see when I load 120B with llama.cpp. It's a real-life number, not a theoretical one :)

cowpig•6mo ago
I was excited by the announcement but then

> Runs in an isolated sandbox: Every task runs in a secure, isolated Daytona sandbox.

Oh, so fake open source? Daytona is an AGPL-licensed codebase that doesn't actually open-source the control plane, and the first instruction in the README is to sign up for their service.

From the "open-swe" README:

> Open SWE can be used in multiple ways:

> * From the UI. You can create, manage and execute Open SWE tasks from the web application. See the 'From the UI' page in the docs for more information.

> * From GitHub. You can start Open SWE tasks directly from GitHub issues simply by adding a label open-swe, or open-swe-auto (adding -auto will cause Open SWE to automatically accept the plan, requiring no intervention from you). For enhanced performance on complex tasks, use open-swe-max or open-swe-max-auto labels which utilize Claude Opus 4.1 for both planning and programming. See the 'From GitHub' page in the docs for more information.

* * *

The "From the UI" page links to their hosted web interface. If I cannot run it myself, it's fake open source.

mitchitized•6mo ago
Hol up

How can it be AGPL and not provide full source? AGPL is like the most aggressive of the GPL license variants. If they somehow circumvented the intent behind this license that is a problem.

Multicomp•6mo ago
Spitballing here, but if it's their code that they hold the copyright on, they can license it to us as AGPL without binding themselves to those same terms. They have all rights as copyright holders regardless of a given license.
victorbjorklund•6mo ago
AGPL is a license to others and not the copyright owner. If you own the copyright you don't need the license at all.
esafak•6mo ago
It's a hosted service with an open source client?
tevon•6mo ago
Very cool! Am using it now and really like the sidebar chat that allows you to add context during a run.

I hit an error that was not recoverable. I'd love to see functionality to bring all that context over to a new thread, or otherwise force it to attempt to recover.

jbl0ndie•6mo ago
> Double texting: Most coding agents don’t support accepting new requests or feedback while they’re running.

This caught my eye too. Given they say 'most', what other tools support this?

lta•6mo ago
Nice, but I want exactly the opposite. I want my agents to run locally without any sort of black box and I certainly don't want to be stuck with whatever UI you've designed to interact with the git provider you've selected.

It's not super surprising coming from this pile of over-engineering so thick I'm surprised it wasn't developed by Microsoft in the 90s or 00s.

kristianp•6mo ago
Yes, where's the open source agent that runs on the command line?
ryuuseijin•6mo ago
It's called opencode: https://opencode.ai/
numpad0•6mo ago
TIL the opencode [1] vs. opencode [2] name conflict was resolved by opencode [1] keeping the opencode name and opencode [2] renaming to Crush [3]

1: https://github.com/sst/opencode

2: https://github.com/opencode-ai/opencode

3: https://github.com/charmbracelet/crush

OJFord•6mo ago
Aaah.. ok. And Charm Crush with the weird branding is the one that took/forked it creating the drama and maybe isn't trustworthy.
johntash•6mo ago
Aider and Goose are also open source. Goose is backed by a big company, but Aider isn't and was one of the first (that I know of at least).

https://aider.chat/

https://block.github.io/goose/

IceDane•6mo ago
Unfortunately, after using langchain and the rest of their ecosystem extensively, I have very little faith in their abilities. The fact that the top contributor to langgraph is an agent they built is a huge red flag from my perspective.
