frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Lazarus, a coding agent for long-horizon tasks

https://github.com/ExpressGradient/lazarus
1•Sai_Praneeth•1h ago
I have been interested in long-horizon coding tasks for a while, especially with benchmarks like FrontierSWE, where even the best coding agents like Codex and Claude Code struggle to complete tasks.

These agents come with a collection of tools like bash, file edits, grep, glob, etc.

Lazarus takes a different approach. The idea is to give the model exactly one tool: a persistent Python runtime.

Model writes Python code, executes it, and receives stdout/stderr. Through Python it inspects repos, reads and edits files, runs builds, executes tests, invokes linters, even build custom harnesses and automate whatever workflows it needs.

The motivation for this was: - Tool selection itself is a planning problem.

- Specialized tools are often difficult to compose together efficiently.

- Long-horizon tasks frequently require custom workflows that predefined tools don't provide.

- Python is expressive enough for the model to build those workflows itself.

Another decision is avoid agent hierarchies. Lazarus runs a single tool-calling loop rather than managers, planners, and worker agents.

The intuition being current models are much better at writing code than coordinating fleets of agents. Agent orchestration consumes context, introduces extra modes of failure, and adds complexity.

How does Lazarus manage context? When the "usable" context window of a model is nearly exhausted, the model gets one final opportunity to execute a Python tool call, containing anything it wants to preserve: notes, plans, functions, summaries, partial results, etc.

The loop is then restarted with only:

- The original user task

- The carryover cell

- The carryover cell's output

This allows the agent to periodically compress its own state and continue working without requiring an ever-growing context window.

I evaluated Lazarus on two FrontierSWE tasks: - git-to-zig (rewriting git in zig) - dart-style-haskell (rewriting dart-style formatter in haskell)

The runs with scores are available here: https://github.com/ExpressGradient/frontier-swe-lazarus-runs

Using GPT-5.5 at medium reasoning effort, Lazarus achieved scores comparable to reported GPT-5.5 in Codex with xhigh reasoning.

The runs were not completed to exhaustion, I stopped them because I ran out of OpenAI credits. So I suspect there is still room for improvement from longer runtimes and higher reasoning.

The project is still early, but the results made me wonder whether coding agents have become over-specialized around tool collections and orchestration systems, while under-investing in giving models a programmable environment they can shape themselves.

Lazarus: https://github.com/ExpressGradient/lazarus

Noctalia Shell Plugin to Run the Pi Coding Assistant

https://github.com/rcarmo/pi-noctalia-shell-plugin
1•rcarmo•1m ago•0 comments

Brave is charging $60 to remove features it added in the first place

https://www.xda-developers.com/brave-is-charging-60-to-remove-features-it-added-in-the-first-place/
2•CharlesW•2m ago•0 comments

Making Claude a Chemist

https://www.anthropic.com/research/making-claude-a-chemist
1•thatxliner•3m ago•0 comments

Open Source Aviation Maps

https://tech.marksblogg.com/aviation-maps.html
1•marklit•7m ago•0 comments

Elon Musk Is Dropping a Boulder in a Kiddie Pool

https://www.theatlantic.com/technology/2026/06/spacex-ipo-anthropic-openai/687443/
1•samizdis•7m ago•0 comments

Internet Age-Gates Are a Growing Global Threat

https://www.eff.org/deeplinks/2026/06/internet-age-gates-are-growing-global-threat
1•berlianta•8m ago•0 comments

Nasdaq Sinks 4% over AI and Rate-Hike Fears

https://www.wsj.com/livecoverage/may-jobs-report-stock-market-06-05-2026
1•JumpCrisscross•9m ago•0 comments

Miami-Dade Mayor to Use Eminent Domain to Seize Fisher Island Property

https://www.wsj.com/real-estate/miami-dade-mayor-to-use-eminent-domain-to-seize-fisher-island-pro...
1•throwawa1•9m ago•0 comments

Against shallow anti-rational humanism

https://statmodeling.stat.columbia.edu/2026/06/04/against-shallow-humanism/
1•caminanteblanco•10m ago•0 comments

Complex Systems Visualizer

https://complexsystems.replit.app/
2•gnodar•12m ago•0 comments

Meta's stock sinks on report company could raise billions for AI push

https://www.cnbc.com/2026/06/05/meta-stock-sinks-on-report-company-could-raise-tens-of-billions-f...
2•01-_-•14m ago•0 comments

Alpha School Costs $65,000 a Year–But Isn't Actually a School

https://www.wired.com/story/alpha-schools-new-york-city-campus-isnt-actually-a-school/
1•johnshades•16m ago•0 comments

Show HN: TapeSim – Practice Reading the Tape

https://tapesim.app/
1•abstractcontrol•16m ago•0 comments

NOAA Satellite Captures Rare Imagery of "Interstate-Induced" Clouds

https://www.nesdis.noaa.gov/news/noaa-satellite-captures-rare-imagery-of-interstate-induced-clouds
4•reaperducer•17m ago•0 comments

Ask HN: Has the vegan movement been effective?

2•xg15•19m ago•1 comments

Unpatched Ollama Vulnerabilities: Phishing Overlays and Data Exfiltration

https://www.promptarmor.com/resources/unpatched-ollama-vulnerabilities-phishing-overlays-and-data...
1•gathorway•20m ago•0 comments

How the Largest IPO in history became your problem

https://medium.com/@firstfromreverse/how-the-largest-ipo-in-history-quietly-became-your-problem-c...
3•WishingWisp•24m ago•0 comments

Bringing a dead Spring Boot project back to life with Claude

https://tomaytotomato.com/spring-data-solr-lazarus/
1•tomaytotomato•29m ago•0 comments

Google Buying Computing from SpaceX in $920M-a-Month Deal

https://www.bloomberg.com/news/articles/2026-06-05/google-buying-computing-from-spacex-in-920-mil...
7•berlianta•31m ago•0 comments

GitHub Accidentally Deletes Slack and Teams Subscriptions

https://www.githubstatus.com/incidents/2nmfnbknhlnv
46•SparkyDogs•32m ago•15 comments

An imperative command-line-interface for AI workload orchestration

https://pypi.org/project/terradev-cli/
1•Facingsouth•32m ago•1 comments

I need financial help, pls share

5•alonsovm44•33m ago•0 comments

Autonomous Agentic Design for Photonics

https://arxiv.org/abs/2606.00915
1•twhughes•34m ago•1 comments

Tokugawa Japan kept 260 warlords from war for 250 years

https://jivx.com/edo
3•momentmaker•34m ago•0 comments

Show HN: Bash Runtime for AWS Lambda

https://github.com/interchecks/bash-lambda-runtime
2•tw1gz•34m ago•0 comments

ICEs Plan to Let Cops Around the Country Scan Faces to Verify Immigration Status

https://www.404media.co/ices-plan-to-let-cops-around-the-country-scan-faces-to-verify-immigration...
4•emschwartz•35m ago•0 comments

iPhone Deployment of End-to-End Perception via Auto-Labeled Synthetic Data

https://arxiv.org/abs/2604.25949
1•PaulHoule•36m ago•0 comments

VibeOS

https://en.wikipedia.org/wiki/VibeOS
2•maayank•37m ago•0 comments

Show HN: TuringLLM – a LLM-powered Universal Turing machine

https://github.com/gmlion/TuringLLM
1•gmlion•37m ago•0 comments

Apple Silicon's on-device AI bet hasn't moved – only the chip range that runs it

https://tbreak.com/apple-silicon-on-device-ai-doug-brooks-wwdc/
3•Austin_Conlon•38m ago•0 comments