
Show HN: First autonomous ML and AI engineering Agent

https://marketplace.visualstudio.com/items?itemName=NeoResearchInc.heyneo
2•svij137•2h ago
Founder here. I built NEO, an AI agent designed specifically for AI and ML engineering workflows, after repeatedly hitting the same wall with existing tools: they work for short, linear tasks, but fall apart once workflows become long-running, stateful, and feedback-driven. In real ML work, you don’t just generate code and move on. You explore data, train models, evaluate results, adjust assumptions, rerun experiments, compare metrics, generate artifacts, and iterate, often over hours or days. Most modern coding agents already go beyond single prompts. They can plan steps, write files, run commands, and react to errors.

Where things still break down is when ML workflows become long-running and feedback-heavy. Training jobs, evaluations, retries, metric comparisons, and partial failures are still treated as ephemeral side effects rather than durable state. Once a workflow spans hours, multiple experiments, or iterative evaluation, you either babysit the agent or restart large parts of the process. Feedback exists, but it is not something the system can reliably resume from.

NEO tries to model ML work the way it actually happens. It is an AI agent that executes end-to-end ML workflows, not just code generation. Work is broken into explicit execution steps with state, checkpoints, and intermediate results. Feedback from metrics, evaluations, or failures feeds directly into the next step instead of forcing a full restart. You can pause a run, inspect what happened, tweak assumptions, and resume from where it left off.

For example, you might ask NEO to explore a dataset, train a few baseline models, compare their performance, and generate plots and a short report. NEO will load the data, run EDA, train models, evaluate them, notice if something underperforms or fails, adjust, and continue. If training takes an hour and one model crashes at 45 minutes, you do not start over. NEO inspects the failure, fixes it, and resumes.
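
To make the checkpoint-and-resume idea concrete, here is a minimal Python sketch of a step runner that persists each step's result and skips completed steps on the next run. It is not NEO's implementation: the `Workflow` class, the step names, and the JSON checkpoint file are all hypothetical, chosen only to illustrate the pattern, and the "fix" after the crash is simulated by passing a different flag.

```python
import json
import tempfile
from pathlib import Path


class Workflow:
    """Runs named steps in order, saving each result to a JSON checkpoint file."""

    def __init__(self, checkpoint_path):
        self.path = Path(checkpoint_path)
        # Reload the results of previously completed steps, if any.
        self.state = json.loads(self.path.read_text()) if self.path.exists() else {}
        self.executed = []  # steps actually run in this process (not restored)

    def run(self, steps):
        for name, fn in steps:
            if name in self.state:
                continue  # finished in an earlier run; reuse the stored result
            self.state[name] = fn(self.state)
            self.executed.append(name)
            # Persist after every step so a crash loses only the step in flight.
            self.path.write_text(json.dumps(self.state))
        return self.state


def load_data(state):
    return {"rows": 100}


def make_train(fail):
    # Hypothetical training step; `fail=True` simulates a mid-run crash.
    def train(state):
        if fail:
            raise RuntimeError("simulated crash during training")
        return {"accuracy": 0.9}
    return train


def evaluate(state):
    # Later steps read earlier results from the shared state.
    return {"passed": state["train"]["accuracy"] > 0.8}


checkpoint = Path(tempfile.mkdtemp()) / "run.json"

# First attempt: training crashes, but the load_data result is already on disk.
first = Workflow(checkpoint)
try:
    first.run([("load_data", load_data),
               ("train", make_train(fail=True)),
               ("evaluate", evaluate)])
except RuntimeError:
    pass

# Second attempt: a new process would restore state and resume at "train".
second = Workflow(checkpoint)
result = second.run([("load_data", load_data),
                     ("train", make_train(fail=False)),
                     ("evaluate", evaluate)])
```

In this sketch the second run never calls `load_data` again; it picks up at the failed step. A real system would also need to checkpoint inside long steps (for example, per training epoch) so that a crash at minute 45 does not discard the first 44.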

Docs for the extension: https://docs.heyneo.so/#/vscode

Happy to answer questions about NEO.

Comments

mring33621•1h ago
I'd love to try this, but I worry about embedded malware or other nastiness in random downloads.

Show HN: I wrapped the Zorks with an LLM

https://infocom.tambo.co/
30•alecf•1h ago•13 comments

Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC

https://emsh.cat/one-human-one-agent-one-browser/
112•embedding-shape•9h ago•69 comments

Show HN: LemonSlice – Upgrade your voice agents to real-time video

48•lcolucci•4h ago•63 comments

Show HN: Script: JavaScript That Runs Like Rust

https://docs.script-lang.org/blog/introducing-script
3•jucasoliveira•1h ago•2 comments

Show HN: I Stopped Hoping My LLM Would Cooperate

2•seanlf•1h ago•0 comments

Show HN: Open-source Robotics – Curated projects with interactive 3D URDF viewer

https://robotics.growbotics.ai/
2•Tomas0413•2h ago•3 comments

Show HN: Distributed Training Observability for PyTorch (TraceML)

https://github.com/traceopt-ai/traceml
2•traceml-ai•2h ago•0 comments

Show HN: A 4.8MB native iOS voice notes app built with SwiftUI

https://apps.apple.com/us/app/convoxa-ai-meeting-minutes/id6755150446
2•karamalaskar•2h ago•0 comments

Show HN: Decrypting the Zodiac Z32 triangulates a 100ft triangular crop mark

https://zenodo.org/records/18335902
4•dstamp•3h ago•1 comments

Show HN: TetrisBench – Gemini Flash reaches 66% win rate on Tetris against Opus

https://tetrisbench.com/tetrisbench/
108•ykhli•1d ago•40 comments

Show HN: Cosmic AI Workflows – Chain AI agents to automate multi-step projects

https://www.cosmicjs.com/blog/introducing-ai-workflows
2•tonyspiro•3h ago•0 comments

Show HN: Only 1 LLM can fly a drone

https://github.com/kxzk/snapbench
175•beigebrucewayne•1d ago•91 comments

Show HN: An open-source starter for developing with Postgres and ClickHouse

https://github.com/ClickHouse/postgres-clickhouse-stack
2•saisrirampur•4h ago•0 comments

Show HN: Lightbox – Flight recorder for AI agents (record, replay, verify)

https://uselightbox.app/
3•Berticus12•5h ago•0 comments

Show HN: Pingram – Send Telegram alerts with 1 line of Python (20KB)

https://github.com/zvizr/pingram
2•Zvizr•5h ago•1 comments

Show HN: Honcho – Open-source memory infrastructure, powered by custom models

https://github.com/plastic-labs/honcho
8•vvoruganti•6h ago•0 comments

Show HN: First autonomous ML and AI engineering Agent

https://marketplace.visualstudio.com/items?itemName=NeoResearchInc.heyneo
2•svij137•2h ago•1 comments

Show HN: I built a CSV parser to try Go 1.26's new SIMD package

https://github.com/nnnkkk7/go-simdcsv
2•tokkyokky•8h ago•0 comments

Show HN: SF Microclimates

https://github.com/solo-founders/sf-microclimates
32•weisser•1d ago•31 comments

Show HN: An interactive map of US lighthouses and navigational aids

https://www.lighthouses.app/
98•idd2•2d ago•21 comments

Show HN: 13-Virtues – A tracker for Benjamin Franklin's 13-week character system

https://www.13-virtues.com
3•HeleneBuilds•9h ago•1 comments

Show HN: TUI for managing XDG default applications

https://github.com/mitjafelicijan/xdgctl
134•mitjafelicijan•2d ago•45 comments

Show HN: Ourguide – OS wide task guidance system that shows you where to click

https://ourguide.ai
39•eshaangulati•1d ago•20 comments

Show HN: Netfence – Like Envoy for eBPF Filters

https://github.com/danthegoodman1/netfence
57•dangoodmanUT•2d ago•7 comments

Show HN: A small programming language where everything is pass-by-value

https://github.com/Jcparkyn/herd
88•jcparkyn•1d ago•57 comments

Show HN: Actionbase – A database for likes, views, follows at 1M+ req/min

https://github.com/kakao/actionbase
4•em3s•11h ago•3 comments

Show HN: Managed Postgres with native ClickHouse integration

44•saisrirampur•5d ago•9 comments

Show HN: Get recommendations or convert agent skills directly in your workspace

https://www.agenstskills.com/
2•rohitghumare•13h ago•0 comments

Show HN: Bonsplit – Tabs and splits for native macOS apps

https://bonsplit.alasdairmonk.com
243•sgottit•2d ago•33 comments

Show HN: C From Scratch – Learn safety-critical C with prove-first methodology

https://github.com/SpeyTech/c-from-scratch
70•william1872•2d ago•12 comments