Show HN: I built an AI agent that turns ROS 2's turtlesim into a digital artist

https://github.com/Yutarop/turtlesim_agent
30•ponta17•1d ago
I'm a grad student studying robotics, with a particular interest in the intersection of LLMs and mobile robots. Recently, I discovered how easily LangChain enables the creation of AI agents, and I wanted to explore how such agents could interact with simulated environments.

So, I built TurtleSim Agent, an AI agent that turns the classic ROS 2 turtlesim turtle into a creative artist.

With this agent, you can give plain English commands like “draw a triangle” or “make a red star,” and it will reason through the instructions and control the simulated turtle accordingly. I’ve included demo videos on GitHub. Behind the scenes, it uses an LLM to interpret the text, decide what actions are needed, and then call a set of modular tools (motion, pen control, math, etc.) to complete the task.

If you're interested in LLM+robotics, ROS, or just want to see a turtle become a digital artist, I'd love for you to check it out:

GitHub: https://github.com/Yutarop/turtlesim_agent

Looking ahead, I’m also exploring frameworks like LangGraph and MCP (Model Context Protocol) to see whether they might be better suited for more complex planning and decision-making tasks in robotics. If anyone here is familiar with these frameworks or working in this space, I’d love to connect or hear your thoughts.

Comments

dpflan•1d ago
Forgive me for asking, but I'm always curious about the definition of “agent”. What is an “agent” exactly? Is it a static prompt that is sent along with user input to an LLM service and then handles that response? And then it’s done? Is an agent a prompted LLM call? Or some entity that is changing its own prompt as it continues to exist?
karmakaze•1d ago
It depends on how you look at it. If the output 'it' is a drawing, then the agent is the thing doing the drawing on the user's behalf. In more detail, the output is a set of commands, so the agent would be what's generating those commands from the user's input. E.g. a web browser is a user agent that makes requests and renders resources that the user specifies.
ponta17•1d ago
Thanks for the thoughtful question! The term “agent” definitely gets used in a lot of different ways, so I’ll clarify what I mean here.

In this project, an agent is an LLM-powered system that takes a high-level user instruction, reasons about what steps are needed to fulfill it, and then executes those steps using a set of tools. So it’s more than a single prompted LLM call — the agent maintains a kind of working state and can call external functions iteratively as it plans and acts.

Concretely, in turtlesim_agent, the agent receives an input like “draw a red triangle,” and then:

1. Uses the LLM to interpret the intent
2. Decides which tools to use (like move forward, turn, set pen color)
3. Calls those tools step-by-step until the task is done
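
As a rough illustration of that loop, here is a minimal sketch in plain Python. It is not the code from turtlesim_agent: the tool names, the `fake_llm` planner, and the `run_agent` function are all hypothetical stand-ins, and the "LLM" is a hard-coded plan for one instruction so the example runs without any model or ROS installation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools. Each one records what it would do and returns an
# observation string, instead of talking to a real turtle.
log = []

def set_pen_color(color: str) -> str:
    log.append(f"pen:{color}")
    return f"pen color set to {color}"

def move_forward(distance: float) -> str:
    log.append(f"forward:{distance}")
    return f"moved {distance}"

def turn_left(degrees: float) -> str:
    log.append(f"turn:{degrees}")
    return f"turned {degrees}"

# Registry the agent picks from, keyed by tool name.
TOOLS: Dict[str, Callable[..., str]] = {
    "set_pen_color": set_pen_color,
    "move_forward": move_forward,
    "turn_left": turn_left,
}

def fake_llm(instruction: str, history: List[tuple]) -> Optional[Tuple[str, tuple]]:
    """Stand-in for the LLM (assumption): returns the next (tool, args),
    or None when it considers the task done. The plan is hard-coded for
    "draw a red triangle" and ignores the instruction text."""
    plan = [("set_pen_color", ("red",))]
    for _ in range(3):
        plan += [("move_forward", (2.0,)), ("turn_left", (120.0,))]
    step = len(history)
    return plan[step] if step < len(plan) else None

def run_agent(instruction: str, max_steps: int = 20) -> List[tuple]:
    """Ask the planner for an action, call the tool, store the result, repeat."""
    history = []
    for _ in range(max_steps):
        action = fake_llm(instruction, history)
        if action is None:
            break
        name, args = action
        observation = TOOLS[name](*args)
        history.append((name, args, observation))
    return history
```

The `history` list plays the role of the working state: each new decision can see which tools were already called and what they returned. The `max_steps` cap is a design choice in this sketch to keep a confused planner from looping forever.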

Hope that clears it up a bit!

paxys•1d ago
To put it more simply, "agent" is now just a generic term to describe any middleware that sits between user input and a base LLM.
latchkey•1d ago
This really brings back memories. The first computer language I learned as a child was Logo. My grandfather gifted me a lesson from a local computer store where someone came out to his house and sat with me in front of his Apple II.

I was too young to understand the concepts around the math of steps or degrees. While the thought of programming on a computer was amazing (and I later became an engineer), I couldn't grasp Logo, got frustrated, and lost interest.

If I could have had something like this, I'm sure it would have made more sense to me earlier on. It makes me think about how this will affect the learning rate in a positive way.

pj_mukh•1d ago
Haha this is so incredibly cool.

One thing I might’ve missed: what is the “physics” of this universe? In the rainbow example the turtle seems to teleport between arcs?

ponta17•1d ago
Thanks! Great question.

TurtleSim itself doesn't simulate real-world physics — it allows instant position updates when needed. In this project, the goal was to create a digital turtle artist, not to replicate physical realism. So when the agent wants to draw something, it puts the pen down and moves physically (i.e., using velocity commands). But when it doesn't need to draw and just wants to move quickly to another position, it uses a teleport function I provided as a tool.

That's why in the rainbow example, you might see the turtle "jump" between arcs — it's skipping the movement to get to the next drawing point faster.
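
The difference between the two kinds of movement can be sketched with a toy model. This is a pure-Python stand-in, not the turtlesim API or the project's tools: the `ToyTurtle` class and its method names are hypothetical, and it assumes the pen is effectively up during a teleport so that no line is left behind.

```python
import math

class ToyTurtle:
    """Pure-Python stand-in for the turtle (assumption, not turtlesim)."""

    def __init__(self, x=0.0, y=0.0, heading_deg=0.0):
        self.x, self.y, self.heading_deg = x, y, heading_deg
        self.segments = []  # drawn line segments: ((x0, y0), (x1, y1))

    def forward(self, distance):
        """Move along the heading with the pen down, leaving a segment."""
        rad = math.radians(self.heading_deg)
        nx = self.x + distance * math.cos(rad)
        ny = self.y + distance * math.sin(rad)
        self.segments.append(((self.x, self.y), (nx, ny)))
        self.x, self.y = nx, ny

    def teleport(self, x, y):
        """Jump to a new position instantly; nothing is drawn."""
        self.x, self.y = x, y

# Draw one stroke, jump elsewhere, draw another stroke.
turtle = ToyTurtle()
turtle.forward(2.0)
turtle.teleport(5.0, 5.0)
turtle.forward(1.0)
```

After these calls `turtle.segments` holds two segments, and the second one starts at the teleport target rather than where the first one ended, which is the "jump" visible between arcs.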

moffkalast•1d ago
That's pretty cool, but I feel like all of the LLM integrations with ROS so far have sort of entirely missed the point in terms of useful applications. Endless examples of models sending bare-bones Twist commands do a disservice to what LLMs are good at; it's like swatting flies with a bazooka in terms of compute used, too.

Getting the robot to move from point A to point B is largely a solved problem with traditional probabilistic methods, while niches where LLMs are the best fit I think are largely still unaddressed, e.g.:

- a pipeline for natural language commands to high level commands ("fetch me a beer" to [send nav2 goal to kitchen, get fridge detection from yolo, open fridge with moveit, detect beer with yolo, etc.])

- using a VLM to add semantic information to map areas, e.g. have the robot turn around 4 times in a room, and have the model determine what's there so it can reference it by location and even know where that kitchen and fridge are in the above example

- system monitoring, where an LLM looks at ros2 doctor, htop, topic hz, etc. and determines if something's crashed or isn't behaving properly, and returns a debug report or attempts to fix it with terminal commands

- handling recovery behaviours in general, since a lot of times when robots get stuck the resolution is simple, you just need something to take in the current situational information, reason about it, and pick one of the possible ways to resolve it
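
To make the first idea a little more concrete, here is a hedged sketch of how a model-generated plan could be represented and checked before anything is executed. Everything in it is an assumption for illustration: the `Step` type, the `ALLOWED` table, and the action strings are hypothetical, and nothing here calls nav2, YOLO, or MoveIt.

```python
from dataclasses import dataclass

# Subsystems named in the list above; the action names are hypothetical.
ALLOWED = {
    "nav2": {"send_goal"},
    "yolo": {"detect"},
    "moveit": {"open"},
}

@dataclass(frozen=True)
class Step:
    subsystem: str
    action: str
    target: str

def validate(plan):
    """Return a list of problems; an empty list means every step is allowed."""
    problems = []
    for i, step in enumerate(plan):
        if step.subsystem not in ALLOWED:
            problems.append(f"step {i}: unknown subsystem {step.subsystem!r}")
        elif step.action not in ALLOWED[step.subsystem]:
            problems.append(f"step {i}: {step.subsystem} cannot {step.action!r}")
    return problems

# What a model might return for "fetch me a beer" (hand-written here).
fetch_beer = [
    Step("nav2", "send_goal", "kitchen"),
    Step("yolo", "detect", "fridge"),
    Step("moveit", "open", "fridge"),
    Step("yolo", "detect", "beer"),
]
```

Checking the plan against a fixed table before execution is one possible design choice: the language model only proposes steps, and a small deterministic layer decides whether each step maps to something the robot can actually do.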

ponta17•1d ago
Thanks a lot for the thoughtful feedback — I really appreciate it!

I think there might be a small misunderstanding regarding how the LLM is actually being used here (and in many agent-based setups). The LLM itself isn’t directly executing twist commands or handling motion; it’s acting as a decision-maker that chooses from a set of callable tools (Python functions) based on the task description and intermediate results.

In this case, yes — one of the tools happens to publish Twist commands, but that’s just one of many modular tools the LLM can invoke. Whether it’s controlling motion or running object detection, from the LLM’s point of view it’s simply choosing which function to call next. So the computational load really depends on what the tool does internally — not the LLM’s reasoning process itself.

Of course, I agree with your broader point: we should push toward more meaningful high-level tasks where LLMs can orchestrate complex pipelines — and I think your examples (like fetch-a-beer or map annotation via VLMs) are spot-on.

My goal with this project was to explore that decision-making loop in a minimal, creative setting — kind of like a sandbox for LLM-agent behavior.

Actually, I’m currently working on something along those lines using a TurtleBot3. I’m planning to provide the agent with tools that let it scan obstacles via 3D LiDAR and recognize objects through image processing, so that it can make more context-aware decisions.

Really appreciate the push for deeper use cases — that’s definitely where I want to go next!
