Might poke around...
What makes something a good potential tool if the shell command can (technically) do anything, like running tests?
(Or is it just about which things require user permission vs not?)
Think of it as *semantic* wrappers, so the LLM can *decide* what action to take at any given moment given its context, the user prompt, and the available tool names and descriptions.
Creating wrappers for the most used basic tools can be useful even if they all pipe to terminal Unix commands.
Also giving it a specific knowledge base it can consult on demand, like a wiki of its own stack, etc.
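As a sketch (using the tool DSL from the RubyLLM gem the article builds on; the class name and description here are made up), a semantic wrapper for running tests could look like:

    # A semantic wrapper: the LLM sees a purpose-built tool with a clear
    # description instead of a generic shell tool, so it can decide when
    # that action fits the task at hand.
    class RunTests < RubyLLM::Tool
      description "Runs the project's test suite and returns the full output."

      def execute
        # Under the hood it still just pipes to a terminal command.
        `bundle exec rspec 2>&1`
      end
    end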
but yeah
The shell command can run anything really. When I tested it, it asked me multiple times to run the tests and then I could see it fixing the tests in iterations. Very interesting to observe.
If I were to improve this to be a better Ruby agent (which I don't plan to do, at least not yet), I would probably try adding some RSpec/Minitest-specific tools that would parse the test output and present it back to the LLM in a cleaned-up format.
I'm being serious. This sounds like a fun project but I have to turn my attention to other projects for the near future. This was more of an experiment for me, but it would be cool to see someone try out that idea.
(Like - what would it look like to clean up test results for an LLM?)
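One rough sketch of that, leaning on RSpec's built-in JSON formatter (field names follow its documented output; the exact shape can vary between RSpec versions): run the suite, keep only the failures, drop the backtraces.

    require "json"

    # Condense an RSpec run into a compact report for the LLM:
    # one line per failure, first line of the error message only.
    def cleaned_test_results
      raw = `bundle exec rspec --format json 2>/dev/null`
      report = JSON.parse(raw)
      failures = report["examples"].select { |e| e["status"] == "failed" }

      return "All #{report["examples"].size} tests passed." if failures.empty?

      failures.map do |e|
        "FAIL #{e["full_description"]} (#{e["file_path"]}:#{e["line_number"]})\n" \
          "  #{e.dig("exception", "message").to_s.lines.first&.strip}"
      end.join("\n")
    end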
Agents are the new PHP scripts!
Using .fetch with a default of nil is what's arguably not very useful.
IMO it's just a RuboCop rule to use .fetch, which is useful in general for exploding on missing configuration but not useful if a missing value is handled.
I took the code from RubyLLM configuration documentation. If you're pulling in a lot of config options and some have default values then there's value in symmetry. Using fetch with nil communicates clearly "This config, unlike those others, has no default value". But in my case, that benefit is not there so I think I'll change it to your suggestion when I touch the code again.
{}["key"] # KeyError in Python
OpenAI launched function calling two years ago, and it has been possible to create a simple coding agent ever since.
When I realised that it's mostly in the LLM, I found that a bit surprising. Also, since I'm not an AI engineer, I was happy to realise that my "regular programming" skills would be enough if I wanted to build a coding agent.
It sounds like you were aware of that for a while now, but I and a lot of other people weren't. :)
That was my motivation for writing the article.
https://blog.scottlogic.com/2023/05/04/langchain-mini.html
It is of course quite out of date now as LLMs have native tool use APIs.
However, it proves a similar point to yours, in most applications 99% of the power is within the LLM. The rest is often just simple plumbing.
Does that mean that it wouldn’t work with other LLMs?
E.g. I run Qwen3-14B locally; would that or any other model similar in size work?
Claude is just an example. I pulled the actual payloads by looking at what is actually being sent to Claude and what it responds with. It might vary slightly for other providers. I used Claude because I already had a key ready from trying it out before.
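In principle it should work with any model and provider that support tool calling. As a sketch, here's what pointing RubyLLM at a local OpenAI-compatible server could look like (the openai_api_base and assume_model_exists options are from recent RubyLLM versions, and the URL and model name are illustrative, so check the gem's docs):

    # Talk to a local OpenAI-compatible server (e.g. Ollama serving
    # Qwen3-14B) instead of Anthropic.
    RubyLLM.configure do |config|
      config.openai_api_base = "http://localhost:11434/v1" # illustrative URL
      config.openai_api_key  = "unused" # local servers often ignore the key
    end

    chat = RubyLLM.chat(
      model: "qwen3:14b",        # illustrative model name
      provider: :openai,         # speak the OpenAI wire format
      assume_model_exists: true  # skip RubyLLM's model registry check
    )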
I wonder if AIs that receive this information within their prompt might try to change the user’s mind as part of reaching their objective. Perhaps even in a dishonest way.
To be safe I’d write “error: Command cannot be executed at this time”, or “error: Authentication failure”. Unless you control the training set, or don’t care about the result.
Interesting times.
Either the user needs to be educated or we need to restrict what the user themselves can do.
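A sketch of the neutral-error idea in a shell tool's permission gate (user_approves? is a hypothetical confirmation prompt; the point is only that a denial surfaces as an environment failure, not as a user decision the model might try to argue with):

    def execute(command:)
      if user_approves?(command) # hypothetical interactive confirmation
        `#{command} 2>&1`
      else
        # Neutral wording: the model sees a failed command, not a refusal.
        "error: Command cannot be executed at this time"
      end
    end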
I always put extra effort into trying to make my blog posts shorter without sacrificing the quality. I think good technical writing should transfer the knowledge while requesting the least amount of time possible from the reader.
Language expressiveness is more about making the interface support more use cases while still being concise. And Ruby is really good at this, better than most languages.
I suppose we have to define expressiveness (conciseness, abstraction power, readability, flexibility?), because Ruby, for example, has human-readable expressiveness, Common Lisp has programmable expressiveness, and Forth has low-level expressiveness, so they all have some form of expressiveness.
I think Ruby, Crystal, Rebol 3, and even Nim and Lua have a similar form or type of expressiveness.
If you say that expressivity is the ability to implement a program in fewer lines of code, then Ruby is more expressive than most languages but less than, for example, Clojure. Well-written Clojure can be incredibly expressive. However, you can argue that for most people it's going to be less readable than a comparable Ruby program.
It's hard to talk about these qualities as there's a fair amount of subjectivity involved.
But yeah, you are right, there is too much subjectivity involved in all of this. :)
Anyways, I hope you know I did not mean to use any of my comments against you, I was just wondering.
It's an interesting conversation.
Basically, what I wanted to say was: "Here is an article on building a prototype coding agent in Ruby that explains how it works and the code is just 94 lines so you'll really be able to get a good understanding just by reading this article."
But that's a bit too long for a title. :)
When understanding a certain concept, it's very useful to be able to see just the code that's relevant to the concept. Ruby's language design enables that really well. Also, the Ruby community in general puts a lot of value on readability, which is why with Ruby it's often possible to eliminate almost all of the boilerplate while still keeping the code relatively flexible.
You can make it much more than just a coding agent. I personally use LLMs for data analysis by integrating them with some APIs.
These types of LLM systems are basically acting as a frontend that responds to very fuzzy user input. Such an LLM can reach out to your own defined functions (aka a backend).
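For example, a hypothetical backend function exposed through the same RubyLLM tool DSL as in the article (StudyProgram.search is a made-up application method):

    # The LLM frontend turns fuzzy user input ("something with numbers
    # and biology?") into a structured call to your own backend code.
    class LookupStudyPrograms < RubyLLM::Tool
      description "Searches study programs matching the given skill keywords."
      param :skills, desc: "Comma-separated skills, e.g. 'statistics, biology'"

      def execute(skills:)
        # In a real system this would query your own database or API.
        StudyProgram.search(skills.split(",").map(&:strip)).to_json
      end
    end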
The app space that I think is interesting, and that I'm working on, is combining these systems with some solid data to create advising/coaching/recommendation systems.
If you want some input on building something like that, my email is in my profile. Currently I'm playing around with an LLM chat interface with database access that gives study advice based on:
* HEXACO data (personality)
* Motivational data (self-determination theory)
* ESCO data (skills data)
* Descriptions of study programs described in ESCO data
If you want to chat about creating these systems, do reach out. I'm currently also looking for freelance opportunities based on things like this, as I think there are many LLM applications where we've only scratched the surface.
Code has always been read more than written. With AI it shifts even more towards reading, so language readability becomes even more important. And Ruby really shines there.
You’d want the opposite, a language with automatically checked constraints that is also easy to read.
Btw, it's not even about the RubyLLM gem. The gem abstracts away the calls to various LLM providers and gives a very clean, easy-to-use interface. But it's not what gives the "agentic magic". The magic is pretty much all in the underlying LLMs.
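For a feel of how thin that layer is (the calls follow the RubyLLM README; the model name is illustrative):

    require "ruby_llm"

    RubyLLM.configure do |config|
      config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
    end

    chat = RubyLLM.chat(model: "claude-3-5-sonnet") # illustrative model name
    chat.with_tool(RunTests) # register a tool class; the gem handles the wire format
    response = chat.ask("Run the test suite and summarise any failures.")
    puts response.content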
Seeing all the claims made by some closed-source agent products (remember the "world's first AI software engineer"?), I thought that a fair amount of the AI innovation was in the agent tool itself. So I was surprised when I realised that almost all of the "magic" parts come from the underlying LLM.
It's also kind of nice because it means that if you want to work on an agent product, you can do that even if you're not an AI-specialised engineer (which I'm not).
It's clearly not a full featured agent but the code is here and it's a nice starting point for a prototype: https://github.com/radanskoric/coding_agent
My best hope for it is that people will use it to experiment with their own ideas. So if you like it, please feel free to fork it. :)