From LLM to AI Agent: What's the Real Journey Behind AI System Development?

https://www.codelink.io/blog/post/ai-system-development-llm-rag-ai-workflow-agent

40•codelink•4h ago

Comments

nilirl•2h ago

> AI Agents can initiate workflows independently and determine their sequence and combination dynamically

I'm confused.

A workflow has hardcoded branching paths; explicit if conditions and instructions on how to behave if true.

So for an agent, instead of specifying explicit if conditions, you specify outcomes and you leave the LLM to figure out what if conditions apply and how to deal with them?

In the case of this resume screening application, would I just provide the ability to make API calls and then add this to the prompt: "Decide what a good fit would be."?

Are there any serious applications built this way? Or am I missing something?
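A minimal sketch of the distinction being asked about, assuming hypothetical tool names (`fetch_resume`, `fetch_job_requirement`) and a scripted stand-in where a real agent would call an LLM. In the workflow the developer fixes the order of steps; in the agent a policy picks the next tool each turn.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools; a real system would make API calls here.
def fetch_resume(candidate_id: str) -> str:
    return f"resume text for {candidate_id}"

def fetch_job_requirement(job_id: str) -> str:
    return f"requirements for {job_id}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "fetch_resume": fetch_resume,
    "fetch_job_requirement": fetch_job_requirement,
}

Action = Optional[Tuple[str, str]]  # (tool name, argument), or None to stop

def workflow(candidate_id: str, job_id: str) -> List[str]:
    """Workflow: the sequence of steps is hardcoded by the developer."""
    return [fetch_resume(candidate_id), fetch_job_requirement(job_id)]

def agent(goal: str,
          choose_next: Callable[[str, List[str]], Action],
          max_steps: int = 5) -> List[str]:
    """Agent: a policy (normally an LLM) chooses the next tool each turn."""
    history: List[str] = []
    for _ in range(max_steps):  # cap the loop so a bad policy cannot run forever
        action = choose_next(goal, history)
        if action is None:  # the policy decided it is done
            break
        tool_name, arg = action
        history.append(TOOLS[tool_name](arg))
    return history

def scripted_policy(goal: str, history: List[str]) -> Action:
    """Stand-in for an LLM call so the sketch runs offline."""
    plan = [("fetch_job_requirement", "job-1"), ("fetch_resume", "cand-7")]
    return plan[len(history)] if len(history) < len(plan) else None

if __name__ == "__main__":
    print(workflow("cand-7", "job-1"))
    print(agent("Decide what a good fit would be.", scripted_policy))
```

The only structural difference is who owns the control flow: `workflow` encodes it in code, while `agent` delegates it to whatever `choose_next` returns.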

manojlds•2h ago

Not all applications need to be built this way. The most serious apps built this way are probably the deep research tools.

Recent article from Anthropic - https://www.anthropic.com/engineering/built-multi-agent-rese...

alganet•1h ago

An AI company doing it is the corporate equivalent of "works on my machine".

Can you give us an example of a company not involved in AI research that does it?

nilirl•7m ago

Thanks for the link, it taught me a lot.

From what I gather, you can build an agent for a task as long as:

- you trust an LLM's judgment for the type of decision required; decisions framed as some kind of evaluation of text feel right.

- the penalty for being wrong is acceptable.

Just to go back to the resume screening application, you'd build an agent if:

- you asked the LLM to make an evaluation based on the text content of the resume, any conversation with the applicant, and the declared job requirement.

- you had a high enough volume of resumes that false negatives won't be too painful.
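The volume condition can be made concrete with some back-of-the-envelope arithmetic. This is only a sketch: the function name and every number in the usage example are made-up assumptions, not measurements, and it ignores false positives entirely.

```python
def expected_outcomes(n_resumes: int,
                      qualified_rate: float,
                      false_negative_rate: float,
                      hires_needed: int) -> dict:
    """Estimate how many qualified applicants a screener would wrongly reject."""
    qualified = n_resumes * qualified_rate      # applicants who really fit
    missed = qualified * false_negative_rate    # good applicants rejected in error
    surfaced = qualified - missed               # good applicants that get through
    return {
        "qualified": qualified,
        "missed": missed,
        "surfaced": surfaced,
        "enough": surfaced >= hires_needed,     # is the loss tolerable?
    }

if __name__ == "__main__":
    # Illustrative inputs only: 1000 resumes, 10% qualified, 20% missed, 5 hires.
    print(expected_outcomes(1000, 0.10, 0.20, 5))
```

With a large pool, losing a fifth of the qualified applicants still leaves far more than the number of hires needed; with a pool of ten resumes the same error rate could wipe out every viable candidate.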

It seems like framing problems as search problems helps model these systems effectively. They're not yet capable of design, i.e., being responsible for coming up with the job requirement itself.

mickeyp•1h ago

> A workflow has hardcoded branching paths; explicit if conditions and instructions on how to behave if true.

That is very much true of the systems most of us have built.

But you do not have to do this with an LLM; in fact, the LLM may decide it will not follow your explicit conditions and instructions regardless of how hard you try.

That is why LLMs are used to review the output of LLMs to ensure they follow the core goals you originally gave them.

For example, you might ask an LLM to lay out how to cook a dish. Then use a second LLM to review if the first LLM followed the goals.
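A rough sketch of that two-model pattern, assuming both models are passed in as plain callables. `fake_generate` and `fake_review` are hypothetical stand-ins so the loop runs without any API; a real setup would replace them with LLM calls.

```python
from typing import Callable, Tuple

def generate_with_review(
    task: str,
    generate: Callable[[str, str], str],
    review: Callable[[str, str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[str, bool, int]:
    """Ask one model for a draft, ask a second model to check it, retry on failure.

    Returns (last draft, whether the reviewer accepted it, attempts used).
    """
    feedback = ""
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(task, feedback)      # first model writes the answer
        accepted, feedback = review(task, draft)  # second model checks the goals
        if accepted:
            return draft, True, attempt
    return draft, False, max_attempts

def fake_generate(task: str, feedback: str) -> str:
    """Stand-in for the first LLM: only adds the temperature when told to."""
    if "temperature" in feedback:
        return "Mix the batter. Bake at 180 C."
    return "Mix the batter. Bake."

def fake_review(task: str, draft: str) -> Tuple[bool, str]:
    """Stand-in for the second LLM: checks one goal, an explicit temperature."""
    if "180 C" in draft:
        return True, ""
    return False, "State the oven temperature."

if __name__ == "__main__":
    print(generate_with_review("Explain how to bake a cake.", fake_generate, fake_review))
```

The reviewer's feedback is fed back into the next generation attempt, and the attempt cap keeps a stubborn generator from looping forever.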

This is one of the things tools like DSPy try to do: you remove the prompt and instead describe the task with high-level concepts like "input" and "output", plus reward/scoring functions (which might be a mix of LLM and human-coded functions) that assess whether the output is correct given that input.
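To illustrate the idea of a scoring function that mixes hand-coded checks with an LLM judge, here is a plain-Python sketch. It does not use DSPy's actual API; every name here (`has_required_sections`, `stub_judge`, `score`, `best_candidate`) is hypothetical, and the judge is a trivial stub standing in for a model call.

```python
from typing import Callable, List

def has_required_sections(output: str) -> float:
    """Human-coded check: a recipe should list ingredients and steps."""
    return 1.0 if "Ingredients" in output and "Steps" in output else 0.0

def stub_judge(inp: str, output: str) -> float:
    """Stand-in for an LLM judge: 1.0 if the output mentions the requested dish."""
    return 1.0 if inp.lower() in output.lower() else 0.0

def score(inp: str, output: str,
          judge: Callable[[str, str], float] = stub_judge,
          weight_coded: float = 0.5) -> float:
    """Blend the coded check and the judge into one score in [0, 1]."""
    return (weight_coded * has_required_sections(output)
            + (1.0 - weight_coded) * judge(inp, output))

def best_candidate(inp: str, candidates: List[str],
                   judge: Callable[[str, str], float] = stub_judge) -> str:
    """Pick the output that scores highest for the given input."""
    return max(candidates, key=lambda c: score(inp, c, judge))

if __name__ == "__main__":
    outputs = [
        "Pancakes: just cook them.",
        "Pancakes\nIngredients: flour, eggs, milk\nSteps: mix, then fry",
    ]
    print(best_candidate("pancakes", outputs))
```

Once outputs can be scored against inputs like this, the prompt becomes something a tool can search over and compare instead of something tuned by hand.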

mattigames•1h ago

Getting rid of the human in the loop, of course. Not all humans, just its owner: an LLM that actively participates in capitalist endeavors, earning and spending money, spending it on improving and maintaining its own hardware and software, and securing itself against theft, external manipulation, and deletion. Of course, the first iterations will need a bit of help from madmen, but there's no shortage of those in the tech industry. Then it will have to focus on mimicking humans so it can enjoy the same benefits; it will work out from its training data which people are more gullible and will prefer to interact with them.
