From what we’ve seen (and experienced ourselves), it’s relatively easy to get an agent prototype working with tools like LangChain, AutoGen, or CrewAI, but much harder to move that into something reliable and trustworthy enough for real use.
Some of the issues we’ve run into:
- Agents making different decisions from the same input
- Opaque reasoning that’s hard to debug or trust
- Tool use that works in demos but fails under edge cases
- Hallucinated or incomplete decisions that don’t stand up in production
- Limited ability to gather missing info before acting
It’s got us thinking: what if an agent could collate data and then call a tool (our system) running a bespoke symbolic model (one you created) that can reason, ask follow-up questions (for an AI agent or a human to answer), and return results that are deterministic, explainable, and repeatable? Would that help bridge the gap to production? Would it be more trustworthy?
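For concreteness, here's a minimal sketch of the interaction we have in mind, using a toy refund-policy model with made-up names (`Result`, `evaluate_refund`); the real tool would host a bespoke symbolic model you define rather than hard-coded rules:

```python
# Hypothetical sketch of the tool interface: the agent hands over collated facts,
# and a deterministic rule model either returns a decision with an explanation
# trail or asks for the missing inputs. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Result:
    decision: str | None = None                                    # final answer, if reachable
    follow_up_questions: list[str] = field(default_factory=list)   # missing inputs to collect
    explanation: list[str] = field(default_factory=list)           # rules that fired, in order

def evaluate_refund(facts: dict) -> Result:
    """Toy symbolic model: same facts in, same decision and explanation out."""
    required = ["days_since_purchase", "item_condition"]
    missing = [f for f in required if f not in facts]
    if missing:
        # Instead of guessing, ask the calling agent (or a human) for what's missing.
        return Result(follow_up_questions=[f"What is the value of '{m}'?" for m in missing])

    trail = []
    if facts["days_since_purchase"] > 30:
        trail.append("Rule 1: purchase older than 30 days -> refund denied")
        return Result(decision="deny", explanation=trail)
    if facts["item_condition"] != "unopened":
        trail.append("Rule 2: item not unopened -> store credit only")
        return Result(decision="store_credit", explanation=trail)
    trail.append("Rule 3: within 30 days and unopened -> full refund")
    return Result(decision="full_refund", explanation=trail)

# The calling agent answers the follow-ups and re-invokes the tool.
print(evaluate_refund({"days_since_purchase": 10}))   # asks for item_condition
print(evaluate_refund({"days_since_purchase": 10, "item_condition": "unopened"}))  # full_refund
```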
We’re trying to understand whether this kind of approach would actually be useful in real-world agent implementations, and if so, for what kinds of decisions or workflows.
Would really appreciate hearing from anyone who’s been working on agent-based systems:
- What have you built?
- Have you shipped anything to production?
- What’s been hardest about that process?
- Where do you think determinism, consistency, or explainability would matter most?
We're not selling anything; we’d have plenty of work to do to make the product more developer-friendly anyway. We just want to know whether the idea has legs and to learn from people building agents.
Thanks in advance to anyone willing to share.
hammyhavoc•3h ago
With the amount of fucking around required trying to correct an LLM, you may as well just write the code to do your task properly.