
Ask HN: What's blocking your AI agents from moving beyond proof-of-concept?

1•ns-148•3h ago
We’ve been working on decision automation tech that’s mostly been used in enterprise for building systems that behave like domain experts. Think models based on structured logic and knowledge, which can be queried to provide decisions that are auditable and explainable. Recently, we’ve started wondering whether this could help with a different kind of problem: getting LLM-based agents into production.

From what we’ve seen (and experienced ourselves), it’s relatively easy to get an agent prototype working with tools like LangChain, AutoGen, or CrewAI, but much harder to move that into something reliable and trustworthy enough for real use.

Some of the issues we’ve felt:

- Agents making different decisions from the same input

- Opaque reasoning that’s hard to debug or trust

- Tool use that works in demos but fails under edge cases

- Hallucinated or incomplete decisions that don’t stand up in production

- Limited ability to gather missing info before acting

It’s got us thinking: if an agent could collate data and then call a tool (our system) with a bespoke symbolic model (one that you created) that could reason, ask follow-up questions (for an AI agent or a human to answer), and provide results that are deterministic, explainable, and repeatable, would that help bridge the gap to production? Would it be more trustworthy?
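To make that loop concrete, here is a minimal Python sketch of the pattern described above: a rule-based tool that either returns a decision with an explanation trace or asks for one missing fact. Everything in it is a hypothetical illustration, not an actual product API: the `Rule` and `Outcome` types, the `decide` and `run_with_agent` functions, and the refund policy are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

Facts = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """One auditable rule: the facts it needs, a condition, and a decision."""
    name: str
    needs: tuple[str, ...]
    condition: Callable[[Facts], bool]
    decision: str


@dataclass
class Outcome:
    decision: Optional[str] = None  # set when a rule fired
    ask: Optional[str] = None       # set when a required fact is missing
    trace: list[str] = field(default_factory=list)  # human-readable explanation


def decide(rules: list[Rule], facts: Facts) -> Outcome:
    """Evaluate rules in a fixed order, so the same facts give the same outcome."""
    trace: list[str] = []
    for rule in rules:
        missing = [key for key in rule.needs if key not in facts]
        if missing:
            # Stop and request the first missing fact instead of guessing.
            trace.append(f"{rule.name}: need '{missing[0]}'")
            return Outcome(ask=missing[0], trace=trace)
        if rule.condition(facts):
            trace.append(f"{rule.name}: matched -> {rule.decision}")
            return Outcome(decision=rule.decision, trace=trace)
        trace.append(f"{rule.name}: not matched")
    # No rule fired: hand the case to a person rather than invent an answer.
    return Outcome(decision="escalate_to_human", trace=trace)


# Hypothetical refund policy, purely for illustration.
REFUND_RULES = [
    Rule("damaged_item", ("item_damaged",),
         lambda f: f["item_damaged"] is True, "approve_refund"),
    Rule("within_window", ("days_since_purchase",),
         lambda f: f["days_since_purchase"] <= 30, "approve_refund"),
    Rule("outside_window", ("days_since_purchase",),
         lambda f: f["days_since_purchase"] > 30, "deny_refund"),
]


def run_with_agent(rules: list[Rule], facts: Facts,
                   answer: Callable[[str], Any]) -> Outcome:
    """Keep asking the caller (an LLM agent or a human) until a decision is reached."""
    known = dict(facts)
    while True:
        outcome = decide(rules, known)
        if outcome.ask is None:
            return outcome
        known[outcome.ask] = answer(outcome.ask)
```

In this sketch the LLM agent would only be responsible for supplying facts through `answer`; the decision and its `trace` come from the fixed rule order, so repeated calls with the same facts return the same result.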

We’re trying to understand whether this kind of approach would actually be useful in real-world agent implementations, and if so, for what kinds of decisions or workflows.

Would really appreciate hearing from anyone who’s been working on agent-based systems:

- What have you built?

- Have you shipped anything to production?

- What’s been hardest about that process?

- Where do you think determinism, consistency, or explainability would matter most?

We’re not selling anything; we’d have lots of work to do to make the product more developer-friendly anyway. We just want to know whether the idea has legs and to learn from people building agents.

Thanks in advance to anyone willing to share.

Comments

hammyhavoc•3h ago
That they're complete dogshit in capability, reliability, and consistency, and vulnerable to malicious prompt fiddling. Wholly inappropriate for production, and for what most people are using them for, there are infinitely better solutions than LLMs.

With the amount of fucking around required trying to correct an LLM, you may as well just write the code to do your task properly.
