
What do y'all think – weekend project

2•Venkymatam•5h ago
Today, many software teams add AI features to their apps (customer support bots, writing tools, internal copilots) by writing "prompts" directly into their code. These prompts tell the AI what to say or do. But once the product is live, there's no visibility into what the AI is actually saying to users, how much it's costing, or when things silently go wrong: hallucinations, tone drift, token overuse.

I'm hoping to build a solution that helps teams keep these AI features healthy and reliable in production. They'd have a central database to store all their prompts, test different versions across multiple AI models, compare costs and outputs, and, most importantly, evaluate the "human touch" of the responses. The platform would enable A/B testing across prompt versions to identify which responses perform best, whether in terms of marketing impact, sales conversion, engagement, or overall usage. It would track every AI response, detect unusual or risky behavior, and suggest fixes or even apply them automatically. Think of it as a real-time quality control system for the AI layer of your product.

The system would be powered by lightweight autonomous agents that watch every model call, flag anomalies, and make context-aware recommendations, or take direct action when it's safe to do so. These agents would monitor prompt behavior over time, compare version performance, and optimize for clarity, safety, and cost.

Technically, it's a real-time observability and correction runtime: like Datadog + LaunchDarkly, but built specifically for managing AI prompts and agentic behavior in production.
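To make the idea concrete, here is a minimal sketch of three of the pieces described above: a central prompt store, A/B assignment across prompt versions, and flagging token overuse. Everything in it is an assumption for illustration. The names `PromptRegistry`, `CallLog`, `pick_version`, and `record` are hypothetical, storage is in memory, and the anomaly check is a plain z-score on token counts, not whatever the real system would use.

```python
import hashlib
import statistics
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """Hypothetical in-memory stand-in for the central prompt database."""
    versions: dict = field(default_factory=dict)  # prompt name -> list of texts

    def register(self, name: str, text: str) -> int:
        """Store a new version of a prompt and return its version index."""
        self.versions.setdefault(name, []).append(text)
        return len(self.versions[name]) - 1

    def pick_version(self, name: str, user_id: str) -> int:
        """Assign a user to one version; the same user always gets the same one."""
        digest = hashlib.sha256(f"{name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % len(self.versions[name])


@dataclass
class CallLog:
    """Hypothetical per-version log of token usage with a simple outlier check."""
    history: dict = field(default_factory=dict)  # (name, version) -> token counts

    def record(self, name: str, version: int, tokens: int,
               threshold: float = 3.0, min_samples: int = 5) -> bool:
        """Log one model call; return True if its token count looks anomalous.

        Assumption: "anomalous" means more than `threshold` standard
        deviations above the mean of earlier calls for this prompt version.
        """
        past = self.history.setdefault((name, version), [])
        anomalous = False
        if len(past) >= min_samples:
            mean = statistics.mean(past)
            stdev = statistics.pstdev(past)
            anomalous = stdev > 0 and (tokens - mean) / stdev > threshold
        past.append(tokens)
        return anomalous


registry = PromptRegistry()
registry.register("support_bot", "You are a friendly support agent.")
registry.register("support_bot", "You are a concise support agent.")
version = registry.pick_version("support_bot", user_id="user-42")

log = CallLog()
for tokens in (100, 102, 98, 101, 99):
    log.record("support_bot", version, tokens)
flagged = log.record("support_bot", version, 1000)  # far above the usual ~100
```

Hashing the user ID keeps each user on one prompt version without storing assignments, which is the usual trick behind feature-flag style bucketing. A real version would also log the response text and cost per call, and would need a more robust anomaly check than a z-score on a handful of samples.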

Comments

airylizard•1h ago
I like the idea. The TSCE framework should make the individual agents more reliable and deterministic: https://github.com/AutomationOptimization/tsce_demo

An MCP Ser

1•anahas•2m ago•0 comments

ECE442: Network Science Analytics

https://www.hajim.rochester.edu/ece/sites/gmateos/ECE442.html
1•teleforce•8m ago•0 comments

Claude Code Alternative to CodeX

https://cloudcoding.ai/
2•sean_•12m ago•0 comments

Show HN: MentionedBy AI is now live

https://mentionedby.ai/
1•nikin_mat•12m ago•0 comments

No country for pacifists- What Operation Sindoor reveals about changing opinions

https://www.thehindu.com/society/operation-sindoor-india-pakistan-war-conflict-no-country-for-pacifists/article69574736.ece
1•thunderbong•15m ago•0 comments

Game Hacking – Valve Anti-Cheat (VAC)

https://codeneverdies.github.io/posts/gh-2/
1•msz-g-w•15m ago•0 comments

Hackers Weaponize KeePass Password Manager

https://gbhackers.com/hackers-weaponize-keepass-password-manager-to-spread-malware/
1•mosiuerbarso•17m ago•0 comments

The AI Hype Is Designed to Exploit Your Insecurity

https://tawandamunongo.dev/posts/2025/05/ai-hype-insecurity
1•elcapithanos•21m ago•0 comments

II-Medical – Edge MIT Licensed ChatGPT Level Medical AI

https://ii.inc/web/blog/post/ii-medical
2•emadm•27m ago•0 comments

Show HN: Bulk Crop Images

https://bulkcropimages.com/
2•artiomyak•28m ago•0 comments

AI Keeps Making Beautiful Images. That's the Problem

https://praveen.io/posts/ai-keeps-making-beautiful-images-thats-exactly-the-problem
2•praveeninpublic•32m ago•1 comment

Ask HN: What is the "Like" in a context of social network web-sites?

1•eimrine•32m ago•1 comment

Mexican Navy ship crashes into Brooklyn Bridge leaving two people dead

https://www.theguardian.com/us-news/2025/may/18/mexican-navy-ship-hits-brooklyn-bridge-during-promotional-tour
25•teleforce•33m ago•1 comment

Is the University of Austin Betraying Its Founding Principles?

https://quillette.com/2025/05/16/is-the-university-of-austin-betraying-its-founding-principles/
2•SubiculumCode•35m ago•0 comments

AI Has Changed My Job

https://www.bloomberg.com/news/features/2025-05-14/how-ai-is-changing-american-jobs-from-teachers-to-nurses
5•danso•37m ago•1 comment

Show HN: Keep track of why you muted someone on X

https://github.com/klntsky/x-user-note
3•klntsky•37m ago•0 comments

Ask HN: Why don't more startups use API gateways?

3•smrth•38m ago•1 comment

Mountains with a Fuji View

https://www.emgoto.com/fuji-view-hikes/
1•nivethan•54m ago•0 comments

Show HN: EndurePath – Fast emotion tracking, tiny boosters, and gamified growth

https://endurepath.com
1•lectoratium•55m ago•0 comments

A MiniKvm to rule all machines remotely

https://www.andrea-allievi.com/blog/a-minikvm-to-rule-all-machines-remotely/
2•transpute•1h ago•0 comments

Google's AlphaEvolve: ToolKami Style

https://toolkami.com/alphaevolve-toolkami-style/
3•SafeDusk•1h ago•1 comment

AniGen – The First Short-Form AI Animation Competition

https://komiko.app/anigen-competition
2•PaulineGar•1h ago•0 comments

Ask HN: Do people actually pay for small web tools?

6•scratchyone•1h ago•3 comments

Google says Linux Terminal VM feature isn't replacement for Android desktop mode

https://www.androidauthority.com/android-linux-terminal-purpose-3535765/
4•transpute•1h ago•0 comments

White tiger controversy: Zoos shouldn't raise these animals. (2012)

https://slate.com/technology/2012/12/white-tiger-controversy-zoos-shouldnt-raise-these-inbred-ecologically-irrelevant-animals.html
3•Tomte•1h ago•0 comments

What Does 'Off the Record' Mean? (2018)

https://www.nytimes.com/2018/08/02/reader-center/off-the-record-meaning.html
3•Tomte•1h ago•0 comments

Last NeXT Computer Website

https://web.archive.org/web/20001215082000/http://www.apple.com/enterprise/
2•behnamoh•1h ago•0 comments

Why Gamers Are Furious over Take-Two and 2K's New Terms of Service

https://medium.com/@DarkRa/why-gamers-are-furious-over-take-two-and-2ks-new-terms-of-service-051e7a6a5594
2•Throwthrowbob•2h ago•0 comments

Fearmongering

https://en.wikipedia.org/wiki/Fearmongering
13•downboots•2h ago•0 comments

There Won't Be Spam Ads in the Future – Thanks to AI

2•mrkrieg•2h ago•7 comments