Parents say ChatGPT got their son killed with bad advice on party drugs

https://www.theverge.com/ai-artificial-intelligence/928691/openai-chatgpt-wrongful-death-overdose
1•1vuio0pswjnm7•55s ago•0 comments

The Path Not Taken: Duality in Reasoning about Program Execution

https://arxiv.org/abs/2604.20917
1•PaulHoule•1m ago•0 comments

The tipping point: what happens when deaths outnumber births?

https://www.theguardian.com/world/ng-interactive/2026/may/02/what-happens-when-deaths-outnumber-b...
2•rwmj•1m ago•0 comments

AI Will Make the Academic Article Obsolete

https://www.chronicle.com/article/ai-will-make-the-academic-article-obsolete
1•Hooke•1m ago•0 comments

SpaceX and Google Are in Talks to Launch Data Centers in Orbit

https://www.wsj.com/tech/spacex-google-in-talks-to-explore-data-centers-in-orbit-7b7799e2
1•simonebrunozzi•2m ago•0 comments

Show HN: CircadianLab – Browser-based radiosity lighting simulator (WebGPU)

https://www.innerscene.com/tools/circadian-lab
1•jclarkcom•4m ago•0 comments

Show HN: DualDoc – A text editor for the AI age

https://www.dualdoc.xyz
1•jdbiggs•6m ago•0 comments

Show HN: Cutting our credential bulk-import from 17 minutes to 2

https://certscore.org/blog/from-17-min-to-2-min
1•rajatrv•6m ago•0 comments

Show HN: Gigacatalyst – Extend your SaaS with an embedded AI builder

1•namanyayg•7m ago•0 comments

Exim 4.99.3 – SMTP Mail Server – Message Transfer Agent (MTA)

https://lists.exim.org/lurker/message/20260512.145700.55e61dbb.en.html
1•neustradamus•7m ago•0 comments

The Shared Feeling of Being Harvested by the Future

https://www.nytimes.com/2026/05/12/opinion/us-china-ai-future.html
1•littlexsparkee•8m ago•0 comments

MS SQL Arrow

https://devblogs.microsoft.com/python/introducing-apache-arrow-support-in-mssql-python/
2•gritspants•9m ago•1 comments

Trump Moves to Open National Parks, Other Federal Lands, to More Hunting

https://cowboystatedaily.com/2026/05/11/trump-moves-to-open-national-parks-other-federal-lands-to...
2•Bender•10m ago•0 comments

Microsoft researchers find AI models and agents can't handle long-running tasks

https://www.theregister.com/ai-ml/2026/05/11/microsoft-researchers-find-ai-models-and-agents-cant...
2•Bender•10m ago•1 comments

Amazon employees are "tokenmaxxing" due to pressure to use AI tools

https://arstechnica.com/ai/2026/05/amazon-employees-are-tokenmaxxing-due-to-pressure-to-use-ai-to...
6•Bender•11m ago•0 comments

Spotify Is Down

1•circadian•12m ago•0 comments

Show HN: Atlas- Local-first AI code reviewer for Claude, Codex, OpenCode, Cursor

https://www.atlasengine.dev/
1•avinashpdy•13m ago•0 comments

Natural Language Autoencoders: Inside Claude's Activations

https://philippdubach.com/posts/what-claude-thinks-but-doesnt-say/
1•7777777phil•13m ago•0 comments

Beewolf

https://en.wikipedia.org/wiki/Beewolf
1•dr_girlfriend•14m ago•0 comments

Linux bitten by second vulnerability in as many weeks

https://arstechnica.com/security/2026/05/linux-bitten-by-second-severe-vulnerability-in-as-many-w...
1•Brajeshwar•14m ago•0 comments

The unspoken cost of bad communication

https://productnow.ai/blogs/silent-tax-of-bad-communication
1•kadhirvelm•15m ago•1 comments

The agent principal-agent problem

https://crawshaw.io/blog/agent-principal-agent
1•brimtown•15m ago•0 comments

Netflix sued by Texas AG for alleged surveillance, addictive features

https://www.politico.com/news/2026/05/11/netflix-sued-by-texas-ag-for-alleged-surveillance-addict...
2•1vuio0pswjnm7•16m ago•0 comments

The 3-people locked in the room – an experimentation

https://medium.com/doctolib/the-3-people-locked-in-the-room-an-experimentation-4c7a9b2d6def
1•rognjen•16m ago•0 comments

Red Button, Blue Button: Teaching humans and AI to Supercooperate

https://softmax.com/blog/red-button-blue-button
2•yatharth•16m ago•2 comments

California Mayor Resigns, Admitting to Being an Agent for China

https://time.com/article/2026/05/12/arcadia-california-mayor-eileen-wang-agent-china/
4•luispa•17m ago•0 comments

How to Achieve Serverless GPUs

https://modal.com/blog/truly-serverless-gpus
5•charles_irl•17m ago•0 comments

Harvard Students Furious over Plan to Crack Down on Grades

https://www.bloomberg.com/news/articles/2026-05-12/harvard-grade-inflation-plan-to-limit-a-marks-...
1•helsinkiandrew•17m ago•0 comments

Sam Altman takes the stand to testify in Musk suit: Live updates

https://www.cnbc.com/2026/05/12/openai-trial-updates-sam-altman-set-to-testify-in-musk-suit.html
1•1vuio0pswjnm7•17m ago•0 comments

Where Are All the Data Centers?

https://www.wheresyoured.at/where-are-all-the-data-centers/
5•MattRogish•18m ago•1 comments

Launch HN: Voker (YC S24) – Analytics for AI Agents

https://voker.ai
9•ttpost•55m ago
Hey HN, we're Alex and Tyler, co-founders of Voker.ai (https://voker.ai/), an agent analytics platform for AI product teams. Voker gives full visibility into what users are asking of your agents, and whether your agents are delivering, without having to dig through logs. Our main product is a lightweight SDK that is LLM stack agnostic and purpose-built for agent products.

Agent engineers and AI product teams don’t have the right level of visibility into agent performance in production, which results in bad user experiences, churn, and hundreds of hours wasted on spot checks to find and debug issues with agent configurations.

Demo: https://www.tella.tv/video/vid_cmoukcsk1000i07jgb4j65u67/vie...

We recently conducted a survey of YC Founders and 90%+ of respondents said that the only way they know if their Agents are failing users in production is by hearing complaints from customers. They push a prompt change hoping that it fixes the problem and doesn’t break something somewhere else, and the cycle repeats.

We saw tons of observability and evals products popping up to try to address these problems, but we still felt like something was missing in the agent monitoring stack. Observability is good for individual trace debugging but is only accessible to engineers. Evals are good for testing known issues, but don't give insights into trends that teams don’t expect, so engineers are always playing catch-up. Traditional product analytics tools do a good job tracking clicks and pageviews across your product surface but weren’t built from the ground up for agent products. Knowing what users want out of agents, and whether the agent delivered, requires specific conversational intelligence / unstructured data processing techniques.

We came up with the agent analytics primitives of Intents, Corrections, and Resolutions to describe something pretty much all conversational agents have in common: a user will always come to an agent with an intent, the user might have to correct the agent on the way to getting their intent resolved, and hopefully every intent a user has is eventually resolved by the agent. Voker processes LLM calls by automatically annotating individual conversations and picking out user intents and corrections. Voker takes these and uses LLMs and hierarchical text classification to create dynamic categories that give higher level insights, so you don’t have to read individual conversations to know the main usage patterns across your users.
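
As a rough illustration of the three primitives described above, an annotated intent could be modelled along these lines. This is only a sketch: the `Intent` class, its field names, and `count_by_level` are hypothetical and are not taken from Voker's SDK or schema. The category is stored as a broad-to-narrow path so that counts can be rolled up at any level of the hierarchy.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Intent:
    """One thing a user wanted from the agent (hypothetical schema)."""
    text: str                       # the user's request, as annotated
    category_path: tuple[str, ...]  # hierarchical label, broad to narrow
    corrections: list[str] = field(default_factory=list)  # user pushback along the way
    resolved: bool = False          # did the agent eventually deliver?


def count_by_level(intents: list[Intent], depth: int) -> Counter:
    """Count intents grouped by the first `depth` levels of their category."""
    return Counter(i.category_path[:depth] for i in intents)


# Made-up example data, for illustration only.
intents = [
    Intent("cancel my plan", ("billing", "cancellation"), ["no, cancel, don't pause"], True),
    Intent("why was I charged twice?", ("billing", "dispute")),
    Intent("export invoices as CSV", ("export", "invoices"), resolved=True),
]

top_level = count_by_level(intents, depth=1)
```

Here `top_level` groups the two billing intents together, while `count_by_level(intents, depth=2)` keeps cancellation and dispute apart.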

The most common substitute solution we’ve seen is uploading obs logs to Claude or ChatGPT and asking for summary insights. There are a few problems with this, mainly that LLMs aren’t good at math or data science, so you don’t get accurate or consistent statistics. It’s highly likely that the LLM overfits to some insights and underfits to others. The LLM isn’t programmatically reading and classifying each individual session or interaction. This is why we don’t use LLMs for any of our core data engineering (processing events, calculating statistics), so the analytics we produce are consistent, reproducible, and accurate.

We have a publicly available, lightweight SDK that wraps LLM calls to OpenAI, Anthropic and Gemini in Python and TypeScript. Voker handles the data engineering to turn raw data into usable analytics primitives and higher level insights.

Free tier: 2,000 events / mo, requires email signup. Paid plans start at $80/mo with a 30 day free trial.
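
On the reproducibility point above: once each session carries a label, the statistics can be computed with ordinary code, so the same input always yields the same numbers. The following is a minimal sketch under that assumption. The event tuples and the `category_stats` function are made up for illustration and do not describe Voker's real pipeline.

```python
from collections import Counter

# Hypothetical labelled events: (intent category, was the intent resolved?)
events = [
    ("billing", True),
    ("billing", False),
    ("export", True),
    ("billing", True),
]


def category_stats(events):
    """Return per-category counts and resolution rates, computed deterministically."""
    totals = Counter()
    resolved = Counter()
    for category, ok in events:
        totals[category] += 1
        if ok:
            resolved[category] += 1
    return {
        category: {
            "count": totals[category],
            "resolution_rate": resolved[category] / totals[category],
        }
        for category in sorted(totals)
    }


stats = category_stats(events)
```

Running `category_stats` twice on the same events gives identical results, which is the property an LLM-generated summary of raw logs does not guarantee.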

We'd love to hear how you're currently detecting trends, and if you try Voker, tell us what part of our analysis is valuable, and what still feels missing. Thanks for reading, and we’re looking forward to your thoughts in the comments!

Comments

akslp2080•19m ago
How is it different from Langfuse? Sorry if I'm off track, but Langfuse also provides detailed tracing of agentic behavior and decisions.
ttpost•7m ago
We get this question a lot! We work hand-in-hand with obs tools like Langfuse. Langfuse is great for debugging technical issues on individual traces like timing conditions that resulted in failed API calls.

Voker focuses on product, business and user outcomes, such as which intents users bring to your agent that you might not expect. We're built for the whole product team, whereas Langfuse focuses on engineers specifically.

One way to think about it would be: a PM notices in Voker that a new intent category is coming up frequently and the agent isn't handling it well. The PM can dig into the data with visualizations or our conversation reconstructions. Once they confirm it's a real issue worth addressing, they can hand their investigation to the AI engineer, who can use Voker AND Langfuse to debug and implement a fix/improvement.

Ozzie_osman•9m ago
If the team is here, would love to understand how it compares to something like Amplitude's agent analytics (https://amplitude.com/ai-agents).
ttpost•3m ago
Yeah, this is a confusing one on wording. TLDR: Amplitude is analytics for your web/product data, Voker is analytics for your agent data.

We call Amplitude's feature an "AI Analyst". Essentially, Amplitude is layering an LLM copilot on top of their own product, so you don't have to click the buttons or write reports to get insights.

We're an analytics platform built for tracking your agents. Different products with different problems they're solving.

Not sure if this helps, but essentially Amplitude could use Voker to track how well their AI Analyst agent product is actually working!