Ask HN: How are you scaling AI agents reliably in production?

7•nivedit-jain•5mo ago
I’m looking to learn from people running agents beyond demos. If you have a production setup, would you share what works and what broke?

What I’m most curious about:

- Orchestrator choice and why: LangGraph, Temporal, Airflow, Prefect, custom queues.

- State and checkpointing: where do you persist steps, how do you replay, how do you handle schema changes.

- Concurrency control: parallel tool calls, backpressure, timeouts, idempotency for retries.

- Autoscaling and cost: policies that kept latency and spend sane, spot vs on-demand, GPU sharing.

- Memory and retrieval: vector DB vs KV store, eviction policies, preventing stale context.

- Observability: tracing, metrics, evals that actually predicted incidents.

- Safety and isolation: sandboxing tools, rate limits, abuse filters, PII handling.

- A war story: the incident that taught you a lesson and the fix.

Context (so it’s not a drive-by): small team, Python, k8s, MongoDB for state, Redis for queues, everything custom, experimenting with LangGraph and Temporal. Happy to share configs and trade notes in the comments.

Answer any subset. Even a quick sketch of your stack and one gotcha would help others reading this. Thanks!

Comments

prohobo•5mo ago
I'm using LangGraph for my app, which is an AI ecommerce analyst with two modes (report builder and chatbot). It consumes API data and visitor sessions to build a giant report, then compresses it back down to actionable insights for online store owners. The report runs for each customer once a day, queued up with BullMQ.

It's not super complex; in fact, that seems to be the only way to get a more or less reliable agent right now. Keep the graph small, the prompts concise, the nodes and tools atomic in function, etc.

* Orchestrator choice and why: LangGraph because it seems the most robust and well established from my research at the time (about 6 months ago). It has decent documentation, and includes community-built graphs and nodes. People complain a lot about LangChain, but the general vibe around LangGraph is that it's a maturely designed framework.

* State and checkpointing: I'm using a memory checkpointer after every state change. Why? Reports can just re-run at negligible cost. For chats, my users' requirements just don't need persistent thread storage. Persistence is better managed through RAG entries.

* Concurrency control: I don't use parallel tool calling for most of my agents because it adds too much instability to graph execution. This is actually fine for chatbots and my app's reporting system (which doesn't need many tools), but I can see this being an issue for more complex agents.

* Autoscaling and cost: Well I use foundation models, not local ones. I swap out models for various tasks and customer subscription levels (e.g., gpt-5-nano with low reasoning effort for trial users, and gpt-5-mini for paying customers).

* Memory and retrieval: Vector DB for RAG tooling, normal DB for everything else. Sometimes I use the same Postgres database for both vector and normal data, to simplify architecture. I load raw contextual data into prompts (JSON dump). In my app's case, I use a 30-day rolling window of store data so I never keep anything longer than 30 days. I instead keep distilled information as permanent context, which I let the AI control the lifecycle of (create, update, delete).

* Observability: The only thing I would use evals for is prompts, but I haven't found a good tool for that yet. I use sentiment analysis for chats the AI deems "interesting" just to see if people are complaining about something.

* Safety and isolation: For reports, I filter out PII before giving data to the AI. For chats, memory checkpointing makes threads ephemeral anyway - and I just add a rate limit + message length limit. The sentiment analysis doesn't include their original messages, only a thematic summary by the AI.

* A war story: I spent weeks trying to fine-tune a prompt for the reporting agent, in which one node was tasked with A) analyzing multiple 30-day ecommerce reports, B) generating findings, C) comparing the findings to existing insights and mutating them, and finally: D) creating short and punchy copy for new insights (title, description). I re-wrote it like 100 times, and every time I ran it it would screw up in a new way or a way that occurred 5 revisions ago. Sometimes it would work perfectly, then the next time it ran it would screw up again, with the same data and temperature set to 0.

This, honestly, is the main problem with modern AI. My solution was to decompose the node into 4 separate ones that each handle a single task - and they still manage to screw it up quite often. It's much better, but not 100% reliable.
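
To make the decomposition concrete, here is a minimal sketch of the "one node, four tasks" agent split into four single-task steps. It is plain Python rather than LangGraph, and every name in it (`ReportState`, `ModelFn`, the step functions, the prompt wording) is a hypothetical stand-in, not the actual app code. The model is passed in as a function so each step can be tested on its own.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an LLM call: prompt in, text out.
ModelFn = Callable[[str], str]


@dataclass
class ReportState:
    """Shared state handed from one step to the next."""
    reports: List[str]
    existing_insights: List[str]
    analysis: str = ""
    findings: str = ""
    updated_insights: str = ""
    insight_copy: str = ""


def analyze_reports(state: ReportState, model: ModelFn) -> ReportState:
    # Step A: analyze the raw 30-day reports.
    state.analysis = model("Analyze these reports:\n" + "\n".join(state.reports))
    return state


def generate_findings(state: ReportState, model: ModelFn) -> ReportState:
    # Step B: turn the analysis into findings.
    state.findings = model("List findings from this analysis:\n" + state.analysis)
    return state


def update_insights(state: ReportState, model: ModelFn) -> ReportState:
    # Step C: compare findings to existing insights and mutate them.
    state.updated_insights = model(
        "Compare the findings to the existing insights and update them.\n"
        "Findings:\n" + state.findings
        + "\nExisting insights:\n" + "\n".join(state.existing_insights)
    )
    return state


def write_copy(state: ReportState, model: ModelFn) -> ReportState:
    # Step D: write a short title and description for the insights.
    state.insight_copy = model(
        "Write a short title and description for:\n" + state.updated_insights
    )
    return state


# Each step does exactly one job; the order mirrors A to D above.
STEPS = [analyze_reports, generate_findings, update_insights, write_copy]


def run_pipeline(state: ReportState, model: ModelFn) -> ReportState:
    for step in STEPS:
        state = step(state, model)
    return state
```

Splitting it this way does not make any single model call deterministic, but it narrows each prompt to one task, so a bad output can be traced to one step and retried or validated there.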
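
For the chat limits mentioned under "Safety and isolation" (a rate limit plus a message length limit), a rough sketch could look like the following. The thresholds and the `ChatGuard` name are assumptions chosen for illustration, and the in-memory sliding window only works for a single process; a real deployment would likely keep the counters in something shared such as Redis.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict

# Assumed limits, chosen only for illustration.
MAX_MESSAGE_CHARS = 2000
MAX_MESSAGES_PER_WINDOW = 10
WINDOW_SECONDS = 60.0


class ChatGuard:
    """Per-user sliding-window rate limit plus a message length cap."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the guard can be tested without sleeping.
        self._clock = clock
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, message: str) -> bool:
        # Reject oversized messages before they count against the rate limit.
        if len(message) > MAX_MESSAGE_CHARS:
            return False

        now = self._clock()
        stamps = self._history[user_id]

        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= WINDOW_SECONDS:
            stamps.popleft()

        if len(stamps) >= MAX_MESSAGES_PER_WINDOW:
            return False

        stamps.append(now)
        return True
```

A chat handler would call `guard.allow(user_id, text)` before invoking the agent and return an error to the user when it yields `False`.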

nivedit-jain•5mo ago
Thanks for sharing this, truly inspiring. A few questions: (1) What do you like the most about LangGraph, and have you tried platforms like AutoGen? (2) Why BullMQ with Node rather than a solution like Temporal? (3) I didn't get your use case for the memory checkpointer: if things can re-run at negligible cost, why do you need it? (4) For sentiment analysis on chats, are you using batch inference? Probably a loop that keeps "interesting" chats ready for review? (5) Is the 30-day analysis happening in parallel or in a sequential loop? Why not use something like Airflow for this?