frontpage.

The antiderivative of 1/x is ln(x), not ln|x|

https://blog.danielh.cc/blog/ln
1•max__dev•2m ago•0 comments

Celebrities relaunch a McCarthy-era committee to defend free speech

https://www.npr.org/2025/10/01/nx-s1-5559223/committee-for-the-first-amendment-jane-fonda-billie-...
1•westurner•4m ago•1 comments

Google's Gemini-powered smart home revamp is here with a new app and cameras

https://arstechnica.com/google/2025/10/googles-gemini-powered-smart-home-revamp-is-here-with-a-ne...
1•canucker2016•4m ago•0 comments

Pager Code Encoder Implementing 90's Look Alike Numeric Cipher

https://fcsuper.blogspot.com/2025/10/numeric-pager-code-encoder-tool-look.html
1•fcsuper•9m ago•1 comments

He Grew Obsessed with an AI Chatbot. Then He Vanished in the Ozarks

https://www.rollingstone.com/culture/culture-features/ai-chatbot-disappearance-jon-ganz-1235438552/
1•CommieBobDole•10m ago•0 comments

AéPiot: A Comprehensive Analysis of the Semantic Web Infrastructure Platform

https://better-experience.blogspot.com/2025/10/aepiot-comprehensive-analysis-of.html
1•aePiots•11m ago•1 comments

I/Q Data for Dummies

http://whiteboard.ping.se/SDR/IQ
1•Betty_rs•11m ago•0 comments

Ecce Homo (García Martínez and Giménez)

https://en.wikipedia.org/wiki/Ecce_Homo_(García_Martínez_and_Giménez)
2•surprisetalk•13m ago•0 comments

M5 iPad Pro scores 4133, matches M4 Max, beats every single-core PC chip score

https://www.tomshardware.com/tech-industry/m5-powered-ipad-pro-breaks-cover-in-geekbench-scoring-...
1•walterbell•14m ago•0 comments

A Thermometer for Measuring Quantumness

https://www.quantamagazine.org/a-thermometer-for-measuring-quantumness-20251001/
2•pykello•16m ago•0 comments

Linus Torvalds Lashes Out at RISC-V Big Endian Plans

https://www.phoronix.com/news/Torvalds-No-RISC-V-BE
3•signa11•24m ago•1 comments

YouTube to NotebookLM

https://help.gsctool.com/features/free-tools/youtube-to-notebooklm
1•trungpv1601•25m ago•0 comments

ADSB but 3D Live and Interactive

https://objectiveunclear.com/airloom.html
1•benlimner•28m ago•0 comments

Unions?

1•Glibly•33m ago•2 comments

Sally Mann – A deep connection with the American South

https://www.youtube.com/watch?v=b0cE2a62Jgo
1•fallinditch•33m ago•0 comments

Ask HN: Who's building AI for U.S. customs classification?

2•aswfaswf•34m ago•0 comments

Sophist

https://en.wikipedia.org/wiki/Sophist
4•aaavl2821•42m ago•0 comments

Brother, I am troubled [video]

https://www.youtube.com/watch?v=yA5lujNlkn8
1•keepamovin•46m ago•1 comments

Show HN: Notestorm – a privacy-first AI scratchpad I made for quick idea dumps

https://notestorm.wastu.net/
1•wastu•51m ago•0 comments

The Devastating Decline of a Brilliant Young Coder (2020)

https://www.wired.com/story/lee-holloway-devastating-decline-brilliant-young-coder/
4•measurablefunc•51m ago•0 comments

Gene Ray (Time Cube Guy)

https://web.archive.org/web/20160103165000/http://www.timecube.com/timecube2.html
1•EasyJapaneseBoy•1h ago•2 comments

Show HN: Agent Message Transfer Protocol

https://amtp-protocol.org/
1•wang_cong•1h ago•0 comments

iPhone 17 Pro Camera Review: Rule of Three

https://www.lux.camera/iphone-17-pro-camera-review-rule-of-three/
1•ValentineC•1h ago•0 comments

Companies Should Prioritize Culture over Obsession with AI Tools

https://newsletter.eng-leadership.com/p/companies-should-stop-obsessing-over
2•birdculture•1h ago•0 comments

Dynamic Denial of Crawlers

https://overengineer.dev/blog/2025/07/11/dynamic-denial-of-crawlers/
2•mooreds•1h ago•0 comments

Verify Identities During Self-Service Registration

https://fusionauth.io/blog/identity-verification-before-registration
1•mooreds•1h ago•0 comments

Does Your Backyard Need a Stegosaurus?

https://www.nytimes.com/2025/10/01/nyregion/new-jersey-dinosaur-sale.html
1•mooreds•1h ago•0 comments

The Fatima Sun Miracle: More Than You Wanted to Know

https://www.astralcodexten.com/p/the-fatima-sun-miracle-much-more
2•paulpauper•1h ago•0 comments

Network State, or a Network of States?

https://www.noahpinion.blog/p/network-state-or-a-network-of-states
2•paulpauper•1h ago•0 comments

Alcohol in Early America

https://everything-everywhere.com/alcohol-in-early-america/
3•surprisetalk•1h ago•1 comments

An Open-Source Framework for Building Stable and Reliable LLM-Powered Systems

https://chatbot-testing-framework.readthedocs.io/en/latest/
2•alexostrovskyy•1h ago

Comments

alexostrovskyy•1h ago
I think many of us have felt the pain of building a cool LLM-powered application or RAG pipeline, only to find it's too brittle and unpredictable for real-world use. The core problem is that they are black boxes. When they fail, it's hard to know why.

I've been focused on this problem of "productionizing" AI workflows. It's not just about testing; it's about deep observability, performance tuning, and building systems you can trust to be stable.

I wrote up a guide on a methodology I've found very effective. It's based on an open-source framework that uses decorators to trace the entire execution path of a chatbot. This gives you the data to:

- Pinpoint Performance Bottlenecks: See the exact latency of every LLM call, tool use, and retrieval step.
- Automate Quality Control: Use an LLM-as-a-judge to programmatically check for hallucinations (groundedness), safety violations, and adherence to custom rules.
- Create a Feedback Loop for Improvement: When you change a prompt or logic, you can run the test suite and get a concrete report on whether performance and reliability have improved or worsened.
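
As a rough illustration of the decorator-based tracing idea, here is a minimal Python sketch. It is not the framework's actual API: `traced`, `Span`, `Trace`, and the toy `retrieve`/`generate`/`answer` functions are all hypothetical names, and the `time.sleep` calls stand in for real retrieval and LLM latency. It only shows how wrapping each step can record per-step latency and errors.

```python
import functools
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One recorded step: which function ran, what kind of step, how long."""
    name: str
    kind: str
    latency_ms: float
    error: Optional[str] = None


@dataclass
class Trace:
    """Collects spans in the order the traced functions finish."""
    spans: list = field(default_factory=list)

    def slowest(self) -> Optional[Span]:
        return max(self.spans, key=lambda s: s.latency_ms, default=None)


TRACE = Trace()  # hypothetical module-level collector, for illustration only


def traced(kind: str, trace: Trace = TRACE):
    """Decorator factory: time the wrapped call and record it as a span."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            error = None
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                error = repr(exc)
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                trace.spans.append(Span(fn.__name__, kind, elapsed_ms, error))
        return wrapper
    return decorator


@traced("retrieval")
def retrieve(query: str) -> list:
    time.sleep(0.01)  # stand-in for a vector-store lookup
    return ["doc about " + query]


@traced("llm")
def generate(query: str, docs: list) -> str:
    time.sleep(0.03)  # stand-in for an LLM call
    return f"Answer to {query!r} using {len(docs)} document(s)"


@traced("pipeline")
def answer(query: str) -> str:
    return generate(query, retrieve(query))


if __name__ == "__main__":
    print(answer("refund policy"))
    for span in TRACE.spans:
        print(f"{span.kind:<10} {span.name:<10} {span.latency_ms:7.1f} ms")
```

Because the outer `pipeline` span contains the inner steps, comparing it against the `retrieval` and `llm` spans shows where the time goes. The same recorded spans could be stored per test run and compared before and after a prompt change.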

You can read the guide here:

- LangChain-based application: https://alexostrovskyy.com/the-glass-box-why-your-chatbot-ne...
- LlamaIndex-based application: https://alexostrovskyy.com/production-llm-chatbot-tracing-an...

I created this open-source project for my own work and to help other creators. My goal is a framework that helps us build stable, trustworthy AI systems, not just clever demos.

I'd be very interested to hear feedback from other engineers and creators.