frontpage.


Show HN: Lumina – Open-source observability for AI systems (OpenTelemetry-native)

https://github.com/use-lumina/Lumina
1•Evanson•2h ago
Hey HN! I built Lumina – an open-source observability platform for AI/LLM applications. Self-host it in 5 minutes with Docker Compose, all features included.

The Problem:

I've been building LLM apps for the past year, and I kept running into the same issues:

- LLM responses would randomly change after prompt tweaks, breaking things.
- Costs would spike unexpectedly (turns out a bug was hitting GPT-4 instead of 3.5).
- No easy way to compare "before vs after" when testing prompt changes.
- Existing tools were either too expensive or missing features in free tiers.

What I Built:

Lumina is OpenTelemetry-native, meaning:

- Works with your existing OTEL stack (Datadog, Grafana, etc.).
- No vendor lock-in: standard trace format.
- Integrates in 3 lines of code.

Key features:

- Cost & quality monitoring – automatic alerts when costs spike or responses degrade.
- Replay testing – capture production traces, replay them after changes, see diffs.
- Semantic comparison – not just string matching; uses Claude to judge whether responses are "better" or "worse."
- Self-hosted tier – 50k traces/day, 7-day retention, ALL features included (alerts, replay, semantic scoring).

How it works:

```bash
# Start Lumina
git clone https://github.com/use-lumina/Lumina
cd Lumina/infra/docker
docker-compose up -d
```

```typescript
// Add to your app (no API key needed for self-hosted!)
import { Lumina } from '@uselumina/sdk';

const lumina = new Lumina({
  endpoint: 'http://localhost:8080/v1/traces',
});

// Wrap your LLM call
const response = await lumina.traceLLM(
  async () => await openai.chat.completions.create({...}),
  { provider: 'openai', model: 'gpt-4', prompt: '...' }
);
```

That's it. Every LLM call is now tracked with cost, latency, tokens, and quality scores.
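To make the cost-tracking idea concrete, here is a minimal sketch of per-call cost accounting from token counts. The rates below are assumed example prices (USD per 1K tokens), not Lumina's built-in table, and `estimateCost` is an illustrative helper, not part of the SDK:

```typescript
// Estimate the dollar cost of a single LLM call from its token usage.
// Rates are illustrative placeholders -- real pricing varies by
// provider and model, and changes over time.

interface Usage {
  promptTokens: number;
  completionTokens: number;
}

interface Rate {
  promptPer1K: number;     // USD per 1K prompt tokens
  completionPer1K: number; // USD per 1K completion tokens
}

const RATES: Record<string, Rate> = {
  "gpt-4": { promptPer1K: 0.03, completionPer1K: 0.06 },
  "gpt-3.5-turbo": { promptPer1K: 0.0005, completionPer1K: 0.0015 },
};

function estimateCost(usage: Usage, rate: Rate): number {
  return (
    (usage.promptTokens / 1000) * rate.promptPer1K +
    (usage.completionTokens / 1000) * rate.completionPer1K
  );
}
```

With numbers like these, the bug described above (traffic silently routed to GPT-4 instead of 3.5) shows up as calls that are tens of times more expensive, which is exactly the kind of shift a cost alert can catch.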

What makes it different:

1. Free self-hosted with limits that work – 50k traces/day and 7-day retention (the daily cap resets at midnight UTC). All features included: alerts, replay testing, and semantic scoring. Enough for most development and small production workloads. Need more? Upgrade to managed cloud.
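
A daily cap that resets at midnight UTC can be sketched as a small counter keyed by the UTC date. This is illustrative only, not Lumina's actual implementation:

```typescript
// Per-day quota that resets at midnight UTC. Taking `now` as a
// parameter keeps the logic testable; a real service would also
// persist the counter (e.g. in Redis) rather than in memory.

function utcDayKey(now: Date): string {
  // e.g. "2026-01-27" -- changes exactly at midnight UTC
  return now.toISOString().slice(0, 10);
}

class DailyQuota {
  private used = 0;
  private day = "";

  constructor(private readonly limit: number) {}

  /** Returns true if one more trace fits within today's quota. */
  tryConsume(now: Date = new Date()): boolean {
    const key = utcDayKey(now);
    if (key !== this.day) {
      // First trace of a new UTC day: reset the counter.
      this.day = key;
      this.used = 0;
    }
    if (this.used >= this.limit) return false;
    this.used += 1;
    return true;
  }
}
```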

2. OpenTelemetry-native – Not another proprietary format. Use standard OTEL exporters, works with existing infra. Can send traces to both Lumina AND Datadog simultaneously.
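
Sending the same traces to two OTLP backends at once can be done with standard OpenTelemetry JS packages, roughly like this. This is a sketch assuming a recent `@opentelemetry/sdk-node` that accepts a `spanProcessors` array; the second endpoint (port 4318) is the conventional OTLP/HTTP default and should be verified against your own backend:

```typescript
// Fan out spans to Lumina and a second OTLP-compatible backend.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

const sdk = new NodeSDK({
  spanProcessors: [
    // Local self-hosted Lumina collector.
    new BatchSpanProcessor(
      new OTLPTraceExporter({ url: "http://localhost:8080/v1/traces" })
    ),
    // Any other OTLP receiver (e.g. an agent listening on the
    // standard OTLP/HTTP port) gets an identical copy of each span.
    new BatchSpanProcessor(
      new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" })
    ),
  ],
});

sdk.start();
```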

3. Replay testing – The killer feature. Capture 100 production traces, change your prompt, replay them all, and get a semantic diff report. Like snapshot testing for LLMs.
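
The replay loop itself is simple to picture: re-run captured inputs through the changed prompt and report what differs. The sketch below uses exact string comparison for clarity; the names are illustrative, not Lumina's SDK API, and the real tool scores differences semantically rather than literally:

```typescript
// Replay captured production inputs through a candidate LLM function
// and report which responses changed ("snapshot testing for LLMs").

interface CapturedTrace {
  input: string;
  output: string; // response recorded in production
}

type LLMFn = (input: string) => string;

interface DiffReport {
  total: number;
  changed: number;
  diffs: { input: string; before: string; after: string }[];
}

function replay(traces: CapturedTrace[], candidate: LLMFn): DiffReport {
  const diffs: DiffReport["diffs"] = [];
  for (const t of traces) {
    const after = candidate(t.input);
    if (after !== t.output) {
      diffs.push({ input: t.input, before: t.output, after });
    }
  }
  return { total: traces.length, changed: diffs.length, diffs };
}
```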

4. Fast – Built with Bun, Postgres, Redis, NATS. Sub-500ms from trace to alert. Handles 10k+ traces/min on a single machine.

What I'm looking for:

- Feedback on the approach (is OTEL the right foundation?)
- Bug reports (tested on Mac/Linux/WSL2, but I'm sure there are issues)
- Ideas for what features matter most (alerts? replay? cost tracking?)
- Help with the semantic scorer (currently uses Claude, want to make it pluggable)

Why open source:

I want this to be the standard for LLM observability. That only works if it's:

- Free to use and modify (Apache 2.0)
- Easy to self-host (Docker Compose, no cloud dependencies)
- Open to contributions (good first issues tagged)

The business model is managed hosting for teams that don't want to run infrastructure. But the core product is and always will be free.

Try it:

- GitHub: https://github.com/use-lumina/Lumina
- Docs: https://docs.uselumina.io
- Quick start: 5 minutes from `git clone` to dashboard

I'd love to hear what you think! Especially interested in:

- What observability problems you're hitting with LLMs
- Missing features that would make this useful for you
- Any similar tools you're using (and what they do better)

Thanks for reading!

Methods for protecting yourself against an LRAD system – Tech Ingredients (2020) [video]

https://www.youtube.com/watch?v=CXKTBQBugIA
1•goda90•38s ago•0 comments

Forever Overhead – David Foster Wallace

https://welcometotheloonybin.wordpress.com/2008/09/17/forever-overhead/
1•ofalkaed•1m ago•0 comments

MCP Apps

http://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/
1•sanj•3m ago•0 comments

Ask HN: How to avoid skill atrophy in LLM-assisted programming era?

2•py4•4m ago•0 comments

Pretty much 100% of our code is written by Claude Code and Opus 4.5

https://twitter.com/bcherny/status/2015979257038831967
1•sysoleg•4m ago•0 comments

Stanford scientists reveal oldest map of the night sky

https://www.kqed.org/news/12070647/stanford-scientists-reveal-oldest-map-of-the-night-sky-previou...
1•dr_dshiv•6m ago•0 comments

AI and Society: The Three Phases of Technological Adoption

https://ure.us/articles/ai-and-society-the-three-phases-of-technological-adoption/
1•sschotten•6m ago•0 comments

OpenAI Prism

https://openai.com/prism/
1•davidbarker•7m ago•0 comments

Show HN: LemonSlice – Give your voice agents a face

6•lcolucci•8m ago•0 comments

Ag-jail – Sandbox antigravity to avoid persistent/background processes

https://github.com/M-Wham/ag-jail
1•mwham•9m ago•1 comments

Clawdbot is a security nightmare [video]

https://www.youtube.com/watch?v=kSno1-xOjwI
4•carlos-menezes•9m ago•0 comments

Southwest's Open-Seating Era Comes to an End

https://www.wsj.com/lifestyle/travel/my-last-dash-for-open-seats-on-southwest-90aec391
1•JumpCrisscross•10m ago•0 comments

Show HN: AnalysisXYZ – Browser-based CSV/Excel analyzer (privacy focused)

https://www.analysisxyz.dev
1•kushagarwal2907•12m ago•1 comments

Ask HN: How do you manage memory and context across Claude Code sessions?

1•nadis•13m ago•0 comments

Prep Early to Land an Overseas Job

https://relocateme.substack.com/p/how-to-prepare-for-an-overseas-job
1•andrewstetsenko•15m ago•0 comments

The Doomsday Clock is now at 85 seconds to midnight

https://thebulletin.org/doomsday-clock/
3•pbhak•15m ago•0 comments

Show HN: An open-source starter for developing with Postgres and ClickHouse

https://github.com/ClickHouse/postgres-clickhouse-stack
1•saisrirampur•16m ago•0 comments

UPS to cut additional 30,000 jobs in Amazon unwind, turnaround plan

https://www.cnbc.com/2026/01/27/ups-job-cuts-amazon-unwind-turnaround-plan.html
5•belter•17m ago•4 comments

VibeCodingBench: Benchmark Vibe Coding Models for Fun

https://twitter.com/yq_acc/status/2016201908181205358
1•jiayaoqijia•17m ago•1 comments

Former astronaut on lunar spacesuits: "I don't think they're great "

https://arstechnica.com/space/2026/01/former-astronaut-on-lunar-spacesuits-i-dont-think-theyre-gr...
1•rbanffy•18m ago•0 comments

How to Enable ProMotion 120Hz Mode in Safari (Mac, iPhone, and iPad)

https://birchtree.me/blog/how-to-enable-120hz-mode-in-safari-mac-iphone-and-ipad/
1•alwillis•20m ago•0 comments

37signals Isn't Smarter Than You, but They Are Different

https://www.nateberkopec.com/blog/37signals-is-not-smarter-than-you/
1•gaws•20m ago•0 comments

The Peptide Craze, a Surge in Use of Off-Label and Non-FDA Approved Peptides

https://erictopol.substack.com/p/the-peptide-craze
3•ck2•21m ago•1 comments

Bankers at Morgan Stanley are eviscerating Tesla's "robotaxi" performance

https://bsky.app/profile/niedermeyer.online/post/3mdg6hlruzk2o
4•doener•22m ago•0 comments

Will It Rain

https://rainycheck.com/
1•slowinthehead•23m ago•0 comments

Show HN: I built a tool that broke my 15-year doomscrolling habit in one week

https://tolerance.lol
1•wduncan•24m ago•1 comments

Maia 200: The AI accelerator built for inference – The Official Microsoft Blog

https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/
1•rbanffy•25m ago•0 comments

Ask HN: After Brex's $5B exit, are Ramp customers misreading risk?

2•fintecheng•26m ago•0 comments

The GNU C Library is moving from Sourceware

https://lwn.net/Articles/1056206/
2•rascul•28m ago•0 comments

Show HN: Watermark – Browser based image/video watermarking with FFmpeg.wasm

https://watermark.akatski.com
1•a_void_sky•28m ago•0 comments