frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
243•isitcontent•16h ago•27 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
346•vecti•19h ago•153 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
310•eljojo•19h ago•192 comments

Show HN: MCP App to play backgammon with your LLM

https://github.com/sam-mfb/backgammon-mcp
2•sam256•57m ago•1 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
5•sakanakana00•1h ago•1 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•1h ago•0 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
77•phreda4•16h ago•14 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
93•antves•1d ago•70 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
17•denuoweb•2d ago•2 comments

Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

https://www.biotradingarena.com/hn
26•dchu17•21h ago•12 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
49•nwparker•1d ago•11 comments

Show HN: Artifact Keeper – Open-Source Artifactory/Nexus Alternative in Rust

https://github.com/artifact-keeper
152•bsgeraci•1d ago•64 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
2•melvinzammit•4h ago•0 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•4h ago•2 comments

Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode
19•NathanFlurry•1d ago•9 comments

Show HN: Compile-Time Vibe Coding

https://github.com/Michael-JB/vibecode
10•michaelchicory•6h ago•1 comments

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
15•keepamovin•7h ago•5 comments

Show HN: Daily-updated database of malicious browser extensions

https://github.com/toborrm9/malicious_extension_sentry
14•toborrm9•21h ago•7 comments

Show HN: Horizons – OSS agent execution engine

https://github.com/synth-laboratories/Horizons
23•JoshPurtell•1d ago•5 comments

Show HN: Micropolis/SimCity Clone in Emacs Lisp

https://github.com/vkazanov/elcity
172•vkazanov•2d ago•49 comments

Show HN: Falcon's Eye (isometric NetHack) running in the browser via WebAssembly

https://rahuljaguste.github.io/Nethack_Falcons_Eye/
5•rahuljaguste•16h ago•1 comments

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

https://apps.apple.com/us/app/fitspire-5-minute-workout/id6758784938
2•devavinoth12•9h ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
4•ambitious_potat•10h ago•4 comments

Show HN: Local task classifier and dispatcher on RTX 3080

https://github.com/resilientworkflowsentinel/resilient-workflow-sentinel
25•Shubham_Amb•1d ago•2 comments

Show HN: Sem – Semantic diffs and patches for Git

https://ataraxy-labs.github.io/sem/
2•rs545837•11h ago•1 comments

Show HN: A password system with no database, no sync, and nothing to breach

https://bastion-enclave.vercel.app
12•KevinChasse•21h ago•16 comments

Show HN: FastLog: 1.4 GB/s text file analyzer with AVX2 SIMD

https://github.com/AGDNoob/FastLog
5•AGDNoob•12h ago•1 comments

Show HN: GitClaw – An AI assistant that runs in GitHub Actions

https://github.com/SawyerHood/gitclaw
10•sawyerjhood•22h ago•0 comments

Show HN: Gohpts tproxy with arp spoofing and sniffing got a new update

https://github.com/shadowy-pycoder/go-http-proxy-to-socks
2•shadowy-pycoder•13h ago•0 comments

Show HN: Craftplan – I built my wife a production management tool for her bakery

https://github.com/puemos/craftplan
568•deofoo•5d ago•166 comments
Open in hackernews

Show HN: Lumina – Open-source observability for LLM applications

https://github.com/use-lumina/Lumina
6•iggycodexs•1w ago
Hey HN! I built Lumina – an open-source observability platform for AI/LLM applications. Self-host it in 5 minutes with Docker Compose, all features included.

The Problem:

I've been building LLM apps for the past year, and I kept running into the same issues: - LLM responses would randomly change after prompt tweaks, breaking things - Costs would spike unexpectedly (turns out a bug was hitting GPT-4 instead of 3.5) - No easy way to compare "before vs after" when testing prompt changes - Existing tools were either too expensive or missing features in free tiers

What I Built:

Lumina is OpenTelemetry-native, meaning: - Works with your existing OTEL stack (Datadog, Grafana, etc.) - No vendor lock-in – standard trace format - Integrates in 3 lines of code

Key features: - Cost & quality monitoring – Automatic alerts when costs spike or responses degrade - Replay testing – Capture production traces, replay them after changes, see diffs - Semantic comparison – Not just string matching – uses Claude to judge if responses are "better" or "worse" - Self-hosted tier – 50k traces/day, 7-day retention, ALL features included (alerts, replay, semantic scoring)

How it works:

Start Lumina

git clone https://github.com/use-lumina/Lumina cd Lumina/infra/docker docker-compose up -d

// Add to your app (no API key needed for self-hosted!)

import { Lumina } from '@uselumina/sdk';

const lumina = new Lumina({ endpoint: 'http://localhost:8080/v1/traces', });

// Wrap your LLM call const response = await lumina.traceLLM( async () => await openai.chat.completions.create({...}), { provider: 'openai', model: 'gpt-4', prompt: '...' } );

That's it. Every LLM call is now tracked with cost, latency, tokens, and quality scores.

What makes it different:

1. Free self-hosted with limits that work – 50k traces/day and 7-day retention (resets daily at midnight UTC). All features included: alerts, replay testing, semantic scoring. Perfect for most development and small production workloads. Need more? Upgrade to managed cloud.

2. OpenTelemetry-native – Not another proprietary format. Use standard OTEL exporters, works with existing infra. Can send traces to both Lumina AND Datadog simultaneously.

3. Replay testing – The killer feature. Capture 100 production traces, change your prompt, replay them all, get a semantic diff report. Like snapshot testing for LLMs.

4. Fast – Built with Bun, Postgres, Redis, NATS. Sub-500ms from trace to alert. Handles 10k+ traces/min on a single machine.

What I'm looking for:

- Feedback on the approach (is OTEL the right foundation?) - Bug reports (tested on Mac/Linux/WSL2, but I'm sure there are issues) - Ideas for what features matter most (alerts? replay? cost tracking?) - Help with the semantic scorer (currently uses Claude, want to make it pluggable)

Why open source:

I want this to be the standard for LLM observability. That only works if it's: - Free to use and modify (Apache 2.0) - Easy to self-host (Docker Compose, no cloud dependencies) - Open to contributions (good first issues tagged)

The business model is managed hosting for teams who don't want to run infrastructure. But the core product is and always will be free.

Try it: - GitHub: https://github.com/use-lumina/Lumina - Demo video: [YouTube link] - Docs: https://docs.uselumina.io - Quick start: 5 minutes from `git clone` to dashboard

I'd love to hear what you think! Especially interested in: - What observability problems you're hitting with LLMs - Missing features that would make this useful for you - Any similar tools you're using (and what they do better)

Thanks for reading!

Comments

kxbnb•1w ago
Nice execution on the replay testing with semantic diff - that's a pain point that's hard to solve with just metrics.

One thing I've noticed building toran.sh (HTTP-level observability for agents): there's a gap between "what the agent decided to do" (your trace level) and "what actually went over the wire" (raw requests/responses). Especially with retries, timeouts, and provider failovers - the trace might show success but the HTTP layer tells a different story.

Do you capture the underlying HTTP calls, or is it primarily at the SDK/trace level? Asking because debugging often ends up needing both views.

Evanson•1w ago
Thanks, and great point. Right now, Lumina is mainly SDK/trace-level (what the app thinks happened: tokens, cost, latency, outputs), so you’re right that low-level HTTP details like retries/timeouts/failovers can be partially hidden. Capturing the raw HTTP layer alongside traces is on our roadmap because production debugging often needs both views. Also, your “see what your agent is actually doing” angle is spot-on. There’s a lot of opaque magic in agent frameworks. Curious how you’re doing it in toran.sh proxy/intercept, or wrapping the SDK HTTP client?