Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

https://github.com/MinishLab/semble
4•Bibabomas•1h ago
Hey HN! We (Stephan and Thomas) recently open-sourced Semble. We kept running into the same problem while using Claude Code on large codebases: when the agent can't find something directly, it falls back to grep, reading full files or launching subagents. This uses a lot of tokens, and often still misses the relevant code. There are existing tools for this, but they were either too slow to index on demand, needed API keys, or had poor retrieval quality.

Semble is our solution for this. It combines static Model2Vec embeddings (using our latest static model: potion-code-16M) with BM25, fused via RRF and reranked with code-aware signals. Everything runs on CPU since there are no transformers involved. On our benchmark of ~1250 query/document pairs across 63 repos and 19 languages, it uses 98% fewer tokens than grep+read and reaches 99% of the retrieval quality of a 137M-parameter code-trained transformer, while being ~200x faster.
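For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal sketch of the general technique in Python. It is not Semble's actual code: the function name, the file names, and the constant k=60 (a commonly used default for RRF) are all assumptions for illustration. Each input is a ranked list of results, for example one from BM25 and one from embedding similarity, and a document's fused score is the sum of 1 / (k + rank) over the lists it appears in.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    rankings: list of lists, each ordered from best to worst.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute the most.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from two retrievers for the same query.
bm25_ranking = ["auth.py", "session.py", "utils.py"]
embedding_ranking = ["session.py", "tokens.py", "auth.py"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Because RRF only looks at ranks, it does not need the BM25 and embedding scores to be on the same scale, which is one common reason to pick it for hybrid search.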

Main features:

- Token-efficient: 98% fewer tokens than grep+read

- Fast: ~250ms to index a typical repo on our benchmark, ~1.5ms per query on CPU (very large repos may take longer)

- Accurate: 0.854 NDCG@10, 99% of the best transformer setup we tested

- MCP server: drop-in for Claude Code, Cursor, Codex, OpenCode

- Zero config: no API keys, no GPU, no external services
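As a reference for the NDCG@10 figure in the list above, here is a rough sketch of how the metric is usually computed. This is the textbook definition with linear gain, not necessarily the exact variant used in the Semble benchmarks, and the function names are hypothetical. It also assumes every relevant document shows up somewhere in the ranked list, so the ideal ordering can be derived from the same relevance labels.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance labels."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 is undiscounted.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of the returned results, in ranked order (1 = relevant).
perfect = ndcg_at_k([1, 0, 0])
second_place = ndcg_at_k([0, 1])
print(perfect, second_place)
```

A score of 1.0 means the relevant results were ranked first; placing the only relevant result second instead of first drops the score to 1 / log2(3), roughly 0.63.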

Install in Claude Code with: claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

Or check our README for other installation instructions, benchmarks, and methodology:

Semble: https://github.com/MinishLab/semble

Benchmarks: https://github.com/MinishLab/semble/tree/main/benchmarks

Model: https://huggingface.co/minishlab/potion-code-16M

Let us know if you have any feedback or questions!

Show HN: I made a printable graph paper templates website

https://printablegraphpaper.org/
4•atharvtathe•2h ago•5 comments

Show HN: Rocksky – Music scrobbling and discovery on the AT Protocol

https://tangled.org/rocksky.app/rocksky
97•tsiry•1d ago•42 comments

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

https://github.com/cactus-compute/needle
763•HenryNdubuaku•4d ago•210 comments

Show HN: Watch a neural net learn to play Snake

https://ppo.gradexp.xyz/
192•c1b•3d ago•45 comments

Show HN: Forecasting my backyard weather with a 22M time-series model

https://huggingface.co/spaces/bitsofchris/time-series-ai-weather-forecast
3•chrisdevs•2h ago•0 comments

Show HN: Epiq – Distributed Git based issue tracker TUI

https://ljtn.github.io/epiq/
87•jolaflow•1d ago•46 comments

Show HN: Burn, baby, burn (those tokens)

https://github.com/dtnewman/burn-baby-burn
126•dtnewman•1d ago•28 comments

Show HN: Serene Bach – a Go weblog engine that runs as CGI or HTTP

https://github.com/serendipitynz/serenebach
3•takkyun•13h ago•0 comments

Show HN: Gigacatalyst – Extend your SaaS with an embedded AI builder

60•namanyayg•5d ago•27 comments

Show HN: Sx – an open-source package manager for AI skills, MCPs, and commands

https://github.com/sleuth-io/sx
48•detkin•2d ago•26 comments

Show HN: Running the second public ODoH relay

https://numa.rs/blog/posts/odoh-anonymous-dns-without-an-account.html
124•rdme•3d ago•41 comments

Show HN: TikTok but for scientific papers

https://andreaturchet.github.io/website/index.html
196•ciwrl•6d ago•77 comments

Show HN: Nibble

https://github.com/glouw/nibble
101•glouwbug•3d ago•24 comments

Show HN: Built a verifiable, open-source SOC 2 readiness scanner

https://loxeai.com
2•arjavmehta•17h ago•0 comments

Show HN: Browser-based synthesizer, drum machine and sequencer

https://github.com/madmonk13/modal-16
19•madmonk•1d ago•4 comments

Show HN: GridTravel – A community based travel app for users to share routes

https://www.gridtravel.app
59•knuaym9•2d ago•39 comments

Show HN: Agentic interface for mainframes and COBOL

https://www.hypercubic.ai/hopper
97•sai18•5d ago•50 comments

Show HN: Statewright – Visual state machines that make AI agents reliable

https://github.com/statewright/statewright
125•azurewraith•5d ago•55 comments

Show HN: Got ghosted by tech companies so I built a tool to track ghost jobs

https://csvfirst.pythonanywhere.com/insights/hiring-data/job-listings-that-stay-open-for-years/
6•ktmartin•20h ago•3 comments

Show HN: I built a screen recorder that captures console logs, requests and more

https://userplane.io/
2•wizenheimer•21h ago•0 comments

Show HN: Hermes-agentmemory, pull-model episodic memory with real deletes

https://github.com/MukundaKatta/hermes-agentmemory
4•mukundakatta•23h ago•0 comments

Show HN: Strava for AI coding – analytics on your Copilot/Claude/Codex usage

https://github.com/microsoft/AI-Engineering-Coach
8•aymenfurter•1d ago•1 comment

Show HN: Infinite Swap – Trade a bottle cap up to a house

https://infiniteswap.app/
6•dansquizsoft•1d ago•3 comments

Show HN: MIT OSS LinkedIn DMs for Agents (CLI and Example TUI)

https://allman.sh
5•toobulkeh•1d ago•1 comment

Show HN: I made a Clojure-like language in Go, boots in 7ms

https://github.com/nooga/let-go
289•marcingas•1w ago•85 comments

Show HN: TRUST – Coding Rust like it's 1989

https://github.com/wojtczyk/trust
177•wojtczyk•1w ago•87 comments

Show HN: A modern Music Player Daemon based on Rockbox firmware

https://github.com/tsirysndr/rockbox-zig
122•tsiry•1w ago•28 comments

Show HN: Rust but Lisp

https://github.com/ThatXliner/rust-but-lisp
216•thatxliner•1w ago•73 comments

Show HN: An index of indie web/blog indexes

https://theindex.fyi
154•rocketpastsix•1w ago•39 comments