frontpage.


Local LLM compresses long prompts before they reach Claude – MCP server

https://github.com/base76-research-lab/token-compressor
1•base76•1h ago

Comments

base76•1h ago
I built a two-stage prompt compressor that runs entirely locally before your prompt hits any frontier model API.

  How it works:
  1. llama3.2:1b (via Ollama) compresses the prompt to its semantic minimum
  2. nomic-embed-text validates that the compressed version preserves the original meaning (cosine ≥ 0.85)
  3. If validation fails → original is returned unchanged. No silent corruption.
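The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repo's code: `compress` and `embed` are hypothetical callables standing in for the llama3.2:1b and nomic-embed-text calls, and only the 0.85 threshold and the fall-back-to-original behaviour are taken from the post.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Plain cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compress_with_validation(
    prompt: str,
    compress: Callable[[str], str],   # assumption: wraps the local LLM call
    embed: Callable[[str], Vector],   # assumption: wraps the embedding model call
    threshold: float = 0.85,          # threshold quoted in the post
) -> str:
    """Return the compressed prompt only if it stays close to the original."""
    candidate = compress(prompt)
    score = cosine_similarity(embed(prompt), embed(candidate))
    if score >= threshold:
        return candidate
    # Validation failed: hand back the original prompt unchanged.
    return prompt
```

Passing the two model calls in as functions keeps the gate itself testable without a running Ollama instance.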

  When it actually helps:
  The effect is meaningful only on longer inputs. Short prompts are skipped entirely — no cost, no risk.

  ┌─────────────────────────────────┬────────────┬────────┐
  │              Input              │   Tokens   │ Saving │
  ├─────────────────────────────────┼────────────┼────────┤
  │ < 80 tokens                     │ skipped    │ 0%     │
  ├─────────────────────────────────┼────────────┼────────┤
  │ Academic abstract (207t)        │ 207 → 78   │ 62%    │
  ├─────────────────────────────────┼────────────┼────────┤
  │ Structured research doc (1116t) │ 1116 → 275 │ 75%    │
  ├─────────────────────────────────┼────────────┼────────┤
  │ Short command (4t)              │ skipped    │ 0%     │
  └─────────────────────────────────┴────────────┴────────┘

  If you're sending short one-liners, this won't help. If you're injecting long context, research text, or system prompts — it pays off from the first call.
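The skip rule and the savings column can be checked with a few lines of arithmetic. This is a sketch under assumptions: the 80-token cutoff comes from the table, but the function names are made up here, and how the project actually counts tokens is not stated in the post, so the counts are simply passed in.

```python
def should_compress(token_count: int, min_tokens: int = 80) -> bool:
    """Inputs below the cutoff are skipped (the table lists '< 80 tokens' as skipped)."""
    return token_count >= min_tokens


def saving_percent(tokens_before: int, tokens_after: int) -> int:
    """Share of tokens removed, rounded to a whole percent."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    return round(100 * (tokens_before - tokens_after) / tokens_before)


# Rows from the table above.
print(should_compress(4))         # short command: skipped
print(saving_percent(207, 78))    # academic abstract
print(saving_percent(1116, 275))  # structured research doc
```

The two compressed rows work out to 62% and 75%, matching the table.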

  Known limitation:
  Cosine similarity is blind to negation and to reversed meaning in general: "way smaller" vs "way larger" scores 0.985, far above the 0.85 threshold. The LLM stage handles this by explicitly preserving negations and conditionals, but it's an open research question, tracked in issue #1.
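One cheap extra check that could sit next to the cosine gate is a lexical guard that rejects a compression if it drops negation or conditional words. To be clear, this is not what the repo does (the post says the LLM stage is instructed to preserve them); it is a hypothetical sketch, and the marker list is an illustrative assumption.

```python
import re

# Hypothetical marker list, chosen for illustration only.
MARKERS = {"not", "no", "never", "without", "unless", "if", "except", "only"}


def find_markers(text: str) -> set:
    """Collect negation/conditional markers, treating "n't" contractions as "not"."""
    words = re.findall(r"[a-z']+", text.lower())
    found = {w for w in words if w in MARKERS}
    if any(w.endswith("n't") for w in words):
        found.add("not")
    return found


def preserves_markers(original: str, compressed: str) -> bool:
    """True if every marker found in the original also appears in the compressed text."""
    return find_markers(original) <= find_markers(compressed)
```

A guard like this catches a dropped "not", but it would still pass "way smaller" vs "way larger", since neither contains a marker. Swapped antonyms need something stronger than word matching.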

  Install as MCP (Claude Code):
  {
    "mcpServers": {
      "token-compressor": {
        "command": "python3",
        "args": ["/path/to/token-compressor/mcp_server.py"]
      }
    }
  }

  Requires: Ollama + llama3.2:1b + nomic-embed-text

  Repo: https://github.com/base76-research-lab/token-compressor
base76•1h ago
Would love to hear what you think about it.

California Becomes Latest State to Weigh Balcony Solar Legislation

https://www.bloomberg.com/news/newsletters/2026-01-30/california-becomes-latest-state-to-weigh-ba...
1•bilsbie•35s ago•0 comments

Show HN: Audio Toolkit for Agents

https://github.com/shiehn/sas-audio-processor
1•stevehiehn•51s ago•0 comments

Archiving my tweets in my own blog

https://solmaz.io/x/2027708131254387017/
1•hosolmaz•56s ago•0 comments

Show HN: Chromectl – CLI to give an AI agent its own Chrome session

https://github.com/BartlomiejLewandowski/chromectl
1•bartek_gdn•2m ago•0 comments

Cursor built this 5 min 3 round Wordle

https://apps.apple.com/us/app/fastdle/id6739634096
1•triviatroy•3m ago•1 comments

Show HN: SkillMesh (role-based tool routing for Claude/Codex)

https://github.com/varunreddy/SkillMesh
2•VarunReddy023•4m ago•0 comments

Living with Hyperphantasia

https://www.theguardian.com/science/2026/feb/28/living-with-hyperphantasia
1•bookofjoe•4m ago•0 comments

Ask HN: What can people do that intelligent machines will not be able to do?

1•cs702•5m ago•0 comments

Show HN: Delta – A disk space analyzer that tracks where your disk space went

https://github.com/chuunibian/delta
1•zerfallen•5m ago•1 comments

Think of BigConfig Package as 'Helm for Everything'

https://www.bigconfig.it/use-cases/package/
1•amiorin•8m ago•0 comments

The Epstein Files and the Epstein Class

http://colabopad.blogspot.com/2026/03/the-epstein-files-and-epstein-class.html
2•Edmond•10m ago•0 comments

Background Agents

https://background-agents.com/
2•thebuilderjr•10m ago•0 comments

Show HN: Videolyti – Free video downloader with built-in AI transcription

https://videolyti.com/en
1•coder_decoder•11m ago•0 comments

NIST to introduce restrictions on non-US citizens

https://physicstoday.aip.org/news/nist-to-introduce-restrictions-on-non-us-citizens
1•bikenaga•11m ago•0 comments

Ask HN: Vibecoding feels like playing golf, wdyt?

2•julienreszka•13m ago•0 comments

Is Nvidia's post-Rubin roadmap shifting toward inference-first architectures?

https://www.buysellram.com/blog/nvidia-next-gen-feynman-beyond-training-toward-inference-sovereig...
1•jamesbsr•13m ago•1 comments

My Favorite 39C3 Talks

https://asindu.xyz/my-favorite-39c3-talks/
1•max_•16m ago•0 comments

Bolt.gives Introduces Free, Agentic AI Coding Platform

https://github.com/embire2/bolt.gives
2•embire2•16m ago•0 comments

Bad Thing Insurance – Coverage for alien abduction, rogue black holes, and AGI

https://badthing.xyz/
2•rooster666•17m ago•1 comments

Fast-Servers: An Interesting Pattern?

https://geocar.sdf1.org/fast-servers.html
2•signa11•18m ago•0 comments

Reverse engineering "Hello World" in QuickBasic 3.0

https://marnetto.net/2026/03/01/brun-hello-world
2•alberto-m•19m ago•1 comments

Driftwood – friendly AppImage manager for Linux

https://apps.lashman.live/driftwood/
1•bovermyer•20m ago•0 comments

Cielab Color Space

https://en.wikipedia.org/wiki/CIELAB_color_space
1•vinhnx•22m ago•0 comments

Show HN: Belora.ai – Generative AI Platform for Images, Art

https://www.belora.ai
1•tatefinn•23m ago•0 comments

Foods destroying rainforests, in one simple chart

https://www.vox.com/climate/480083/beef-agriculture-deforestation-amazon-rainforest
2•stared•24m ago•0 comments

Show HN: Veracity-Cryptographic data integrity proofs for AI compliance

https://veracity.resethiq.com
1•ResEthiq1•24m ago•0 comments

Show HN: Build a Website for DevOps Learning

https://devopsatlas.com/
1•joshuajebaraj•25m ago•0 comments

Show HN: Colnade – Type-Safe DataFrames for Python

https://github.com/jwde/colnade
1•jwde•26m ago•0 comments

How I approach vibe coding projects to make it not suck

1•bwooceli•27m ago•0 comments

Lil' Fun Langs' Guts

https://taylor.town/scrapscript-001
2•surprisetalk•29m ago•0 comments