We seem to have turned "grep with semantics" into a microservices architecture problem.
What this is: local_faiss_mcp is a minimal Model Context Protocol (MCP) server that wraps FAISS and sentence-transformers. It runs entirely locally (no API keys, no external services) and connects to Claude Desktop via stdio.
How it works:
You run server.py (Claude Desktop launches it automatically via its config; see the config sketch after this list).
It uses all-MiniLM-L6-v2 (on CPU) to embed text.
It stores the vectors in a flat FAISS index on disk alongside a JSON metadata file.
It exposes two tools to the LLM: ingest_document and query_rag_store (rough sketch below).
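To make that concrete, here’s a minimal sketch of the shape of the server, using the SDK’s FastMCP helper. The tool names match the ones above, but the file paths, metadata layout, and function bodies are simplified stand-ins rather than code lifted from the repo:

```python
import json
from pathlib import Path

import faiss
from mcp.server.fastmcp import FastMCP
from sentence_transformers import SentenceTransformer

# NOTE: illustrative paths/schema; the repo's actual layout may differ.
INDEX_PATH = Path("rag.index")     # flat FAISS index on disk
META_PATH = Path("rag_meta.json")  # JSON sidecar mapping vector id -> text

model = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")
DIM = model.get_sentence_embedding_dimension()  # 384 for MiniLM-L6-v2

# Load the persisted index/metadata, or start fresh
index = faiss.read_index(str(INDEX_PATH)) if INDEX_PATH.exists() else faiss.IndexFlatL2(DIM)
meta = json.loads(META_PATH.read_text()) if META_PATH.exists() else []

mcp = FastMCP("local_faiss")

@mcp.tool()
def ingest_document(text: str) -> str:
    """Embed a document and persist it to the local FAISS store."""
    index.add(model.encode([text]))            # encode() returns float32, shape (1, DIM)
    meta.append(text)
    faiss.write_index(index, str(INDEX_PATH))  # flush index and metadata to disk
    META_PATH.write_text(json.dumps(meta))
    return f"stored document #{index.ntotal - 1}"

@mcp.tool()
def query_rag_store(query: str, k: int = 3) -> list[str]:
    """Return the k nearest stored documents for a query."""
    if index.ntotal == 0:
        return []
    _, ids = index.search(model.encode([query]), min(k, index.ntotal))
    return [meta[i] for i in ids[0] if i != -1]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which is what Claude Desktop speaks
```

The flat index is an exact brute-force scan, which is the whole point: no IVF/HNSW tuning, no extra dependencies, at the cost of O(n) search per query.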
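Hooking it into Claude Desktop is then just one entry in claude_desktop_config.json, something like this (server name and path are placeholders):

```json
{
  "mcpServers": {
    "local_faiss": {
      "command": "python",
      "args": ["/path/to/local_faiss_mcp/server.py"]
    }
  }
}
```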
The stack:
Python
mcp (Python SDK)
faiss-cpu
sentence-transformers
It’s intended for personal workflows (notes, logs, specs) where you want persistent memory for an agent without the infrastructure overhead.
Repo: https://github.com/nonatofabio/local_faiss_mcp
I’d love feedback on the implementation: specifically, ideas for handling the chunking logic better without bloating the dependencies, and whether anyone runs into performance issues with larger indices (10k+ vectors).
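For context on the chunking question, the kind of dependency-free approach I have in mind is roughly this (a sketch, not necessarily what ingest_document does today):

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 200) -> list[str]:
    """Hypothetical helper: naive fixed-window chunker with overlap."""
    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back off to the last space so we don't cut mid-word
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        chunks.append(text[start:end].strip())
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)  # keep some overlap between chunks
    return [c for c in chunks if c]
```

Each chunk would then be embedded and ingested as its own vector.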