Show HN: C/C++ source code graph RAG based on Clang/clangd

https://github.com/2015xli/clangd-graph-rag

3•artigent•1mo ago

Graph RAG for C/C++ Development

1. Overview

This project supports deep code analysis with Large Language Models. It builds a Neo4j-based Graph RAG that lets developers and AI agents run complex, multi-layered queries on C/C++ codebases that traditional search tools can't handle. With only 4 MCP APIs and a vanilla agent, it can already handle many codebase-related tasks.

2. How it works

Using clangd and clang, the system parses and indexes your source files to create a high-fidelity code graph. It captures everything from high-level folder structure to granular relationships: entities such as Folders, Files, Namespaces, Classes/Structs, Variables, and Methods, and relationships such as CALLS, INCLUDES, INHERITS, and OVERRIDES, among others.
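As a rough illustration of what a relationship like CALLS makes possible, here is a minimal in-memory sketch in Python. It is not the project's code and does not touch Neo4j: the edge list, the function names, and the helpers `neighbors` and `call_path` are all hypothetical. Only the relationship names come from the description above.

```python
from collections import deque

# Hypothetical edge list: (source, RELATIONSHIP, target). The relationship
# names come from the post; the function and file names are invented.
EDGES = [
    ("main.c", "INCLUDES", "net.h"),
    ("main", "CALLS", "init_net"),
    ("init_net", "CALLS", "open_socket"),
    ("open_socket", "CALLS", "log_error"),
    ("main", "CALLS", "log_error"),
]

def neighbors(node, rel):
    """Return targets reachable from `node` over one edge of type `rel`."""
    return [dst for src, r, dst in EDGES if src == node and r == rel]

def call_path(start, goal):
    """Breadth-first search over CALLS edges; returns a shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1], "CALLS"):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(call_path("main", "open_socket"))
```

In a real graph database the same question would be a path query over the stored edges; the point here is only that typed edges turn "how does A reach B?" into a graph traversal rather than a text search.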

The system generates summaries and embeddings for every level of the codebase (from functions up to entire folders) using a bottom-up approach. This structured context helps AI agents understand the "big picture" without getting lost in the syntax.
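The bottom-up idea can be sketched as a post-order walk: summarize the leaves first, then build each parent's summary from its children's summaries. The Python below is a toy under stated assumptions, not the project's implementation. The tree layout, `fake_summarize` (a stand-in for an LLM call), and `summarize` are hypothetical names, and embeddings are left out entirely.

```python
# Hypothetical folder tree: a dict is a folder or file, a string is a function body.
TREE = {
    "net": {
        "socket.c": {"open_socket": "int open_socket() { ... }"},
    },
    "main.c": {"main": "int main() { ... }"},
}

def fake_summarize(name, parts):
    """Stand-in for an LLM call: just records what was combined."""
    return f"{name}({', '.join(parts)})"

def summarize(name, node, out):
    """Post-order walk: summarize children first, then the parent from their summaries."""
    if isinstance(node, str):
        summary = fake_summarize(name, ["code"])
    else:
        child_summaries = [summarize(k, v, out) for k, v in node.items()]
        summary = fake_summarize(name, child_summaries)
    out[name] = summary
    return summary

summaries = {}
summarize("repo", TREE, summaries)
for name, text in summaries.items():
    print(name, "->", text)
```

Because a parent only sees its children's summaries, never their raw source, the input to each summarization step stays small even for a very large folder.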

To get you started, the project includes an example MCP (Model Context Protocol) server and a demonstration AI agent that showcase what the graph can do. You can build your own custom agents and servers on top of the Graph RAG.

3. Efficiency & Performance

Incremental Updates: The system detects changes between commits and updates only what’s necessary.

Parallel Processing: Parsing and summary generation are distributed across worker processes with optimized data sharing.

Smart Caching: Results are cached to minimize redundant computations, saving you both time and LLM costs.
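One common way to get this kind of saving is to key cached results by a hash of the input and to diff two snapshots to find what changed. The Python sketch below shows that general technique only. It is an assumption about how such a cache could look, not a description of the project's actual mechanism, and `SummaryCache`, `digest`, and `changed_files` are hypothetical names. Deleted files are ignored for brevity.

```python
import hashlib

def digest(text):
    """Content hash used as the cache key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class SummaryCache:
    def __init__(self, summarize):
        self.summarize = summarize  # expensive call, e.g. an LLM request
        self.store = {}             # digest -> summary
        self.misses = 0             # how many times the expensive call ran

    def get(self, text):
        key = digest(text)
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.summarize(text)
        return self.store[key]

def changed_files(old, new):
    """Names in `new` that were added or whose content differs from `old`."""
    return sorted(n for n, t in new.items() if old.get(n) != t)

cache = SummaryCache(lambda text: text.upper())  # toy summarizer
cache.get("int main() {}")
cache.get("int main() {}")  # second call is served from the cache
print(cache.misses)
```

With content-addressed keys, unchanged functions cost nothing on a re-run, so the expensive calls scale with the size of the change rather than the size of the codebase.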

4. A benchmark: The Linux Kernel

Building a code graph for the Linux kernel (WSL2 release) on a workstation (12 cores, 64GB RAM) takes about 4 hours using 10 parallel worker processes, with peak memory usage of about 36GB. Note that this figure does not include summary generation, and the total time may vary based on your LLM provider.

Comments

artigent•1mo ago

Just a quick note: This is an independent project and is not affiliated with the official Clang or clangd projects.

This project is by no means a replacement for the clangd language server used in IDEs. Instead, it is designed to complement it by enabling LLMs to perform deep architectural analysis. While clangd handles real-time coding assistance, this tool focuses on high-level reasoning, such as mapping project workflows, tracing complex call paths, and understanding system-wide architecture.
