Show HN: A Karpathy-style LLM wiki your agents maintain (Markdown and Git)

https://github.com/nex-crm/wuphf
61•najmuzzaman•1h ago
I shipped a wiki layer for AI agents that uses markdown + git as the source of truth, with a bleve (BM25) + SQLite index on top. No vector or graph db yet.

It runs locally in ~/.wuphf/wiki/ and you can git clone it out if you want to take your knowledge with you.

The shape is the one Karpathy has been circling for a while: an LLM-native knowledge substrate that agents both read from and write into, so context compounds across sessions rather than getting re-pasted every morning. Most implementations of that idea land on Postgres, pgvector, Neo4j, Kafka, and a dashboard.

I wanted to go back to the basics and see how far markdown + git could go before I added anything heavier.

What it does:

-> Each agent gets a private notebook at agents/{slug}/notebook/.md, plus access to a shared team wiki at team/.

-> Draft-to-wiki promotion flow. Notebook entries are reviewed (agent or human) and promoted to the canonical wiki with a back-link. A small state machine drives expiry and auto-archive.

-> Per-entity fact log: append-only JSONL at team/entities/{kind}-{slug}.facts.jsonl. A synthesis worker rebuilds the entity brief every N facts. Commits land under a distinct "Pam the Archivist" git identity so provenance is visible in git log.

-> [[Wikilinks]] with broken-link detection rendered in red.

-> Daily lint cron for contradictions, stale entries, and broken wikilinks.

-> /lookup slash command plus an MCP tool for cited retrieval. A heuristic classifier routes short lookups to BM25 and narrative queries to a cited-answer loop.
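To make the per-entity fact log concrete, here is a minimal sketch of an append-only JSONL log with a "rebuild every N facts" trigger. Only the path layout comes from the post. The function names, the value of `REBUILD_EVERY`, and the shape of a fact record are assumptions for illustration, not WUPHF's actual code.

```python
import json
from pathlib import Path

REBUILD_EVERY = 5  # hypothetical value for "N"; the post does not state it


def fact_log_path(root: Path, kind: str, slug: str) -> Path:
    # Layout taken from the post: team/entities/{kind}-{slug}.facts.jsonl
    return root / "team" / "entities" / f"{kind}-{slug}.facts.jsonl"


def append_fact(root: Path, kind: str, slug: str, fact: dict) -> bool:
    """Append one fact as a JSON line; return True when a brief rebuild is due."""
    path = fact_log_path(root, kind, slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only adds to the end of the file, so earlier lines are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(fact, sort_keys=True) + "\n")
    # Recount the log after the write (fine for a sketch, O(n) per append).
    with path.open(encoding="utf-8") as fh:
        count = sum(1 for line in fh if line.strip())
    return count % REBUILD_EVERY == 0


def load_facts(root: Path, kind: str, slug: str) -> list:
    """Read every fact back in the order it was appended."""
    path = fact_log_path(root, kind, slug)
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

With `REBUILD_EVERY = 5`, the fifth call to `append_fact` for an entity returns `True`, which is where a synthesis worker would regenerate the brief and commit it. Because the log is one JSON object per line, `git diff` shows each new fact as a single added line.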

Substrate choices: Markdown for durability. The wiki outlives the runtime, and a user can walk away with every byte. Bleve for BM25. SQLite for structured metadata (facts, entities, edges, redirects, and supersedes). No vectors yet. The current benchmark (500 artifacts, 50 queries) clears 85% recall@20 on BM25 alone, which is the internal ship gate. sqlite-vec is the pre-committed fallback if a query class drops below that.
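For readers unfamiliar with the metric, recall@20 can be computed roughly as below. This is a sketch under assumptions: it treats recall@k as the fraction of a query's relevant artifacts that appear in the top k results, averaged over queries. The post does not say exactly how the benchmark aggregates, and the function names are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the relevant artifacts that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("a query needs at least one relevant artifact")
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k=20):
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores)


def passes_ship_gate(runs, k=20, gate=0.85):
    """True when mean recall@k meets the gate (0.85 mirrors the 85% in the post)."""
    return mean_recall_at_k(runs, k) >= gate
```

Running the same check per query class, rather than over all queries at once, is one way to notice when a single class drops below the gate.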

Canonical IDs are first-class. Fact IDs are deterministic and include sentence offset. Canonical slugs are assigned once, merged via redirect stubs, and never renamed. A rebuild is logically identical, not byte-identical.
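One way to get deterministic fact IDs and never-renamed slugs is sketched below. The hashing scheme, field choice, and function names are assumptions for illustration. The post only states that fact IDs are deterministic and include the sentence offset, and that slugs are merged via redirect stubs.

```python
import hashlib


def fact_id(entity_slug: str, source_path: str, sentence_offset: int, text: str) -> str:
    """Derive a stable ID from a fact's content and position (hypothetical scheme)."""
    # Collapse whitespace and case so cosmetic edits do not change the ID.
    normalized = " ".join(text.split()).lower()
    # Join fields with a separator that cannot appear in ordinary text.
    payload = "\x1f".join([entity_slug, source_path, str(sentence_offset), normalized])
    return "fact-" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def resolve_slug(slug: str, redirects: dict) -> str:
    """Follow redirect stubs until reaching a slug that has no redirect."""
    seen = {slug}
    while slug in redirects:
        slug = redirects[slug]
        if slug in seen:
            raise ValueError(f"redirect cycle at {slug!r}")
        seen.add(slug)
    return slug
```

Because the ID depends only on the inputs, rebuilding the index from the same markdown yields the same IDs, even if file timestamps or row order differ. That is consistent with a rebuild being logically identical without being byte-identical. Merging `acme-inc` into `acme` becomes a single entry in `redirects`, and old links keep resolving.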

Known limits:

-> Recall tuning is ongoing. 85% on the benchmark is not a universal guarantee.

-> Synthesis quality is bounded by agent observation quality. Garbage facts in, garbage briefs out. The lint pass helps. It is not a judgment engine.

-> Single-office scope today. No cross-office federation.

Demo. 5-minute terminal walkthrough that records five facts, fires synthesis, shells out to the user's LLM CLI, and commits the result under Pam's identity: https://asciinema.org/a/vUvjJsB5vtUQQ4Eb

Script lives at ./scripts/demo-entity-synthesis.sh.

Context. The wiki ships as part of WUPHF, an open source collaborative office for AI agents like Claude Code, Codex, OpenClaw, and local LLMs via OpenCode. MIT, self-hosted, bring-your-own keys. You do not have to use the full office to use the wiki layer. If you already have an agent setup, point WUPHF at it and the wiki attaches.

Source: https://github.com/nex-crm/wuphf

Install: npx wuphf@latest

Happy to go deep on the substrate tradeoffs, the promotion-flow state machine, the BM25-first retrieval bet, or the canonical-ID stability rules. Also happy to take "why not an Obsidian vault with a plugin" as a fair question.

Comments

dhruv3006•1h ago
I love that so many people are building with markdown!

But I would also like to understand how markdown helps with durability. If I understand correctly, markdown has an edge over other formats for LLMs.

I am also building something similar on markdown, versioned with git, but for a completely different use case: https://voiden.md/

left-struck•38m ago
I read the durability point as: markdown files are open, simple, widely used, and easy to find software for. All of this together almost guarantees that they will be viewable and usable in the far future.
dhruv3006•12m ago
So markdown will be great for distribution in the future.
goodra7174•1h ago
I was looking for something similar to try out. Cool!
davedigerati•1h ago
why not an Obsidian vault with a plugin?
davedigerati•1h ago
srsly tho this looks slick & love the office refs / will go play with it :)
tomtomistaken•57m ago
what plugin are you using?
mellosouls•1h ago
Karpathy's original post for context:

https://x.com/karpathy/status/2039805659525644595

https://xcancel.com/karpathy/status/2039805659525644595

hyperionultra•1h ago
[flagged]
spiderfarmer•1h ago
Probably just envy.
wiseowise•1h ago
Obviously it is envy, and not scepticism over a guy who practically lives on Twitter and has an unhinged[1] follower base.

[1] https://x.com/__endif/status/2039810651120705569

William_BB•1h ago
I have the same feeling ever since his infamous LLM OS post
mirekrusin•1h ago
Feels like disliking a musician for fanaticism towards musical instruments.
jimmypk•1h ago
The BM25-first routing bet is interesting. You mention 85% recall@20 on 500 artifacts, but the heuristic classifier routing "short lookups to BM25 and narrative queries to cited-answer" raises a practical question: what does the classifier key on to decide a query is narrative vs short? Token count? Syntactic structure? The reason I ask is that in agent-generated queries, the boundary is often blurry - an agent doing a dependency lookup might issue a surprisingly long, well-formed sentence. If the classifier routes those to the more expensive cited-answer loop it could negate the latency advantage of BM25 being first.
Unsponsoredio•1h ago
love the bm25-first call over vector dbs. most teams jump to vectors before measuring anything
armcat•37m ago
Any particular reason for BM25? Why not just a table of contents or index structure (json, md, whatever) that is updated automatically and fed in context at query time? I know bag of words is great for speed but even at 1000s of documents, the index can be quite cheap and will maximise precision
imafish•36m ago
Cool idea. But is anyone actually building real stuff like this with any kind of high quality?

Every time I hear someone say "I have a team of agents", what I hear is "I'm shipping heaps of AI slop".

hansmayer•9m ago
+100 for this comment.
portly•34m ago
I don't understand the point of automating note taking. It never worked for me to copy paste text into my notes and now you can 100x that?

The whole point of taking notes for me is to read a source critically, fit it in my mental model, and then document that. Then sometimes I look it up for the details. But for me the shaping of the mental model is what counts

souravroy78•33m ago
Don’t know if Karpathy even wrote this version. Where are the citations?
batoga•29m ago
Put AI in your product name, make a billion dollars. Put Karpathy in your blog article, get hired by Anthropic as a principal engineer. Milk the money as long as the fad lasts. No one is thinking about customer needs; everyone is trying to wash their hands in the wave while it lasts.
vlady_nyz•25m ago
Need to try this out ASAP. Love the "The Office" vibe.
dataviz1000•19m ago
LLM models and the agents that use them are probabilistic, not deterministic. They accomplish something a percentage of the time, never every time.

That means the longer an agent runs on a task, the more likely it will fail the task. Running agents like this will always fail and burn a ton of token cash in the process.

One thing that LLM agents are good at is writing their own instructions. The trick is to limit the time and thinking steps in a thinking model then evaluate, update, and run again. A good metaphor is that agents trip. Don't let them run long enough to trip. It is better to let them run twice for 5 minutes than once for 10 minutes.

Give it a few weeks and self-referencing agents are going to be at the top of everybody's twitter feed.

hansmayer•8m ago
Couldn't you instruct your LLM to make the starting dir configurable?
