1) The LLM-powered pipeline that extracts citations (books + authors) from books and resolves them against offline copies of Wikipedia and Goodreads that I have. The result is data linking books/authors to other books/authors, with accurate bibliographic information spanning centuries.
2) A WebGPU + D3.js powered visualization tool written by Claude Code, so I can handle all this data in the browser while keeping the experience more or less comfortable for the viewer.
I spent some months on and off with this project, and the most challenging part was definitely getting accurate bibliographic information across centuries, including original publication dates. For that I wrote what is now a very complex LLM pipeline (I used DeepSeek V3.2) wired to offline Goodreads and Wikipedia databases, plus a fallback that actually uses the internet.
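To give a rough idea of the "offline first, internet as fallback" part, here is a minimal Python sketch of a resolver chain. It is not the code from the repo: `BookRecord`, `make_offline_resolver`, `web_fallback` and `resolve_citation` are hypothetical names, the dictionaries are toy stand-ins for the offline databases, and the order Goodreads, then Wikipedia, then web is just an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class BookRecord:
    title: str
    author: str
    first_published: Optional[int]  # original publication year, if known
    source: str                     # which resolver produced the record


# A resolver takes (title, author) and returns a record or None.
Resolver = Callable[[str, str], Optional[BookRecord]]


def _key(title: str, author: str) -> Tuple[str, str]:
    # Case-insensitive, whitespace-trimmed lookup key.
    return (title.strip().casefold(), author.strip().casefold())


def make_offline_resolver(name: str, table: Dict[Tuple[str, str], int]) -> Resolver:
    """Build a resolver backed by an in-memory table (toy stand-in for an offline DB)."""
    def resolve(title: str, author: str) -> Optional[BookRecord]:
        year = table.get(_key(title, author))
        if year is None:
            return None
        return BookRecord(title, author, year, name)
    return resolve


def web_fallback(title: str, author: str) -> Optional[BookRecord]:
    # Hypothetical placeholder: a real fallback would query the internet here.
    return None


def resolve_citation(title: str, author: str, resolvers: List[Resolver]) -> Optional[BookRecord]:
    """Try each resolver in order; return the first record with a known year."""
    for resolve in resolvers:
        record = resolve(title, author)
        if record is not None and record.first_published is not None:
            return record
    return None


# Toy data only, standing in for the offline copies.
goodreads = make_offline_resolver("goodreads", {_key("Moby-Dick", "Herman Melville"): 1851})
wikipedia = make_offline_resolver("wikipedia", {_key("Walden", "Henry David Thoreau"): 1854})
chain = [goodreads, wikipedia, web_fallback]

hit = resolve_citation("walden ", "Henry David Thoreau", chain)
miss = resolve_citation("Some Unknown Book", "Nobody", chain)
```

In this sketch `hit` comes from the second resolver because the first one has no entry, and `miss` is `None` because nothing in the chain knows the book. The real pipeline also has an LLM in the loop to decide whether a candidate actually matches the citation, which this sketch leaves out.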
Hope you enjoy it! Open to suggestions on how to improve the system :)
Code is here: https://github.com/ThiagoLira/bookgraph-revisited