I'm a Literature major and avid reader, but projects like this are still incredibly exciting to me. I salivate at the thought of new kinds of literary analysis that AI is going to open up.
But this thing isn't (so far as I can tell) even slightly proposing that we feed books into an LLM instead of reading them. It looks to me more like a discovery mechanism: you run this thing, it shows you some possible links between books, and maybe you think "hmm, that little snippet seems well written" or "well, I enjoyed book X, let's give book Y a try" or whatever.
I don't think it would work particularly well for me; I'd want longer excerpts to get a sense of whether a book is interesting, and "contains a fragment that has some semantic connection with a fragment of a book I liked" doesn't feel like enough recommendation. Maybe it is indeed a huge waste of time. But if it is, it isn't because it's encouraging people to substitute LLM use for reading.
To determine if a book is worth reading, I think it's better to ask someone for their recommendation or look at online reviews.
You are damn right I didn't try it out. I try things published in journals, vetted by peers, with clear explanations and instructions. When the tone is "It's All Magic Sprinkle(TM)", on the other hand, my pseudoscience alarm goes off.
Oh, but everything here is peer reviewed all right: it's sheep-reviewed. All sheep singing the same note. Where's the explosion of groundbreaking, uber-creative, world-shattering, reliable software from MagicDust LLMs that turn you into a 10x engineer? If anything, they generate a lot of noise. Tell you what: being 10x more productive with a statistical engine that only produces the most normal of normal solutions is the dream of the incompetent.
"Don't be curmudgeonly. Thoughtful criticism is fine, but please don't be rigidly or generically negative."
I'd just reiterate two general points of critique:
1. The point of establishing connections between texts is semantic, and a term's meaning can differ vastly depending on the sphere of discourse in which it occurs. Because of how LLMs work, the really novel connections probably won't be found by one: their function is quite literally to surface what isn't novel.
2. Part of the point of making these connections is the effect the process has on the human making them. Handing it all off to an LLM is no better than blindly trusting authority figures. Using LLMs to generate possible starting points, things to look at and then verify and research yourself, seems totally fine.
It's the usual jargon soup. Publish a vetted paper with repeatable steps instead of hyped-up garbage promising a supposed 100x productivity bomb.
And his best result is a mechanical finding: the pairs where the LLM's vectors correlate most highly. Bravo; any ordered list has a top item, but that doesn't automatically make it interesting. Reading literature is about witnessing the journey the characters take. Reading technical material is about memorizing enough of it. In both cases the material has to go through a brain. I find it idiotic to assign any value to outputs like "Oh, King Lear's X is highly correlated with Antigone's Y."
The cost of indexing with a third-party API is extremely high, however. Might this work out with an open-source model and a cluster of Raspberry Pis for indexing a large library?
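For a rough sense of what that could look like, here is a minimal sketch of local fragment indexing, assuming the open-source sentence-transformers library and the small all-MiniLM-L6-v2 model; both are illustrative choices on my part, not anything the project actually uses:

    # Sketch: index book fragments with a local open-source embedding
    # model instead of a paid third-party API. Model choice is an assumption.
    from sentence_transformers import SentenceTransformer
    import numpy as np

    # Small (~80 MB) model that runs on CPU-only hardware,
    # e.g. a Raspberry Pi, albeit slowly.
    model = SentenceTransformer("all-MiniLM-L6-v2")

    def index_fragments(fragments: list[str]) -> np.ndarray:
        """Embed fragments; rows are L2-normalized, so a dot product
        between rows equals cosine similarity."""
        return model.encode(fragments, normalize_embeddings=True)

    def top_links(queries: np.ndarray, corpus: np.ndarray, k: int = 3) -> np.ndarray:
        """For each query fragment, return indices of the k most similar
        corpus fragments (highest cosine similarity)."""
        sims = queries @ corpus.T                # cosine similarity matrix
        return np.argsort(-sims, axis=1)[:, :k]  # top-k per query

Embedding is embarrassingly parallel, so sharding the fragment list across a few Pis mainly buys wall-clock time; and as noted above, there will always be a top-k of "highest correlations" whether or not the links are interesting.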
In your example, you're doing the inverse (give me a lot of text based on a little), and that's exactly where LLMs have no trouble hallucinating the new information.
It kills the author's tone, pace, and expression. It is pretty much the same as having an assistant summarize the whole book for you, if that's what you want. It misses the entire experience the author delivers.
This is still experimental and well outside my expertise; I'd love to hear from anyone with ideas or experience with this kind of problem.