I'm a Literature major and avid reader, but projects like this are still incredibly exciting to me. I salivate at the thought of new kinds of literary analysis that AI is going to open up.
But this thing isn't (so far as I can tell) even slightly proposing that we feed books into an LLM instead of reading them. It looks to me more like a discovery mechanism: you run this thing, it shows you some possible links between books, and maybe you think "hmm, that little snippet seems well written" or "well, I enjoyed book X, let's give book Y a try" or whatever.
I don't think it would work particularly well for me; I'd want longer excerpts to get a sense of whether a book is interesting, and "contains a fragment that has some semantic connection with a fragment of a book I liked" doesn't feel like enough recommendation. Maybe it is indeed a huge waste of time. But if it is, it isn't because it's encouraging people to substitute LLM use for reading.
To determine if a book is worth reading, I think it's better to ask someone for their recommendation or look at online reviews.
You are damn right I didn't try it out. I try things published in journals, vetted by peers, with clear explanations and instructions. When the tone is "It's All Magic Sprinkle(TM)", on the other hand, my pseudoscience alarm goes off.
Oh, but everything here is peer reviewed all right: it's sheep-reviewed. All sheep singing the same note. Where's the explosion of groundbreaking, uber-creative, world-shattering, reliable software from MagicDust LLMs that turn you into a 10x engineer? If anything, they generate a lot of noise. Tell you what: being 10x more productive with a statistical engine that only produces the most normal of normal solutions is the dream of the incompetent.
"Don't be curmudgeonly. Thoughtful criticism is fine, but please don't be rigidly or generically negative."
I'd just reiterate two general points of critique:
1. The point of establishing connections between texts is semantic, and a term's meaning can differ vastly depending on the sphere of discourse in which it occurs. Because of how LLMs work, the really novel connections probably won't be found by one: their function is quite literally to surface what isn't novel.
2. Part of the point of making these connections is the effect the process has on the human making them. Handing it all off to an LLM is no better than blindly trusting authority figures. Using LLMs to generate possible starting points, things to look at and then verify and research yourself, seems totally fine.
It's the usual jargon soup. Publish a vetted paper with repeatable steps instead of hyped-up garbage promising a supposed 100x productivity bomb.
And his best result is a mechanical finding: the pairs where the LLM's vectors correlate most highly. Bravo; any ordered list has a top item, but that doesn't automatically make it interesting. Reading literature is about witnessing the journey the characters take. Reading technical material is about memorizing enough of it. In both cases the material has to go through a brain. I find it idiotic to assign any value to outputs like "Oh, King Lear's X is highly correlated with Antigone's Y."
The cost of indexing with a third-party API is extremely high, however. Might this work out with an open-source model and a cluster of Raspberry Pis for indexing a large library?
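For a rough sense of what that could look like, here is a minimal sketch of local fragment indexing, assuming the open-source sentence-transformers library and the small all-MiniLM-L6-v2 model; both are illustrative choices on my part, not anything the project actually uses:

    # Sketch: index book fragments with a local open-source embedding
    # model instead of a paid third-party API. Model choice is an assumption.
    from sentence_transformers import SentenceTransformer
    import numpy as np

    # Small (~80 MB) model that runs on CPU-only hardware,
    # e.g. a Raspberry Pi, albeit slowly.
    model = SentenceTransformer("all-MiniLM-L6-v2")

    def index_fragments(fragments: list[str]) -> np.ndarray:
        """Embed fragments; rows are L2-normalized, so a dot product
        between rows equals cosine similarity."""
        return model.encode(fragments, normalize_embeddings=True)

    def top_links(queries: np.ndarray, corpus: np.ndarray, k: int = 3) -> np.ndarray:
        """For each query fragment, return indices of the k most similar
        corpus fragments (highest cosine similarity)."""
        sims = queries @ corpus.T                # cosine similarity matrix
        return np.argsort(-sims, axis=1)[:, :k]  # top-k per query

Embedding is embarrassingly parallel, so sharding the fragment list across a few Pis mainly buys wall-clock time; and as noted above, there will always be a top-k of "highest correlations" whether or not the links are interesting.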
In your example, you're doing the inverse (give me a lot of text based on a little), and that's exactly where LLMs have no trouble hallucinating the new information.
It kills the author's tone, pace, and expression. It is pretty much the same as having an assistant summarize the whole book for you, if that's what you want. It misses the entire experience the author delivers.
This is still experimental and well outside my expertise; I'd love to hear from anyone with ideas or experience with this kind of problem.