On a more pleasant topic: the original recipe sounds delicious. I may give it a try when the weather cools off a little.
I should have caught that, and there are probably other bugs too waiting to be found. That said, it's still a great recipe.
Edit: just saw the author's comment, I think I'm looking at the fixed page
LLMs are specifically good at a task like this because they can extract content from any webpage, regardless of whether it supports whatever standard that no one implements.
Just hitting keywords for search? Many of them don't even have ads so I feel like that can't be it. Maybe referrals?
This is a requirement? I literally only browse the web with an ad blocker but I always assumed those sites had tons of ads.
LLMs are shifting that ecosystem (at least temporarily) and new revenue models will emerge. It'll take time to figure out. But we shouldn't artificially support a bad system just because it's the existing system.
Transitions are always awkward. In the meantime, I'm inclined to give people rope to experiment.
There are things that are not allowed. But here someone made a good point without any personal attacks. You silencing people is probably the least appropriate thing in this thread.
> make a meaningful contribution
What was not a meaningful contribution? Mentioning the relevant schema? Saying that using the LLM is bad, for example because it is trained on our content without permission or payment? Or that this steals from people who provide you content for free and make money from ads?
spegel -p "extract only the product reviews" > REVIEWS.mdI'm curious how you tackled that problem
I think most of it comes down to Flash-Lite being really fast, and the fact that I'm only outputting markdown, which is fairly easy and streams well.
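For what it's worth, a minimal sketch of streaming markdown from Gemini straight into the terminal with the google-generativeai SDK; the model name, prompt, and HTML are placeholders, not Spegel's actual code:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-2.0-flash-lite")  # illustrative model name

page_html = "<html><body><h1>Example</h1><p>Some page text.</p></body></html>"
prompt = "Rewrite this HTML as concise markdown:\n" + page_html

# stream=True yields partial responses as they arrive, so markdown can start
# rendering in the terminal before the full reply is finished.
for chunk in model.generate_content(prompt, stream=True):
    print(chunk.text, end="", flush=True)
print()
```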
Not the answer to your question but here's the prompt
The web existed long before javascript was around.
The web was useful long before javascript was around.
I literally hate javascript -- not the language itself but the way it is used. It has enabled some pretty cool things, yes. But javascript is not required to make useful webpages.
I wonder if you could turn this into a chrome extension that at least filters and parses the DOM
Obviously, against wishes of these social networks, which want us to be addicted... I mean, engaged.
And I didn't even get the new job through LinkedIn, though it did yield one interview.
[1] Not the actual worst.
[2] Not the actual best.
But the question of Javascript remains
pretty much like all modern computing then, hey.
I'm not saying it's necessarily a good idea but perhaps a bad/fun idea that can inspire good ideas?
Some websites may still return some static content upfront that can be usefully understood without JavaScript processing, but a lot don't.
That's not to say you need an LLM; there are projects like Puppeteer that run headless browsers and can return the rendered HTML, which can then be sent through an HTML-to-Markdown filter. That would be less computationally intensive.
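A rough sketch of that pipeline, using Playwright's Python API as a stand-in for Puppeteer and html2text for the markdown step (both are my substitutions, not necessarily what anyone in the thread is running):

```python
import html2text
from playwright.sync_api import sync_playwright

def page_to_markdown(url: str) -> str:
    """Render a JS-heavy page headlessly, then convert the final DOM to markdown."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # let dynamic content settle
        html = page.content()                      # fully rendered HTML
        browser.close()
    return html2text.html2text(html)

if __name__ == "__main__":
    print(page_to_markdown("https://example.com"))
```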
which was exactly my point
And wouldn't it be ironic if Gemini was used to strip ads from webpages?
In the rare cases where the model would jam on its own, this will likely already happen.
I think the ad blocker of the future will be a local LLM, small and efficient. Want to sort your timeline chronologically? Or want a different UI? Want some things removed, and others promoted? Hide low-quality comments in a thread? All are possible with an LLM in the middle, in either agent or proxy mode.
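As a toy sketch of the "LLM in the middle" idea: a function that sends page text to a small local model over the Ollama HTTP API with a user-chosen instruction. The endpoint, model name, and prompt are assumptions, and a real blocker would also need to sit in the request path (e.g. as a proxy or extension):

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint (assumed setup)

def rewrite_page(page_text: str, instruction: str) -> str:
    """Ask a small local model to filter or reshape page content per the user's wishes."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": "llama3.2",  # illustrative local model name
            "prompt": f"{instruction}\n\n{page_text}",
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# e.g. rewrite_page(page_text, "Remove ads and promos; keep comments, sorted chronologically")
```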
I bet this will be unpleasant for advertisers.
Is it though, when the LLM might mutate the recipe unpredictably? I can't believe people trust probabilistic software for cases that cannot tolerate error.
Also like, forget amounts, cook times are super important and not always intuitive. If you screw them up you have to throw out all your work and order take out.
And yes cook times are important but no, even for a human-written recipe you need the intuition to apply adjustments. A recipe might be written presuming a powerful gas burner but you have a cheap underpowered electric. Or the recipe asks for a convection oven but your oven doesn't have the feature. Or the recipe presumes a 1100W microwave but you have a 1600W one. You stand by the food while it cooks. You use a food thermometer if needed.
For one, an AI-generated recipe could be something that no human could possibly like, whereas the human recipe comes with at least one recommendation (assuming good faith on the part of the source, which you're doing anyway, LLM or not).
Also an LLM may generate things that are downright inedible or even toxic, though the latter is probably unlikely even if possible.
I personally would never want to spend roughly an hour or so making bad food from a hallucinated recipe wasting my ingredients in the process, when I could have spent at most 2 extra minutes scrolling down to find the recommended recipe to avoid those issues. But to each their own I guess.
Seems like most of the usual food blog plugins use it, because it allows search engines to report calories and star ratings without having to rely on a fuzzy parser. So while the experience sucks for users, search engines use the structured data to show carousels with overviews, calorie totals and stuff like that.
https://recipecard.io/blog/how-to-add-recipe-structured-data...
https://developers.google.com/search/docs/guides/intro-struc...
EDIT: Sure enough, if you look at the OP's recipe example, the schema is in the source. So for certain examples, you would probably be better off having the LLM identify that it's a recipe website (or other semantic content), extract the schema from the header, and then parse/render it deterministically. This seems like one of those context-dependent things: getting an LLM to turn a bunch of JSON into markdown is fairly reliable, while getting it to extract that from an entire HTML page risks cluttering the context. But you could separate the two and have one agent summarise any of the steps in the blog that might be pertinent.
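A minimal sketch of that deterministic path, assuming the page embeds a schema.org Recipe as JSON-LD in a script tag (as in the snippet quoted after this sketch); BeautifulSoup and the markdown layout are my choices, not necessarily Spegel's:

```python
import json
from bs4 import BeautifulSoup

def extract_recipe(html: str) -> dict | None:
    """Return the first schema.org Recipe object found in the page's JSON-LD, if any."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue
        # JSON-LD may be a single object, a list, or wrapped in @graph.
        candidates = data if isinstance(data, list) else data.get("@graph", [data])
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return item
    return None

def recipe_to_markdown(recipe: dict) -> str:
    """Deterministically render the parts users actually want: name and ingredients."""
    lines = [f"# {recipe.get('name', 'Recipe')}", "", "## Ingredients"]
    lines += [f"- {item}" for item in recipe.get("recipeIngredient", [])]
    return "\n".join(lines)
```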
{"@context":"https://schema.org/","@type":"Recipe","name":"Slowly Braised Lamb Ragu ...https://github.com/browsh-org/browsh https://www.youtube.com/watch?v=HZq86XfBoRo
But yes, having some global shared redundant P2P cache (of the "raw" data), like IPFS (?) could possibly help and save some processing power and help with availability and data preservation.
The ultimate rose (or red, or blue or black ...) coloured glasses.-
Pounds of lamb become kilograms (more than doubling the quantity of meat), a medium onion turns large, one celery stalk becomes two, six cloves of garlic turn into four, tomato paste vanishes, we lose nearly half a cup of wine, beef stock gets an extra ¾ cup, rosemary is replaced with oregano.
> Sometimes you don't want to read through someone's life story just to get to a recipe... That said, this is a great recipe
I compared the list of ingredients to the screenshot, did a couple unit conversions, and these are the discrepancies I saw.
It's beyond parody at this point. Shit just doesn't work, but this fundamental flaw of LLMs is just waved away or simply not acknowledged at all!
You have an algorithm that rewrites textA to textB (so nice), where textB potentially has no relation to textA (oh no). Were it anything else this would mean "you don't have an algorithm to rewrite textA to textB", but for gen ai? Apparently this is not a fatal flaw, it's not even a flaw at all!
I should also note that there is no indication that this fundamental flaw can be corrected.
"Theoretical"? I think you misspelled "ubiquitous".
The recipe site was so long that it got truncated before being sent to the LLM. Then, based on the first 8000 characters, Gemini hallucinated the rest of the recipe, which was definitely in its training set.
I have fixed it and pushed a new version of the project. Thanks again, it really highlights how we can never fully trust models.
My next plan is to rewrite hyperlinks to provide a summary of the page on hover, or possibly to rewrite them to be more indicative of the content at the other end (no more complaining about the titles of HN posts...). But my machine isn't too beefy, and I'm not sure how well that will work, or how to prioritize links on the page.
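Purely as an illustration of the link-rewriting half of that plan, here's a hedged sketch; "summarize" stands in for whatever local model call turns out to be affordable, and the prioritisation is a naive placeholder:

```python
from bs4 import BeautifulSoup

def summarize(text: str) -> str:
    """Placeholder for a call to a small local model that returns a short label."""
    return text[:60]  # hypothetical: a real version would ask the LLM

def rewrite_links(html: str, max_links: int = 20) -> str:
    """Replace vague anchor text with a short description of the link's surrounding context."""
    soup = BeautifulSoup(html, "html.parser")
    # Naive prioritisation: links with the most anchor text first (stand-in for real ranking).
    links = sorted(soup.find_all("a", href=True),
                   key=lambda a: len(a.get_text(strip=True)),
                   reverse=True)[:max_links]
    for a in links:
        parent = a.find_parent()
        context = parent.get_text(" ", strip=True) if parent else a.get_text(strip=True)
        a.string = summarize(context)
    return str(soup)
```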
Seriously though, looks like a novel fix for the problem that most terminal browsers face: namely, that terminals are text-based, but the web, whilst it contains text, is often subdivided in a way that only really makes sense graphically.
I wonder if a similar type of thing might work for screen readers or other accessibility features
I wonder if it could be adapted to render as gopher pages.
>Individuals burning energy using personal llm internet condoms to strip ads and bullshit from every pageload
Eventually there will be a project where volunteers use llms to harvest the real internet and “launder” both the copyright and content into some kind of pre-processed, distributed shadow internet where things are actually usable, while being just as wrong as the real internet.
What a future.
looks very similar to a chrome extension i use for a similar goal: reader view - https://chromewebstore.google.com/detail/ecabifbgmdmgdllomnf...
Using a big cloud provider for this is madness.
They are pretty great at converting data between formats, but I always worry there's a small chance they change the actual data in the output in some small but misleading way.
In theory this could be used for ad blocking; it would be more expensive and less efficient, but the idea is there.
So, it is a very curious idea, but we still have to find an appropriate use case.
But I feel it doesn't solve the main issue of terminal-based web browsing. Displaying HTML in the terminal is often kind of ugly and css-based fanciness does not work at all, but that can usually just be ignored. The main problem is javascript and dynamic content, which this approach just ignores.
So no real step forward for cli web browsing, imo.
My #1 use case is fetching wikis onto my hard drive and letting a local coding agent use them for creating plans.
qsort•7mo ago
A natural next step could be doing things with multiple "tabs" at once, e.g. tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references. I guess the problem at that point is whether the underlying model can support this type of workflow, which doesn't really seem to be the case even with SOTA models.
simedw•7mo ago
I was thinking of showing multiple tabs/views at the same time, but only from the same source.
Maybe we could have one tab with the original content optimised for cli viewing, and another tab just doing fact checking (can ground it with google search or brave). Would be a fun experiment.
wrsh07•7mo ago
nextaccountic•7mo ago
You should also have some way for the LLM to indicate there is no useful output, because perhaps the page is supposed to be an SPA. That would force you to execute Javascript to render that particular page, though.
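One low-tech way to support that, sketched with an assumed sentinel convention (not anything Spegel actually implements): tell the model to emit a fixed marker when it finds nothing renderable, and check for it before displaying.

```python
NO_CONTENT_MARKER = "<<NO_CONTENT>>"  # assumed convention, not part of Spegel

PROMPT_SUFFIX = (
    "If the page contains no meaningful static content (for example, it is an "
    f"empty SPA shell), reply with exactly {NO_CONTENT_MARKER} and nothing else."
)

def render_or_fallback(llm_output: str, url: str) -> str:
    """Show a notice (or trigger a JS-capable fetch) when the model reports an empty shell."""
    if llm_output.strip() == NO_CONTENT_MARKER:
        return f"[{url} appears to require JavaScript; no static content was found]"
    return llm_output
```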
simedw•7mo ago
https://github.com/mozilla/readability
dotancohen•7mo ago
myfonj•7mo ago
(The fact that browsers nowadays are usually expected to represent something "pixel-perfect" to everyone with similar devices is utterly against the original intention.)
Yet the original idea was (due to the state of technical possibilities) primarily about design and interactivity. The fact that we now have tools to extend this concept to core language and content processing is… huge.
It seems we're approaching the moment when our individual personal agent, when asked about a new page, will tell us:
Because its "browsing history" will also contain a notion of what we "know" from chats or what we had previously marked as "known".ffsm8•7mo ago
For this to work like a user would want, the model would have to be sentient.
But you could try to get there with current models, it'd just be very untrustworthy to the point of being pointless beyond a novelty
myfonj•7mo ago
Naturally, »nothing new of interest for you« here is indeed just a proxy for »does not involve any significant concept that you haven't previously expressed knowledge about« (or however you'd put it), which seems pretty doable, provided that a contract of "expressing knowledge about something" had been made beforehand.
Let's say that all pages you have ever bookmarked you have really grokked (yes, a stretch, no "read it later" here) - then your personal model would be able to (again, figuratively) "make a qualified guess" about your knowledge. Or some kind of tag that you could add to any browsing history entry, or fragment, indicating "I understand this". Or set the agent up to quiz you when leaving a page (that would be brutal). Or … I think you got the gist now.
bee_rider•7mo ago
Or that I’m looking up a data point that I already actually know, just because I want to provide a citation.
But, it could be interesting.
myfonj•7mo ago
Or (and this is actually doable absolutely without any "AI" at all):
(There is one page nearby that would be quite unusable for me, had I not a crude userscript aid for this particular purpose. But I can imagine a digest of "What's new here?" / "Noteworthy responses?" would be way better.)

For the "I need to cite this source" case, naturally, you would want the "verbatim" view without any amendments anyway. Also, before sharing or directing someone to the resource, looking at the "true form" would still be pretty necessary.
dotancohen•7mo ago
When I was a child we knew that the North Star consisted of five suns. Now we know that it is only three suns, and through them we can see another two background stars that are not gravitationally bound to the three suns of the Polaris system.
Maybe in my grandchildren's lifetimes we'll know something else about the system.
idiotsecant•7mo ago
aspenmayer•7mo ago
So, you gonna “put on those sunglasses, or start chewing on that trashcan?” It’s a distinction without a difference!
https://www.youtube.com/watch?v=1Rr4mQiwxpA
baq•7mo ago
almost unrelated, but you can also compare spegel to https://www.brow.sh/
phatskat•7mo ago
I think the primary reason I use multiple tabs, but _especially_ multiple splits, is to show content from various sources. Obviously this is different than a terminal context, as I usually have figma or api docs in one split and the dev server on the other.
Still, being able to have textual content from multiple sources visible or quickly accessible would probably be helpful for a number of users
andrepd•7mo ago
TeMPOraL•7mo ago
I think this is basically what https://ground.news/ does.
(I'm not affiliated with them; just saw them in the sponsorship section of a Kurzgesagt video the other day and figured they're doing the thing you described +/- UI differences.)
doctoboggan•7mo ago
hliyan•7mo ago
But if we do it, we have to admit something hilarious: we will soon be using AI to convert text provided by the website creator into elaborate web experiences, which end users will strip away before consuming it in a form very close to what the creator wrote down in the first place (this is already happening with beautifully worded emails that start with "I hope this email finds you well").
npmipg•7mo ago