Show HN: I extract recipes from TikTok, Instagram, and the messy web

2•sklaiber•1h ago
I kept losing recipes. You know how it goes — you're scrolling TikTok at midnight, see an amazing pasta dish, save it, and never find it again. So I built TasteBuddy to fix that for myself. What I didn't expect: parsing recipes from the internet is a rabbit hole that goes deep.

The thing is, recipe content is scattered everywhere in completely different formats. A food blog might have nice JSON-LD markup. A TikTok? Just someone talking over a video. An Instagram reel? Recipe buried in the comments. Pinterest? Links to blogs that died three years ago.

So I ended up building specialized extractors for each platform.

*Websites* are the "easy" case. I look for JSON-LD with `@type: Recipe` first — most food blogs have it, thanks to SEO plugins. But the real world is messy. I've seen duration fields as `PT30M`, `30 minutes`, `0:30`, and my personal favorite, just `half an hour`. About 30% of recipe URLs have no structured data at all, so I fall back to Gemini to make sense of the raw HTML.
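A minimal sketch of how those four duration spellings could be normalized to minutes. It is written in Python for illustration only and is not the TasteBuddy code. The function name, the phrase table, and the reading of `0:30` as hours:minutes are all assumptions.

```python
import re
from typing import Optional

# ISO 8601 duration limited to hours and minutes, e.g. "PT30M" or "PT1H15M".
_ISO = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$", re.IGNORECASE)
# Clock-style value such as "0:30", read here as hours:minutes (an assumption).
_CLOCK = re.compile(r"^(\d+):(\d{2})$")
# Free text such as "30 minutes", "1 hour 15 min", "2 hrs".
_WORDS = re.compile(
    r"(\d+)\s*(h|hr|hrs|hour|hours|m|min|mins|minute|minutes)\b",
    re.IGNORECASE,
)
# Hypothetical lookup for values that contain no digits at all.
_PHRASES = {"half an hour": 30, "an hour": 60}


def duration_to_minutes(raw: str) -> Optional[int]:
    """Return the duration in minutes, or None if the value is not understood."""
    text = raw.strip()

    m = _ISO.match(text)
    if m and (m.group(1) or m.group(2)):
        return int(m.group(1) or 0) * 60 + int(m.group(2) or 0)

    m = _CLOCK.match(text)
    if m:
        return int(m.group(1)) * 60 + int(m.group(2))

    lowered = text.lower()
    if lowered in _PHRASES:
        return _PHRASES[lowered]

    # Sum every "<number> <unit>" pair found in the free text.
    total, found = 0, False
    for value, unit in _WORDS.findall(lowered):
        found = True
        total += int(value) * (60 if unit.startswith("h") else 1)
    return total if found else None
```

Returning `None` for anything unrecognized leaves room to hand the odd cases to a model instead of guessing.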

*TikTok* is where it gets fun. There's no recipe API. My pipeline resolves short URLs, then checks if the creator says something like "link in bio" (I detect this in five languages because German food TikTok is surprisingly massive). If I can find their website, great — I scrape the actual recipe from there. If not, I download the video via Apify and let Gemini analyze the frames. It works, but it's slow and expensive, so that's a Pro-only feature.
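A rough sketch of what multilingual "link in bio" detection might look like. The post only says five languages with German among them, so the language set and every phrase pattern below are assumptions, as is the function name.

```python
import re
import unicodedata

# Hypothetical phrase patterns, one per language. The real phrase list is not
# given in the post; these are illustrative guesses.
_PHRASES = [
    r"link\s+in\s+(?:my\s+)?bio",                          # English
    r"link\s+(?:in\s+(?:der\s+|meiner\s+)?bio|im\s+profil)",  # German
    r"(?:link|enlace)\s+en\s+(?:la\s+|mi\s+)?bio",         # Spanish
    r"lien\s+(?:en|dans\s+(?:la|ma))\s+bio",               # French
    r"link\s+na\s+(?:minha\s+)?bio",                       # Portuguese
]
_LINK_IN_BIO = re.compile(r"\b(?:" + "|".join(_PHRASES) + r")\b")


def mentions_link_in_bio(caption: str) -> bool:
    """True if the caption points the viewer at the creator's profile link."""
    # Normalize width/compatibility forms and case before matching.
    text = unicodedata.normalize("NFKC", caption).casefold()
    return _LINK_IN_BIO.search(text) is not None
```

A cheap check like this can run before any video download, which matches the ordering described above: try to reach the creator's website first, and only fall back to frame analysis when that fails.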

*Instagram and Facebook* — similar deal. oEmbed gets me the image, but the recipe is usually in the caption or comments. Same link-in-bio detection, same website resolution.

*Photos* are actually straightforward — screenshot of a recipe, photo of a cookbook page, whatever. Gemini's vision model handles those surprisingly well.

*One thing I'm proud of: the AI tiering.* Not every task needs a big model.

- Gemini Flash Lite handles 90% of the work: classifying content ("is this even a recipe?"), parsing ingredients, extracting recipe names from social media captions. Cheap, fast, good enough.
- Gemini Flash kicks in when structured data fails: parsing messy HTML, analyzing video frames, processing social media posts.
- Gemini Pro only for image generation (recipe share cards).
- text-embedding-004 for semantic search across your recipe collection.
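The tiering above could be expressed as a plain lookup table. This is a sketch under assumptions: the task names are made-up labels, and the tier values are the model names as written in this post, not verified API model identifiers.

```python
from enum import Enum


class Tier(Enum):
    # Values are the names used in the post, not confirmed API model IDs.
    FLASH_LITE = "Gemini Flash Lite"
    FLASH = "Gemini Flash"
    PRO = "Gemini Pro"
    EMBEDDING = "text-embedding-004"


# Hypothetical task labels; the assignments follow the list above.
TASK_TIER = {
    "classify_content": Tier.FLASH_LITE,
    "parse_ingredients": Tier.FLASH_LITE,
    "extract_name_from_caption": Tier.FLASH_LITE,
    "parse_messy_html": Tier.FLASH,
    "analyze_video_frames": Tier.FLASH,
    "process_social_post": Tier.FLASH,
    "generate_share_card": Tier.PRO,
    "semantic_search": Tier.EMBEDDING,
}


def pick_tier(task: str) -> Tier:
    """Look up the model tier for a task.

    Unknown tasks raise instead of silently defaulting to an expensive model.
    """
    try:
        return TASK_TIER[task]
    except KeyError:
        raise ValueError(f"no tier configured for task {task!r}") from None
```

Keeping the mapping in one table makes it easy to move a task up or down a tier and see the cost impact in one place.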

This keeps my costs sane as a solo dev. Using Flash for everything would've been 10x more expensive with barely better results for the simple tasks.

*Stuff I learned the hard way:*

- JSON-LD in the wild is chaos. The spec is fine, but WordPress plugins are creative.
- "Link in bio" is how recipe distribution actually works on social media. Detecting that pattern is more valuable than trying to parse a video.
- AI as fallback beats AI as default. Structured data first, AI when it fails = 95%+ success at a fraction of the cost.
- Tier your models aggressively. Don't throw dollars at a problem that cents can solve.
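To make "structured data first, AI when it fails" concrete, here is a minimal sketch, not the actual pipeline. The function names are hypothetical, `ai_fallback` stands in for whatever model call does the HTML parsing, and the regex-based script extraction is a simplification that a real HTML parser would handle more robustly.

```python
import json
import re
from typing import Callable, Optional

# Grab the body of every <script type="application/ld+json"> element.
_LD_SCRIPT = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def _is_recipe(node: dict) -> bool:
    # "@type" may be a single string or a list of types.
    t = node.get("@type")
    return t == "Recipe" or (isinstance(t, list) and "Recipe" in t)


def find_recipe_jsonld(html: str) -> Optional[dict]:
    """Return the first JSON-LD node typed as Recipe, or None."""
    for block in _LD_SCRIPT.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed markup: skip this block
        # The payload may be one object, a list, or wrapped in "@graph".
        stack = list(data) if isinstance(data, list) else [data]
        while stack:
            node = stack.pop()
            if not isinstance(node, dict):
                continue
            if _is_recipe(node):
                return node
            graph = node.get("@graph")
            if isinstance(graph, list):
                stack.extend(graph)
    return None


def extract_recipe(html: str, ai_fallback: Callable[[str], dict]) -> dict:
    """Try structured data first; call the (costly) fallback only on a miss."""
    recipe = find_recipe_jsonld(html)
    if recipe is not None:
        return {"source": "json-ld", "recipe": recipe}
    return {"source": "ai", "recipe": ai_fallback(html)}
```

Passing the fallback in as a callable keeps the cheap path testable without any model access.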

*Stack:* Flutter (just me, indie dev), Supabase (Postgres + Deno Edge Functions), Gemini, Apify, PostHog.

Free with a Pro tier for video extraction and household sharing.

Happy to go deeper on any part of the extraction pipeline.

https://taste-buddy.app