*The problem*
We got into specialty coffee gradually. Whenever we tried something we liked — a washed Colombian, a natural Ethiopian — we'd save the bag. At some point we had drawers of empty coffee bags we couldn't bring ourselves to throw away.
Our flow was simple: go to a cafe we liked, drink coffee, buy a bag of whatever they were roasting or stocking. Over time we started noticing patterns — we kept reaching for naturals, for East African origins, for anything with fruity notes. We'd try to seek out similar beans next time. Occasionally we'd fall in love with something and start reordering it online.
But discovery was limited to the few roasters we already knew. There was no easy way to find out that a roaster across town — or in another country — had something we were going to love. We knew great coffee existed out there. We just had no map.
So we built one. RoastDB currently indexes 3,800+ beans from 420+ roasters, and the index grows every week. Search by origin, process, variety, or tasting notes. Save beans you want to try. When you find something, you buy directly from the roaster; we're a discovery engine, not a store.
*How it works*
The hardest part isn't the scraping — it's finding roasters worth indexing. We spend a lot of time hunting for quality third-wave roasters: browsing coffee forums, following competition results, exploring roasters in new cities. The selection is the real work.
Once we've found a roaster, the pipeline runs on a €5/month Hetzner VPS:
1. Scrapers fetch product pages from roaster websites
2. LLMs extract structured data (origin, variety, processing, price, tasting notes)
3. Normalization cleans up inconsistencies ("Äthiopien" → "Ethiopia", "84,25" → 84.25; see the sketch after this list)
4. Non-English descriptions get translated
5. Deduplication scores beans and merges duplicates
6. Human review via an admin dashboard before publishing
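For concreteness, here's a minimal sketch of the step-3 normalization. The function names and alias table are illustrative, not our actual pipeline code, and real price strings need more care than this (currency symbols, thousands separators):

    // A sketch of the step-3 cleanup (illustrative names, not our real code).
    const ORIGIN_ALIASES: Record<string, string> = {
      "Äthiopien": "Ethiopia", // the example from the list above
      "Colombie": "Colombia",  // hypothetical extra entry
    };

    function normalizeOrigin(raw: string): string {
      const trimmed = raw.trim();
      return ORIGIN_ALIASES[trimmed] ?? trimmed;
    }

    // "84,25" (comma decimal) and "84.25" both become 84.25.
    function normalizePrice(raw: string): number | null {
      const cleaned = raw.replace(/[^\d.,]/g, "").replace(",", ".");
      const value = Number.parseFloat(cleaned);
      return Number.isFinite(value) ? value : null;
    }

    console.log(normalizeOrigin("Äthiopien")); // "Ethiopia"
    console.log(normalizePrice("84,25"));      // 84.25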
The scrapers rerun weekly with content hashing: we only re-extract pages whose content actually changed, which keeps the data fresh without running up API costs.
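A minimal sketch of that idea, assuming Node's built-in fetch and crypto (in the real pipeline the hashes persist between weekly runs rather than living in a Map):

    import { createHash } from "node:crypto";

    // The Map stands in for wherever the hashes persist between runs.
    const lastHash = new Map<string, string>();

    async function fetchIfChanged(url: string): Promise<string | null> {
      const html = await (await fetch(url)).text();
      const hash = createHash("sha256").update(html).digest("hex");
      if (lastHash.get(url) === hash) return null; // unchanged: skip the LLM call
      lastHash.set(url, hash);
      return html; // new or changed: send on to extraction
    }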
We built an internal tool that gamifies the review process, making it easier to keep up with new beans. And we control the whole pipeline through a Telegram bot — kick off scrapes, approve costs, get notified of failures, all from our phones.
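The notification half needs nothing more than the Bot API's sendMessage method over HTTP. A minimal sketch, with assumed env var names and a placeholder runScrape:

    // Assumed env var names; sendMessage is the standard Bot API method.
    async function notify(text: string): Promise<void> {
      const token = process.env.TELEGRAM_BOT_TOKEN;
      const chatId = process.env.TELEGRAM_CHAT_ID;
      await fetch(`https://api.telegram.org/bot${token}/sendMessage`, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify({ chat_id: chatId, text }),
      });
    }

    // Usage (runScrape is a placeholder for a real pipeline entry point):
    // runScrape(roaster).catch((err) => notify(`Scrape failed: ${err}`));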
The web app is Next.js + SQLite. The database file is ~15MB and is read straight from disk; no database server, no complexity.
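For the curious, a read path under that setup can be as small as this, sketched with better-sqlite3 in an App Router route handler (the table and columns are placeholders, not our real schema):

    // app/api/beans/route.ts (table/column names are placeholders)
    import Database from "better-sqlite3";

    // Opened once per server process, read-only: the pipeline writes the
    // file elsewhere and we only ever read it here.
    const db = new Database("data/roastdb.sqlite", { readonly: true });

    export function GET(request: Request): Response {
      const origin = new URL(request.url).searchParams.get("origin");
      const rows = origin
        ? db.prepare("SELECT name, roaster, process FROM beans WHERE origin = ?").all(origin)
        : db.prepare("SELECT name, roaster, process FROM beans LIMIT 50").all();
      return Response.json(rows);
    }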
*Feedback welcome*
- Roasters we should add (especially outside Europe)
- Filter combinations that would be useful
- Anything broken or confusing