I built an app to streamline this. You set what kind of wine you’re looking for and your price point, take a picture of the wine list, and it does the rest. It returns the menu ranked by:
- Alignment: How well it matches your flavor preferences.
- Value: The markup compared to retail price.
- Quality: Critics’ scores and online ratings.
It also provides a full description and tasting notes for each wine, which many wine lists leave out.
The Tech Stack
- Client: React Native
- Backend: FastAPI, deployed on Google Cloud Run
- DB: Firestore & Algolia
Here are the major pieces of the pipeline:
Image to Wine List: This is a combination of standard OCR and agentic image recognition. OCR alone couldn’t parse the layout correctly (grouping prices with the right items), but "agentic alone" often hallucinated characters. I used Google Vision for the raw text and Gemini 2.5 Flash Lite to structure it.
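One way to keep the structuring step honest is to treat the OCR text as ground truth and drop any structured item that isn't backed by it. The sketch below is an illustration of that idea, not the app's actual code: `MenuItem`, `is_grounded`, and the 0.8 overlap threshold are all hypothetical, and it assumes prices appear in the OCR text as plain numbers.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class MenuItem:
    name: str
    price: float


def strip_accents(text: str) -> str:
    # "Château" -> "Chateau", so accent differences don't break the comparison.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", strip_accents(text).lower()))


def is_grounded(item: MenuItem, ocr_text: str, min_name_overlap: float = 0.8) -> bool:
    """Accept an LLM-structured item only if the raw OCR text supports it."""
    name_tokens = tokens(item.name)
    if not name_tokens:
        return False
    # Share of the item's name tokens that literally appear in the OCR output.
    overlap = len(name_tokens & tokens(ocr_text)) / len(name_tokens)
    # The price must appear somewhere in the OCR text as a number.
    ocr_numbers = {float(n) for n in re.findall(r"\d+(?:\.\d+)?", ocr_text)}
    return overlap >= min_name_overlap and item.price in ocr_numbers


ocr_text = "Sancerre, Domaine Vacheron 2021 ... 85\nBarolo, G.D. Vajra 2018 ... 120"
print(is_grounded(MenuItem("Sancerre Domaine Vacheron 2021", 85.0), ocr_text))
print(is_grounded(MenuItem("Chablis William Fevre 2020", 85.0), ocr_text))
```

This check is necessary but not sufficient: it catches invented names and prices, but it can't tell whether a real price was attached to the wrong wine.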
Matching (List → Database): Actually the hardest part. Wine lists take a lot of liberties with naming, and it’s tricky to know if a fuzzy match is close enough. I used Algolia here with custom ranking rules.
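To make the "is this fuzzy match close enough?" problem concrete, here is a small standalone sketch. It uses Python's standard-library `difflib` as a stand-in for Algolia, so it says nothing about how the Algolia ranking rules were actually configured. The function names, the 50/50 blend of character and token similarity, and the 0.6 cutoff are assumptions picked for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def normalize(name: str) -> str:
    # Strip accents, lowercase, replace punctuation with spaces, collapse whitespace.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def token_overlap(a: str, b: str) -> float:
    # Jaccard similarity over whole words.
    ta, tb = set(a.split()), set(b.split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(menu_name: str, catalog: List[str],
               threshold: float = 0.6) -> Tuple[Optional[str], float]:
    """Return (catalog entry, score), or (None, score) if nothing clears the threshold."""
    query = normalize(menu_name)
    best_name, best_score = None, 0.0
    for candidate in catalog:
        cand = normalize(candidate)
        char_sim = SequenceMatcher(None, query, cand).ratio()
        score = 0.5 * char_sim + 0.5 * token_overlap(query, cand)
        if score > best_score:
            best_name, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


catalog = [
    "Château Margaux 2015",
    "Domaine de la Romanée-Conti La Tâche 2018",
    "Cloudy Bay Sauvignon Blanc 2022",
]
print(best_match("Ch. Margaux 2015", catalog))
print(best_match("House Red", catalog))
```

Returning `None` below the cutoff matters as much as finding the best candidate: a wine that isn't in the database should fall through to a different path rather than be forced onto its nearest neighbor.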
Agentic Augmentation: I have a pre-built database, but to fill in missing entries in real time, I needed live search. I tried Tavily, Perplexity, and Google Search Grounding. Perplexity (Sonar Pro) ended up being the best balance of accuracy and performance.
Recommendation: Gemini 2.5 Flash Lite for flavor profile matching, and regular old math for calculating scores based on value and ratings.
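For a sense of what the "regular old math" part could look like, here is a hedged sketch. The formulas, the weights, the 4x worst-case markup, and the 80-100 critic range are all assumptions chosen for illustration, not the app's real scoring. `alignment` stands for whatever 0-1 number the flavor-matching step produces.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Wine:
    name: str
    menu_price: float    # price on the restaurant list
    retail_price: float  # typical shop price
    critic_score: float  # assumed 100-point scale
    user_rating: float   # assumed 5-star scale
    alignment: float     # assumed 0-1 flavor match from the LLM step


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def value_score(wine: Wine, worst_markup: float = 4.0) -> float:
    # 1.0 when the menu price equals retail, 0.0 at worst_markup times retail or more.
    if wine.retail_price <= 0:
        return 0.5  # unknown retail price: stay neutral (an assumption)
    markup = wine.menu_price / wine.retail_price
    return clamp01((worst_markup - markup) / (worst_markup - 1.0))


def quality_score(wine: Wine) -> float:
    # Rescale critic scores from 80-100 to 0-1, then average with user ratings.
    critic = clamp01((wine.critic_score - 80.0) / 20.0)
    users = clamp01(wine.user_rating / 5.0)
    return 0.5 * critic + 0.5 * users


def overall_score(wine: Wine, weights: Tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    w_align, w_value, w_quality = weights
    return (w_align * clamp01(wine.alignment)
            + w_value * value_score(wine)
            + w_quality * quality_score(wine))


def rank(wines: List[Wine]) -> List[Wine]:
    return sorted(wines, key=overall_score, reverse=True)


wines = [
    Wine("Pricey pour", 120.0, 30.0, 90.0, 4.0, 0.7),
    Wine("Fair pour", 60.0, 30.0, 90.0, 4.0, 0.7),
]
print([w.name for w in rank(wines)])
```

Keeping this part deterministic means the same list and preferences always produce the same value and quality numbers, and it costs no LLM call.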
Takeaways:
AI needs guardrails: It works really well if you use it in small doses with real input data. You can’t (yet) go straight from a photo to a recommendation list in a single prompt without hallucinations.
The Latency Trade-off: It’s hard to get both speed and quality. Since this is for a restaurant setting, I had to work hard to minimize LLM calls to keep it from feeling sluggish.
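One common way to cut LLM calls is to cache what you already know and send everything else in a single batched request instead of one request per wine. The sketch below shows that pattern in isolation. `describe_wines`, the `call_llm` callable, and the batch size are hypothetical, and the fake LLM at the bottom exists only so the example runs.

```python
from typing import Callable, Dict, Iterator, List, Optional


def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    for start in range(0, len(items), size):
        yield items[start:start + size]


def describe_wines(
    names: List[str],
    call_llm: Callable[[List[str]], Dict[str, str]],
    cache: Dict[str, str],
    batch_size: int = 25,
) -> Dict[str, Optional[str]]:
    """Return tasting notes per wine, calling the LLM once per batch of uncached names."""
    # Deduplicate while preserving order, and skip anything already cached.
    missing = [n for n in dict.fromkeys(names) if n not in cache]
    for batch in chunks(missing, batch_size):
        cache.update(call_llm(batch))  # one request covers the whole batch
    # A name the LLM failed to return maps to None instead of raising.
    return {n: cache.get(n) for n in names}


calls: List[List[str]] = []


def fake_llm(batch: List[str]) -> Dict[str, str]:
    # Stand-in for a real model call; records each request for inspection.
    calls.append(list(batch))
    return {name: f"notes for {name}" for name in batch}


cache = {"Barolo": "cached notes"}
result = describe_wines(["Barolo", "Sancerre", "Chablis", "Sancerre"], fake_llm, cache)
print(len(calls), result["Chablis"])
```

Larger batches mean fewer round trips but a longer wait before anything can be shown, so the batch size is itself a speed-versus-quality knob.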