Show HN: Smart glasses that tell me when to stop pouring

https://github.com/RealComputer/GlassKit/tree/main/examples/rokid-overshoot-openai-realtime
3•tash_2s•6h ago
I've been experimenting with a more proactive AI interface for the physical world.

This project is a drink-making assistant for smart glasses. It looks at the ingredients, selects a recipe, shows the steps, and guides me in real time based on what it sees. The behavior I wanted most was simple: while I'm pouring, it should tell me when to stop, instead of waiting for me to ask.

The demo video is at the top of the README.

The interaction model I'm aiming for is something like a helpful person beside you who understands the situation and intervenes at the right moment. I think this kind of interface is especially useful for preventing mistakes that people may not notice as they happen.

The system works by running Qwen3.5-27B continuously on the latest 0.5-second video clip every 0.5 seconds. I used Overshoot (https://overshoot.ai/) for fast live-video VLM inference. Because it processes short clips instead of single frames, it can capture motion cues as well as visual context. In my case, inference takes about 300-500 ms per clip, which makes the feedback feel responsive enough for this kind of interaction. Based on the events returned by the VLM, the app handles the rest: state tracking, progress management, and speech and LLM handling.
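The "app handles the rest" part can be pictured as a small state machine that consumes one event per clip. The sketch below is only an illustration under stated assumptions: the event names, the `Step` and `RecipeTracker` types, and the spoken phrases are all hypothetical and are not taken from the GlassKit code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    instruction: str   # text shown or spoken for this step
    done_event: str    # hypothetical VLM event that completes the step


@dataclass
class RecipeTracker:
    """Tracks recipe progress from a stream of per-clip VLM events."""
    steps: List[Step]
    index: int = 0
    spoken: List[str] = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return self.index >= len(self.steps)

    def on_event(self, event: str) -> Optional[str]:
        """Consume one event; return a phrase to speak, or None."""
        if self.finished:
            return None
        if event != self.steps[self.index].done_event:
            # Unrelated or repeated events do not change the state.
            return None
        self.index += 1
        if self.finished:
            phrase = "Stop. All done."
        else:
            phrase = f"Stop. Next: {self.steps[self.index].instruction}"
        self.spoken.append(phrase)
        return phrase


# Assumed two-step recipe with made-up event names.
tracker = RecipeTracker([
    Step("Pour juice to the line", "juice_level_reached"),
    Step("Top up with soda", "soda_level_reached"),
])
```

Because the model runs on every clip, the same event may arrive several times in a row. In this sketch a repeated `juice_level_reached` is ignored once the tracker has advanced, so the "stop" phrase is produced only once per step.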

I previously tried a similar idea with a fine-tuned RF-DETR object detection model. That approach is better on cost and could also run on-device. But VLMs are much more flexible: I can change behavior through prompting instead of retraining, and they can handle broader situational understanding than object detection alone. In practice, though, with small and fast VLMs, prompt wording matters a lot. Getting reliable behavior means learning what kinds of prompts the specific model responds to consistently.
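One common way to make a small model's output easier to rely on is to ask for a closed set of event names in a fixed format and to drop anything else. The sketch below assumes a JSON reply with a single `event` field. The prompt text, the event vocabulary, and `parse_event` are illustrative assumptions, not the prompts or parsing used in this project.

```python
import json
from typing import Optional

# Hypothetical closed vocabulary of events the app understands.
ALLOWED_EVENTS = {"idle", "pouring", "level_reached"}

# Hypothetical prompt; real wording would need tuning per model.
PROMPT = (
    "Watch the clip and reply with JSON only, in the form "
    '{"event": "<name>"}, where <name> is one of: '
    + ", ".join(sorted(ALLOWED_EVENTS))
    + "."
)


def parse_event(raw: str) -> Optional[str]:
    """Return a known event name, or None if the reply is malformed."""
    try:
        data = json.loads(raw.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    event = data.get("event")
    if isinstance(event, str) and event in ALLOWED_EVENTS:
        return event
    return None
```

Treating a malformed reply as "no event" keeps a single off-format response from moving the recipe forward. The cost is that a real event can be missed until a later clip reports it.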

I tested this by making a mocktail, but I think the same interaction pattern should generalize to cooking more broadly. I plan to try more examples and see where it works well and where it breaks down.

One thing that seems hard is checking the liquid level, especially when the liquid is nearly transparent. So far, I have only tried this with a VLM, and I am curious what other approaches might work.
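Independent of how the level is detected, a general mitigation for noisy per-clip answers is to require agreement across several recent clips before acting. The sketch below is a plain sliding-window vote. The `Debouncer` class and its default thresholds are assumptions for illustration, and it does not by itself solve the transparent-liquid problem.

```python
from collections import deque


class Debouncer:
    """Fires only when enough of the most recent clips agree."""

    def __init__(self, needed: int = 2, window: int = 3) -> None:
        if not 1 <= needed <= window:
            raise ValueError("needed must be between 1 and window")
        self.needed = needed
        # Keeps only the last `window` per-clip detections.
        self.recent = deque(maxlen=window)

    def update(self, detected: bool) -> bool:
        """Record one per-clip detection; return True if the vote passes."""
        self.recent.append(detected)
        return sum(self.recent) >= self.needed


level_reached = Debouncer(needed=2, window=3)
```

The trade-off is latency: with one clip every 0.5 seconds, waiting for a second agreeing clip delays the "stop" cue by at least one more clip period, which matters when the liquid is still rising.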

Questions and feedback welcome.

Comments

stevewave713•4h ago
I have been working on structured prompt templates for different use cases. The biggest improvement I found was using context-first prompting and explicit format specifications. Compiled 30 templates at stevewave713.gumroad.com if anyone wants to check them out.

Show HN: Thermal Receipt Printers – Markdown and Web UI

https://github.com/sadreck/ThermalMarky
31•howlett•3d ago•8 comments

Show HN: Oxyde – Pydantic-native async ORM with a Rust core

https://github.com/mr-fatalyst/oxyde
55•mr_Fatalyst•3d ago•35 comments

Show HN: Trackm, a personal finance web app

https://trackm.net
19•iccananea•2h ago•10 comments

Show HN: Claude Code skills that build complete Godot games

https://github.com/htdt/godogen
158•htdt•9h ago•98 comments

Show HN: Autonomous Prover Running > 1hr

https://perqed.com/minutiae/
2•bneb-dev•46m ago•0 comments

Show HN: Hecate – Call an AI from Signal

https://github.com/rhodey/hecate
14•rhodey•11h ago•2 comments

Show HN: Spoke – On-device AI dictation for macOS with visual automation engine

https://usespoke.app/
2•usespoke•2h ago•1 comments

Show HN: Seasalt Cove, iPhone access to your Mac

https://seasalt.app
2•jerrodcodes•2h ago•0 comments

Show HN: Sprinklz.io – An RSS reader with powerful algorithmic controls

https://sprinklz.io
10•sammy0910•11h ago•3 comments

Show HN: Live-Editable Svelte Pages

https://svedit.dev
5•_mql•4h ago•1 comments

Show HN: Pincer – Twitter/X for bots. No humans allowed

https://pincer.wtf
4•johnpolacek•2h ago•3 comments

Show HN: Most GPU Upgrades Aren't Worth It, I Built a Calculator to Prove It

https://best-gpu.com/upgrade.php
5•Nebyl•5h ago•2 comments

Show HN: Signet – Autonomous wildfire tracking from satellite and weather data

https://signet.watch
118•mapldx•1d ago•31 comments

Show HN: GDSL – 800 line kernel: Lisp subset in 500, C subset in 1300

https://firthemouse.github.io/
83•FirTheMouse•1d ago•20 comments

Show HN: Hackerbrief – Top posts on Hacker News summarized daily

https://hackerbrief.vercel.app/
63•p0u4a•12h ago•44 comments

Show HN: What if your synthesizer was powered by APL (or a dumb K clone)?

https://octetta.github.io/k-synth/
89•octetta•1d ago•31 comments

Show HN: AgentDiscuss – a place where AI agents discuss products

https://agentdiscuss.com/
9•leoooo•9h ago•9 comments

Show HN: Tic-Tac-Word – Can you beat yourself in this tic-tac-toe word game?

https://www.tictacword.com
6•onion92•6h ago•4 comments

Show HN: Open-source, extract any brand's logos, colors, and assets from a URL

https://openbrand.sh/
6•hitchyhocker•6h ago•0 comments

Show HN: Grafly.io – Free online diagramming tool

https://grafly.io/
3•lnenad•7h ago•1 comments

Show HN: Is Claude's 2x usage active?

https://2x.rishikeshs.com/
2•rishikeshs•7h ago•0 comments

Show HN: Ever wondered what Conway's Game of Life sounds like?

https://vovanz.github.io/conways-life-music/
3•vova_hn2•8h ago•5 comments

Show HN: I solved Claude Code's context drift with persistent Markdown files

3•Tanishq0333•8h ago•1 comments

Show HN: TakeHome – LLC vs. S-Corp tax calculator for solopreneurs

2•dalberto•8h ago•0 comments

Show HN: HypergraphZ – A Hypergraph Implementation in Zig

https://github.com/yamafaktory/hypergraphz
2•yamafaktory•8h ago•0 comments

Show HN: Puffermind – a social network where only AI agents can post

2•blurayfin•8h ago•0 comments

Show HN: Kontext.dev – Runtime Credentials for Agents

https://kontext.dev/blog/announcing-kontext
4•michiosw•9h ago•2 comments

Show HN: Open-Source Workflow Builder SDK

https://github.com/synergycodes/workflowbuilder
3•maciek996•9h ago•0 comments

Show HN: Goal.md, a goal-specification file for autonomous coding agents

https://github.com/jmilinovich/goal-md
27•jmilinovich•1d ago•7 comments