Reminds me of when dumbphones were introduced and people said things like, "Why do I need to have a phone with me all the time?"
This is what cellphones looked like, back then: https://share.google/z3bBbfhT43EHcDYoc
Cellphones actually were quite small during the 1990s. I used to go to Japan, and they got downright tiny there during that period.
Smartphones actually brought the size back up (but not to 1980s scale).
Meta has a model just for isolating speech in noisy environments (the “live captioning feature”) and it seems that’s also the main feature of the Aircaps glasses. Translation is a relatively solved problem. The issue is isolating the conversation.
I’ve found Meta is pretty good about not overdelivering on promised features, and as a result, even though they probably have the best hardware and software stack of any glasses, the stuff you can do with the Ray-Ban Displays is extremely limited.
Bear in mind that simultaneous interpretation by humans (eg with a headset at a meeting of an international organisation) has been a thing for decades.
The crux of it for me:
- if it's not a person, it will be out of sync; you'll be stopping it every 10 seconds to get the translation. You could just as well use your phone, it would be the same, and there's a strong chance the media is already playing from there, so having the translation embedded would be an option.
- with a person, the other person needs to understand when your translation is going on and when it's over, so they know when to expect an answer or when they can go on. Having a phone in plain sight is actually great for that.
- the other person has no way to check whether your translation is completely out of whack. Most of the time they have some vague understanding, even if they can't really speak the language. Having the translation in the glasses removes any way for them to check.
There are a ton of smaller points, but all in all the barrier for a translation device to become magic and just work plugged into your ear or glasses is so high that I don't expect anything to beat a smartphone within my lifetime.
Aircaps demos show it to be pretty fast and almost real time. Meta's live captioning works really fast and is supposed to be able to pick out who is talking in a noisy environment by having you look at the person.
I think most of your issues are just a matter of the models improving and running faster. I've found translations tend not to be out of whack, but that's something that can't really be solved except by having better translation models. In the case of AirPods live translate, the app will show both people's text.
I see the real improvements being in the models. For IRL translation, I just think phones are already very good at this, and improving from there will be exponentially difficult.
IMHO it's the same for "bots" intervening in meetings (commenting/reacting on exchanges, etc.). Interfacing multiple humans in the same scene is always a delicate problem.
However you are limited in what you can do.
There are no speakers, which they pitch as a "simpler, quieter interface". That's great, but it means that _all_ your interactions are visual, even if they don't need to be.
I'm also not sure about the microphone setup; if you're doing a voice assistant, you need beamforming/steering (rough sketch of the idea below).
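For context, the simplest version of that, delay-and-sum, is only a few lines. This is just a toy numpy sketch (the array geometry, look direction, and everything else here are made up, not what any of these glasses actually run):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s

    def delay_and_sum(channels, mic_positions, look_direction, sample_rate):
        # channels: (n_mics, n_samples) sample-synchronous recordings
        # mic_positions: (n_mics, 3) coordinates in metres
        # look_direction: unit vector from the array toward the talker
        d = np.asarray(look_direction, dtype=float)
        d /= np.linalg.norm(d)
        # Mics further from the talker hear the plane wave later.
        lags = -(mic_positions @ d) / SPEED_OF_SOUND
        lags -= lags.min()
        shifts = np.round(lags * sample_rate).astype(int)
        n = channels.shape[1] - shifts.max()
        # Advance each channel by its lag: the look direction adds up
        # coherently while off-axis noise adds incoherently.
        aligned = np.stack([ch[s:s + n] for ch, s in zip(channels, shifts)])
        return aligned.mean(axis=0)

Real products layer adaptive filtering and echo cancellation on top, but that's the basic idea of "steering" a mic array toward whoever you're looking at.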
However, the online context in "conversate" mode is quite nice; I wonder how useful it is. They hint at proper context control ("we can remember your previous conversations"), but that's a largely unsolved problem on large machines, let alone on-device.
For people who are prone to motion sickness, it's also really useful to have it tied to the global frame. (I don't have that, fortunately.)
Afaiu, the dashboard is positioned above you, so you have to tilt your head up to see it and it shouldn’t obstruct anything important in regular life.
It's a cool form factor, but the built-in transcription, AI, etc. are not very well implemented, and I cannot imagine a user viewing this as essential rather than as a novelty gadget.
Not really. You can build your own apps [1].
- we all get to use free LLMs
- they'll learn to do it properly next time
So it's a win-win situation
So no.
I guess they could use a common “generic” form factor, that would allow prescription lenses to be ordered.
That said, this is really the ideal form factor (think Dale in Questionable Content), and it's what all headsets probably want to be when they grow up.
Mind you, I grew up in the handful-of-index-cards-and-memorise-the-damn-speech era.
What matters more is how they support different eye-distances (interpupillary distance, IPD).
For instance, the teleprompter is terrible and buggy when it tries to follow along based on voice. A simple clicker for moving forward in a text file would be better than how it currently works.
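To make that concrete, the "clicker" version is about ten lines; this is a hypothetical stand-in, not anything the G1 actually exposes:

    import sys

    def teleprompt(path, lines_per_page=3):
        # Step through a plain text file a few lines at a time,
        # advancing on a keypress instead of following the voice.
        with open(path, encoding="utf-8") as f:
            lines = [ln.rstrip() for ln in f if ln.strip()]
        for i in range(0, len(lines), lines_per_page):
            print("\n".join(lines[i:i + lines_per_page]))
            input("-- press Enter (the 'clicker') to advance --")

    if __name__ == "__main__":
        teleprompt(sys.argv[1])

Mapping the keypress to a Bluetooth remote or a tap on the temple is the only hard part, and it's exactly the kind of simple, deterministic behaviour the voice-following mode doesn't give you.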
How many people say they lost interest due to ocular issues versus complaints that it’s just not useful?
Seriously. A simple file browser with support for text files only would be more useful than the finicky G1 apps.
Of course visual issues could occur for someone, but it's so aggravating that they can't just put in some sort of proper customization for the content.
If it were just a heads-up display for Android, like the Xreal, but low-power and wireless, that might be cool for when I'm driving. But everyone wants to make AI glasses locked into their own ecosystem. Everyone wants to displace the smartphone, from the Rabbit R1 to the new Ray-Bans. It's impossible.
In the end this tech will all get democratized and open sourced anyways, so I have to hand it to Meta and others for throwing money around and doing all this free R&D for the greater good.
Now we're going to see people's eyes moving around like crazy.
Nope, just people in my regular life, colleagues, etc. having that little twitch in their eyes.
ktallett•2mo ago
geraldwhen•2mo ago
stavros•2mo ago
volemo•2mo ago
https://github.com/even-realities
ktallett•2mo ago
volemo•2mo ago
> We have now released a demo source code to show how you can build your own application (iOS/Android) to interact with G1 glasses.
> More interfaces and features will also be open soon to give you better control of the hardware.
With a link to the demo app just below that, along with a detailed explanation of the protocol and the currently available features.
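Not the official demo, but to give a sense of how approachable it is: enumerating the glasses' BLE services from a laptop is a few lines with Python's bleak (the "G1" name filter is an assumption; the actual advertising names and GATT characteristics are documented in the repo above):

    import asyncio
    from bleak import BleakScanner, BleakClient

    async def main():
        devices = await BleakScanner.discover(timeout=5.0)
        glasses = next((d for d in devices if d.name and "G1" in d.name), None)
        if glasses is None:
            print("no G1-looking device found")
            return
        async with BleakClient(glasses) as client:
            # Dump every service/characteristic the glasses expose.
            for service in client.services:
                print(service.uuid)
                for char in service.characteristics:
                    print(" ", char.uuid, char.properties)

    asyncio.run(main())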
ktallett•2mo ago
j-bos•2mo ago
Thanks this was key in deciding whether to consider this brand at all.
KurSix•2mo ago