Alterego: Thought to Text

https://www.alterego.io/

193•oldfuture•5mo ago

Comments

oldfuture•5mo ago

learn more on:

https://www.media.mit.edu/projects/alterego/overview/

adding also their press release here:

https://docsend.com/view/dmda8mqzhcvqrkrk/d/fjr4nnmzf9jnjzgw

lukebechtel•5mo ago

been waiting for something like this. Looking forward to adoption!

pedalpete•5mo ago

I'd love to get a better understanding of the technology this is built with (without sitting through an exceedingly long video).

I suspect it's EMG though muscles in the ear and jaw bone, but that seems too rudimentary.

The TED talk describes a system which includes sensors on the chin across the jaw bone, but the demo obviously has removed that sensor.

fxwin•5mo ago

i think this is what you're looking for: https://www.media.mit.edu/projects/alterego/publications/

jackthetab•5mo ago

Thirteen minutes is an "exceedingly long video"?! Man, I thought I was jaded complaining about 20 minute videos! :-)

I want to know is what are the connected to? A laptop? A AS400? An old Cray they have lying around? I'd think doing the demo while walking would have been de riguer.

Anyway, tres cool!

esafak•5mo ago

These guys were not born when Crays roamed the earth.

balamatom•5mo ago

Their investor had one in the garage that they didn't know what to do with

ilaksh•5mo ago

Maybe they have combined an LLM or something with the speech detection convolution layers or whatever they were doing. Like with JSON schemas constraining the set of available tokens that are used for structured outputs. Except the set of tokens comes from the top 3-5 words that their first analysis/network decided are the most likely. So with that smarter system they can get by with fewer electrodes in a smaller area at the base of the skull where cranial nerves for the face and tongue emerge from the brainstem.

zknowledge•5mo ago

either this is the world's biggest grift OR the 2nd greatest product of the 21st century... so far.

Theodores•5mo ago

The presentation of this product reminds me of peak crypto when a 'white paper' and a two-page website was all anyone needed to get bamboozled into handing their money over.

Dilettante_•5mo ago

I want to believe so bad that I can finally get rid of my keyboard.

synapsomorphy•5mo ago

The accuracy is going to be the real make or break for this. In a paper from 2018 they reported 92% word accuracy [1]. That's a lifetime ago for ML but they were also using five facial electrodes where now it looks confined to around the ears. If the accuracy was great today they would report it. In actual use I can see even 99% being pretty annoying and 95% being almost unusable (for people who can speak normally).

[1] https://www.media.mit.edu/publications/alterego-IUI/

ivape•5mo ago

Why do you say that? I often vocalize near giberrish and the LLM fixes it for me and mostly gets what I meant.

thatxliner•5mo ago

Whisper-level transcription accuracy should be sufficient

blixt•5mo ago

I found it interesting that in the segment where two people were communicating "telepathically", they seem to be producing text, which is then put through text-to-speech (using what appeared to be a voice trained on their own -- nice touch).

I have to wonder, if they have enough signal to produce what essentially looks like speech-to-text (without the speech), wouldn't it be possible to use the exact same signal to directly produce the synthesized speech? It could also lower latency further by not needing extra surrounding context for the text to be pronounced correctly.

akdor1154•5mo ago

From memory, i think other recent research is along this approach, but not yet good enough. Cant remember where I read this but was likely HN. I think the posted paper got 95% accuracy when picking from a known set of target sentences/words, but far less (60%?) when used for freeform input.

I'm sure that's not the last word though!

stevage•5mo ago

Interesting, I remember reading a sci-fi book a long time ago with almost exactly this same method, which they called "sub-vocalisation".

(I think it was https://en.wikipedia.org/wiki/Oath_of_Fealty_%28novel%29 but can't find enough details to confirm.)

goopypoop•5mo ago

Speaker For The Dead - Orson Scott Card

com2kid•5mo ago

> they seem to be producing text, which is then put through text-to-speech (using what appeared to be a voice trained on their own -- nice touch).

This is an LLM model thing. Plenty of open source (or at least MIT licensed) LLMs and TTS models exist that translate and can be zero shot trained on a user's speech. Direct audio to audio models tend to be less researched and less advanced than the corresponding (but higher latency) audio to text to audio pipelines.

That said you can get audio->text->audio down to 400ms or so latency if you are really damn good at it.

stevage•5mo ago

The great thing about a product like this is that it's so easy to fake it in video.

I don't really buy that typing speed is a bottleneck for most people. We can't actually think all that fast. And I suspect AI is doing a lot of filling in the gaps here.

It might have some niche use cases, like being able to use your phone while cycling.

Bjartr•5mo ago

Personal anecdote: I do find typing to be a bottleneck in situations where typing speed is valuable (so notes in meetings, not when coding).

I can break 100wpm, especially if I accept typos. It's still much, much slower to type than I can think.

j45•5mo ago

Speech to text can be 130-200 wpm.

Also, keybr.com helps speed up typing if you were thinking about it.

robofanatic•5mo ago

> notes in meetings

That’s already solved by AI, if you let AI listen to your meetings.

Feathercrown•5mo ago

I haven't found that to be very accurate. I suspect the internal idiosyncrasies of a company are an issue, as the AI doesn't have the necessary context.

aeroaero•5mo ago

Seems like it would be much easier to solve that problem than it would be to cross the brain barrier and start interfacing with our thoughts, no? Just provide some context on the company etc

littlestymaar•5mo ago

“Sounds like it would” yes, but on practice no off the self solution works remotely well enough.

> Just provide some context on the company etc

The necessary “context” includes at least the name and pronunciation of the names of all workers of a company with a non English first name, so it's far from trivial.

comedowntous•5mo ago

> off the self

Was that deliberate, or a typo? I am genuinely wondering!

littlestymaar•5mo ago

It's a typo.

Bjartr•5mo ago

Not when I want my notes to contain my own thoughts or reminders to myself. That's only in my head and today I either have to miss out on what is being said next to type it out, speak up in that moment (even if not really on topic), or lose the thought entirely.

stevage•5mo ago

My experience with taking notes in meetings is definitely that the brain is the bottleneck, not the fingers. There are times where I literally just type what the person is saying, dictation style (ie, recording a client's exact words, often helpful for reference, even later in the meeting). I can usually keep up. But if I'm trying to formulate original thoughts, or synthesise what I've heard, or find a way to summarise what they have been saying - that's where I fall behind, even though the total number of words I need to write is actually much smaller.

So this definitely wouldn't help me here. Realistically though, there ought to be better solutions like something that just listens to the meeting and automatically takes notes.

nxobject•5mo ago

If you want something free, available right now, and dependent on only an IME, have you considered learning a stenotyping/chording keyboard layout?

https://www.openstenoproject.org/plover/

Bjartr•5mo ago

Yes, but having the discipline to practice enough to get faster than my normal typing speed has been an obstacle.

w00ds•5mo ago

It's possible the demo is faked, and I'm skeptical. But I also don't think the speed is really the point of a device like this. Getting out a device, pressing keys or tapping on it, and putting it away again, those attentional costs of using some device... I know something like basic notetaking would feel really different to me if I was able to just do the thing in the demo at high accuracy instead. That's a big if, though - the accuracy would have to be high for it to really be useful, and the video is probably best-case clips.

dllthomas•5mo ago

Typing speed is very much a bottleneck when I'm washing dishes, at least.

lennxa•5mo ago

talk then

dllthomas•5mo ago

Not ideal, between sound from the washing itself plus whatever else is going on in the house, and I'd rather not add noise that others have to deal with. That said, it's certainly in the mix as competing with other potential solutions.

prerok•5mo ago

How about when baking?

https://xkcd.com/341/

dllthomas•5mo ago

Then, too. I cannot compete with Mrs Roberts.

com2kid•5mo ago

Pulling out my phone, unlocking it, opening my notes app, creating a new note, that is a bottleneck.

Puling out my phone, unlocking it, remembering what the hotkey is today for starting google/gemini, is a bottle neck. Damned if I can remember what random gesture lets me ask Gemini to take a note today (presumably gemini has notes support now, IIRC the original release didn't).

Finding where Google stashes todo items at, also a bottle neck. Of course that entails me getting my phone out and navigating to whatever notes app (for awhile todos/notes were inside a separate Google search app!) they are shoved into.

My Palm Pilot from 2000 had more usability than a modern smartphone.

This device can solve all of those issues.

stevage•5mo ago

>This device can solve all of those issues.

If you're wearing it at the right moment.

noduerme•5mo ago

Plausible deniability is the ticket. I see the killer app here being communication with people in comas. Or corpses.

jonwinstanley•5mo ago

Or dogs

jimkleiber•5mo ago

Most people i think type very slowly on computers and i believe type even more slowly on phones. I've had many many people remark at how fast i type on both of those platforms and it still confuses me, as i think it's so easy for me to overlook how slowly people type.

az226•5mo ago

You’re also underestimating that some people think really quickly. Way faster than you can type.

jimkleiber•5mo ago

I also think i think very quickly and faster than i can type :-D

emsign•5mo ago

I fear that if AI filling the gapsis competing with your own thoughts while using it, your lazy brain goes the easy way and accepts the insinuations of themachine as its own. Beating the entire purpose of having a gizmo that writes down "your very own" thoughts.

That is if this exists. But if it does, it's not what you think it does for you.

CrociDB•5mo ago

Yeah, good point. Especially because most of the time our thoughts are a lot more fuzzy than we might think. Having it put words to it will feel like it's "augmenting" or "clarifying" our thoughts, but in reality it might be just controlling it.

abirch•5mo ago

I'm wondering if my Alterego usage will justify my ADHD meds.

garyfirestorm•5mo ago

in conversations people do finish your sentences (sometimes) which i personally find annoying if they especially finish them incorrectly and i have to take a pause and correct them - but it doesn't force me to accept their words as my thoughts...

az226•5mo ago

Typing is a big bottleneck. Speak for yourself

pj_mukh•5mo ago

I think the idea here is a very different mode of programming. Less prompting, waiting, seeing results, prompting again (where typing is not the bottleneck)

But more having a conversation with a really fast coding agent. That should feel like you’re micro-managing an intern as they code really fast, you could start describing the problem and it could start coding and you interject and tell it to do do things differently. There the bottleneck would be typing, especially if you have fast inference. But with voice now you’re coding at the speed of your thoughts.

I think doing that would be super cool but awkward if you’re talking out loud in an office, that’s where this device would come in.

oldfuture•5mo ago

learn more on: https://www.media.mit.edu/projects/alterego/overview/

also adding their press release here:

https://docsend.com/view/dmda8mqzhcvqrkrk/d/fjr4nnmzf9jnjzgw

throwawaymaths•5mo ago

> while cycling

depends on what they are connected to in the back there.

mickdarling•5mo ago

I agree that it's an easy to fake demo and at the same time, if they're going to fake it, why make it seem so slow?

As to whether typing speed is a bottleneck for most people, maybe not most people, but definitely some people, and it's a massive bottleneck for me personally.

I think better when I'm talking and since I have started using speech to text, it has increased my writing speed and coding speed by at least an order, maybe two orders of magnitude.

But you are right, the AI filling in gaps can really cause trouble using speech, goodness knows what it's doing using sub-speech.

stevage•5mo ago

>I agree that it's an easy to fake demo and at the same time, if they're going to fake it, why make it seem so slow?

Honestly I have no idea if it's fake. I wouldn't be surprised if it's both fake and real: the actual video is entirely fake, but a reasonably accurate demonstration of actual capabilities (like a lot of tech demos at live events...)

legacynl•5mo ago

> We can't actually think all that fast.

One of the major ways you can speed up reading, is that you stop 'vocalizing' each word in your head. It does seem that thinking is much faster than 'thinking aloud' (in your head)

lordofgibbons•5mo ago

Is the video hosted anywhere else? It seems to be the only source of info, but it's being played at like 0.5X speed.

oldfuture•5mo ago

more here:

https://www.media.mit.edu/projects/alterego/overview/

check also the publications tab, and this pr:

https://docsend.com/view/dmda8mqzhcvqrkrk/d/fjr4nnmzf9jnjzgw

tibbon•5mo ago

So... Sub-Etha?

bigolnik•5mo ago

So this a bone conducting microphone? That operates at the speed of speech? While you sit around awkwardly, hoping no one talks to you? This isn't thought. This is you saying to yourself quite clearly what you would like it to hear.

reassess_blind•5mo ago

It doesn't look like he's speaking.

socalgal2•5mo ago

I just imagine this going really wrong. My chain of thought would be something like: "Let's see, I need to rotate this image so I need to loop over rows then columns, .. gawd fuck this code base is shit designed, there are no units on these fields, this could be so much cleaner, ... for each row ... I wonder what's for lunch today? I hope it's good ... for each column ... Dang that response on HN really pissed me off, I'd better go check it ... read pixel from source ... tonight I meeting up with a friend, I'd better remember to confirm, ... write pixel to dest ...."

desireco42•5mo ago

What I picked up from this vision of the future... we will have mind reading devices to capture out thoughts, but we will still be on a train and commuting to work... dang...

So they came up with this groundbreaking idea but couldn't come up with better use case then typing on a train.

Look, I can't but not appreciate that at least they are doing something interesting as opposed to vibe one shot fork of vs code things that we see.

g42gregory•5mo ago

Are these are friends of Elizabeth Holmes? :-)

soulofmischief•5mo ago

> We currently have a working prototype that, after training with user-specific example data, demonstrates over 90% accuracy on an application-specific vocabulary. The system is currently user-dependent and requires individual training. We are currently on working on iterations that would not require any personalization.

https://www.media.mit.edu/projects/alterego/frequently-asked...

andymatuschak•5mo ago

That text was written about the Media Lab-era prototype in 2019: https://web.archive.org/web/20190102110930/https://www.media...

I wonder how far they've gotten past it.

oliwary•5mo ago

The key to technology like this is how quickly it can get to say 99.5%. Not convinced it will be an easy path!

djhn•5mo ago

Even speech to text has a long way to go to reach 99.5%!

olejorgenb•5mo ago

"AlterEgo reads information from the peripheral somatic system through internal speech movements, rather than directly from the brain. It detects the signals users send to their mouth and vocal cords when deliberately, but silently, voicing words. "

taneq•5mo ago

So it's a Thalmic Myo for your trachea?

andsoitis•5mo ago

They don’t have something that anyone can try out and it also seems no public demonstrations of early prototypes.

Seems like vaporware.

desireco42•5mo ago

Reminds of the song Sound of Silence...

Briannaj•5mo ago

This is literally only as fast as text to speech. the only difference is that you don't have to speak aloud. Which is cool. But for using a computer its still annoying and worse than a mouse because with a mouse you can click or drag and place in a second, in this format you have to think "move the box from point A to point B (with coordinates or a description) etc etc".

I think its cool, I've been brainstorming how a good MCI would work for a while and didn't think of this. I think its a great novel approach that will probably be expanded on soon.

Briannaj•5mo ago

what it could be really cool for is stuff like "open my house door", "Turn off the lights", "text so and so", "Start my car" Stuff we want to do without pulling out our phone that doesn't require a lot of detailed instruction.

stevage•5mo ago

I must be such a rarity around here, but if I could improve a hundred things about my life, none of those would make the list. Well, possibly the third one - more convenient ways to text people.

I guess I also kind of enjoy the physical sensations of putting a key in a lock, opening the door etc. Definitely don't want a digital-only existence.

com2kid•5mo ago

> But for using a computer its still annoying and worse than a mouse because with a mouse you can click or drag and place in a second, in this format you have to think "move the box from point A to point B (with coordinates or a description) etc etc".

You wouldn't use a regular WIMP[1] paradigm with this, that completely defeats the advantages you have. You don't need to have a giant window full of icons and other clickable/tappable UI elements, that becomes pointless now.

[1]https://en.wikipedia.org/wiki/WIMP_(computing)

gcanyon•5mo ago

For those thinking about speed: an average human talks anywhere from 120-240 words per minute. An average human who touch types is probably 1/3 to 1/2 as fast as that, while an average human on a phone probably types 1/5 as fast as that.

But for me speed isn't even the issue. I can dictate to Siri at near-regular-speech speeds -- and then spend another 200% of the time that took to fix what it got wrong. I have reasonable diction and enunciation, and speech to text is just that bad while walking down the street. If this is as accurate as they're showing, it would be worth it just for the accuracy.

keleftheriou•5mo ago

I agree, but I think LLM-based voice input is a lot better. I’m using OpenAI’s realtime API for my Apple Watch app, and it does wonders, even editing can be as simple as “add a heart emoji at the end”, and it just works.

https://x.com/keleftheriou/status/1963399069646426341

deckar01•5mo ago

You can tell it’s fake, because it’s hard wired and super low profile, yet isn’t covered in LEDs.

laurieg•5mo ago

I'm a huge smart speaker user. I have one in every room. But as soon as guests come over I stop using them. I would never use Siri etc in public.

Going from voice input to silent voice input is a huge step forward for UX.

keleftheriou•5mo ago

I get the sentiment, but can you elaborate on why that is the case for you?

laurieg•5mo ago

It is instantly awkward. If you are there with one other person and you start talking, they assume you are talking to them.

baroninthetrees•5mo ago

As someone with ADD and a lot of crosstalk in my "inner voice", I can't imagine this could make any sense of what I intending, let alone one thing. Definitely a lot of use cases if it isn't vaporware.

paulbjensen•5mo ago

I wonder if they've considered testing it with people who have locked-in syndrome or Motor-Neurone disease. It could be an amazing tool for them.

whymauri•5mo ago

The Harvard BIONICS lab is working on neuroprostheses for different forms of paralysis, like intestinal paralysis. They're great.

deadbabe•5mo ago

Great. I can imagine fucked up charlatans putting stuff like this on brain dead patients and convincing their family the person is still able to communicate with the help of AI.

goopypoop•5mo ago

"spirit box" don't need Al

com2kid•5mo ago

I am surprised no one here has noted that a device like this almost completely negates the need for literacy. That is huge. Right now people still need to interact with written words, both typing and reading. Realistically a quiet vocal based input device like this could have a UX built around it that does not require users to be literate at all.

jussaying2•5mo ago

Not to mention the support it brings for people with disabilities! (speech, hands/fingers)

aDyslecticCrow•5mo ago

Amazing for paralysis and other sevear physical disabilities. Similar tech is already widely researched for years.

But I'm sceptical about this specific company with the lack of technical details.

aDyslecticCrow•5mo ago

How convenient! Literacy has always been a thorn to efficient society, as books too easily spread dangerous heretical propaganda. Now we can directly filter the quality of information and increase cultural unification. /j

com2kid•5mo ago

That is my fear yeah, a continued dumbing down of society.

Literacy rates in the US are already garbage, this device may just make it worse. If people never have to read or write, why would they bother learning how?

giveita•5mo ago

Hey Google make a note to pack my hiking boots. Done.

com2kid•5mo ago

Hey Google stopped working reliably when the move to Gemini started. It either doesn't trigger, or it has fewer features. It sure as heck isn't accessible all the time whenever I am close to my phone anymore, and I'm on a Pixel 9.

Dilettante_•5mo ago

Having a voice narrate one thing at a time to you does not have the same informational bandwidth as written content, labeled buttons, etc on a page/screen. Not even close.

com2kid•5mo ago

Video has far less information density than a blog post. Heck 2 hour long podcasts have an information density of damn near zero. Yet those are the dominate form of online content consumption now.

I agree for productivity use cases it isn't suitable, but that is only important for information workers, which are not the largest segment of society. The fact is, Gen Z and Alpha are already having issues using the high information density desktop paradigm, and technology like this will only work to further erode the needed capabilities of the average citizen. Doesn't bode well for democracy and all that.

bromanko•5mo ago

I think the killer app is doing video calls in coffee shops without disturbing my neighbors.

vunderba•5mo ago

I can 100% guarantee that I lack the Mentat level of discipline necessary to prevent my true inner thoughts from leaking out in the middle of a conference. I'd also be the first to inadvertently summon the giant squid from Sphere.

dwa3592•5mo ago

Y’all are missing a few key points.

- There is a ML model which was trained on 31 hours of silently spoken text. That’s the training data. You still need to know the red fruit in front of you is called apple bc that’s what the model is trained on. So you must be literate to get this working.

- The accuracy in the paper is on a very small text type, numerals. As much as I could understand, they asked users to do mathematical operations and they checked the accuracy on that. Someone with a deeper understanding please correct me.

- Most of the video demo(honestly) is meh, once you have the text input for a LLM, you are limited to what the LLM can do. The real deal is the ml model that translates the neuromuscular signals to actual words. Those signals must be super noisy. So training a model with only 31 hours of data is a bit surprising and impressive. But the model would probably require calibration for each user’s silent voice, like say this sentence silently , “a quick brown fox jumped over the rope”. I think this will be cool.

- I really hope this tech works. I really really hope they don’t sell to big tech jerks like Meta. I really really really hope this tech removes screens from our lives(or at least a step in the right direction).

dwa3592•5mo ago

I should have started with- “Congratulations. Very cool tech if works”.

crooked-v•5mo ago

> So you must be literate to get this working.

Literacy is about written text, not spoken words. I think you've confused it with fluency.

deepanwadhwa•5mo ago

The apple example seems off, but literacy is still going to be a requirement for this kind of thing to work. First, you need to be able to understand the language(grammar, vocabulary etc) to communicate with this device. The guy in the demo is literally thinking in English, Mandarin. I'd actually argue that only highly literate people will be able to use this device really well. This device is not reading thoughts, it's inferring the silently spoken words.

crooked-v•5mo ago

You're still confusing literacy with fluency. Literacy is not going to be a requirement, because the technology doesn't require reading or writing, and being literate is literally defined as being able to read and write.

vunderba•5mo ago

From the article:

> Alterego only responds to intentional, silent speech.

What exactly do they mean by this? Some kind of equivalent to subvocalization [1]?

[1] https://en.wikipedia.org/wiki/Subvocalization

hyperadvanced•5mo ago

Oh god we’re about to have the “I don’t have an inner monologue” debate again, aren’t we?

balamatom•5mo ago

I got a whole inner panel discussion!

ipsum2•5mo ago

Yes. The paper the company is based on uses EMG (muscle movements) to convert into text.

dinfinity•5mo ago

If you look at his facial movements in the video it looks as if he is pretty actively using his facial muscles, 'trying' to speak while moving as little as possible (which would cause the clearest signals to be emitted).

If that is what is happening, to me it feels like harder work than just speaking (similar to how singing softly but accurately can be very hard work). It would still be pretty cool, but only practical in use cases where you have to be silent and only for short periods of usage.

boznz•5mo ago

Spent all last year writing a techno-thriller about mind-reading, I'm sure this is about as factual, and, of course nothing nefarious could possibly happen if this ever became real.

deekshith13•5mo ago

You probably thought about some nefarious stuff that could happen. Mind to share some interesting ones?

balamatom•5mo ago

Well, for starters, there's the one where social consensus decides to define whether a subvocalization is "intentional" by whether the interface responded to it.

wcrossbow•5mo ago

This is the stuff nightmares are made of. We already live in a you have nothing to hide society. Now imagine one where mega corps and the government have access to every thought you have. No worries, you got nothing to hide right? What would that do to our thought process and how we articulate our inner selfs? What do we allow ourselves to even think? At some point it will not even matter because we will have trained ourselves to suppress any deviant thought. I'd rather not keep on going because the ramifications of this technology make me truly sick in the stomach.

ipnon•5mo ago

In the Ghost in the Shell universe, I always thought the telepathic conversations from cyberbrain to cyberbrain was one of the most fantastic and least realistic predictions for the future. But I was clearly wrong. We already have rudimentary telepathy 10 years ahead of schedule!

p1dda•5mo ago

Brilliant idea to capture the neuromuscular signals and translate it to text!

kittikitti•5mo ago

So if I unintentionally think of a thought crime, is it still illegal? I wonder how governments will use this to "nudge" their citizens. I guess if you have nothing to hide, then there are no issues at all.

croes•5mo ago

Why did the note sync about the hiking boots lost the "I have to"?

moezd•5mo ago

Great, now if I find myself in a weird dialogue and murmur under my breath this company can store exactly what I called those people. They can also sell that data to whoever pays highest. Tremendous job you guys, as if ad industry wasn't annoying and intrusive as they are currently!

giveita•5mo ago

Very impressive but use case is narrow. You are on a train and don't have time to make a note, ok. But most of the time we are in places reasonably private where speech recognition is fine and would be convenient maybe more so learning to say without saying.

As a disability speech aid though maybe it would be amazing?

PMunch•5mo ago

Speech for the disabled would indeed be great, as long as the disability doesn't also affect the system which you use to "silent speak".

As for the privacy thing, I would say that I absolutely hate talking out loud to my devices. Just the idea of talking my ideas into a recorder in my own office where nobody can hear me feels very strange to me. But I love thinking through ideas and writing scripts for speeches or presentations in my mind, or to plan out some code or overall project. A device like this would allow me to do the internal monologue thing, then turn to "silent speak" them into this device to take notes which sounds great. And the form-factor doesn't look that dissimilar to a set a bone-conduction headsets which would be perfect for privacy-aware feedback while allowing you to take in your surroundings.

With this tech demo though it seems like the transmission rate is veeery slow, he sits still in his chair staring into the room and a short sentence is all that appears. Not exactly speed of thought..

And of course there is the cable running off to who knows what kind of computational resources.

The AI parts of this are less exciting to me, but as an input device I'm really on-board with the idea.

colinwilyb•5mo ago

There are times when I'm masked (ie: 3M respirator) and gloved at work for 10+ hours at a time. I wonder if speech recognition would be possible without breaking the face seal during silent speech. This could be beneficial for other types of work in hazardous environments.

(My current solution is to tear the fingertip off my offhand glove so I can unlock and use my device....)

runxel•5mo ago

I've seen tech like this on display at the IFA already 15 years ago. Forgot the name, but it was pretty hyped at that time. You could even demo it live. Sure, it was only to steer a sidescrolling video game character, but it worked great with as little training of half a minute or so.

Anyhow, Alterego just seems like another vaporware product, that will never enter or even begin to penetrate the overall market. But let's see!

kordlessagain•5mo ago

Makes sense there is so much ego on that page.

hm-nah•5mo ago

I’ve thought about a future where all audio is recorded (public, home, work, etc.). If this thing is real, it would allow comms in this dystopian vision. Boo

darepublic•5mo ago

You gotta use it to control your AR hud while out in public

Tiereven•5mo ago

Integrating AlterEgo with the next generation of AR glasses could be the next generation of technology after these electronic bricks we carry in our pockets. My biggest frustration with wake-word assistants is that voice is inherently a broadcast channel.

There’s endless comedy about the confusion on a bus when someone's talking into Bluetooth and their neighbor thinks they’re being addressed. Silent Sense + AR gets your eyes up and around you, fixes posture, frees your hands and keeps the guy next to you out of the conversation.

alexoberneyer•5mo ago

Nice, then I can finally vibe code in a crowded cafe or Coworking space

bitwize•5mo ago

Big if real. This is the kind of "AI assisted coding tool" I could find myself using, one where I can think code and commands into the machine. Make it open source and local, and I'm sold!

qingcharles•5mo ago

This video weirds me out because the presenter never looks at the camera. Odd directing choice.

Tiny C Compiler

You Are Here

SectorC: A C Compiler in 512 bytes

The F Word

Brookhaven Lab's RHIC concludes 25-year run with final collisions

Speed up responses with fast mode

LLMs as the new high level language

Software factories and the agentic moment

Hoot: Scheme on WebAssembly

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

GitBlack: Tracing America's Foundation

Stories from 25 Years of Software Development

FDA intends to take action against non-FDA-approved GLP-1 drugs

Show HN: A luma dependent chroma compression algorithm (image compression)

Al Lowe on model trains, funny deaths and working with Disney

First Proof

Vocal Guide – belt sing without killing yourself

I write games in C (yes, C) (2016)

Start all of your commands with a comma (2009)

Italy Railways Sabotaged

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

Reinforcement Learning from Human Feedback

Selection rather than prediction

The AI boom is causing shortages everywhere else

72M Points of Interest

A Fresh Look at IBM 3270 Information Display System

Coding agents have replaced every framework I used

Unseen Footage of Atari Battlezone Arcade Cabinet Production

France's homegrown open source online office suite

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

Tiny C Compiler

You Are Here

SectorC: A C Compiler in 512 bytes

The F Word

Brookhaven Lab's RHIC concludes 25-year run with final collisions

Speed up responses with fast mode

LLMs as the new high level language

Software factories and the agentic moment

Hoot: Scheme on WebAssembly

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

GitBlack: Tracing America's Foundation

Stories from 25 Years of Software Development

FDA intends to take action against non-FDA-approved GLP-1 drugs

Show HN: A luma dependent chroma compression algorithm (image compression)

Al Lowe on model trains, funny deaths and working with Disney

First Proof

Vocal Guide – belt sing without killing yourself

I write games in C (yes, C) (2016)

Start all of your commands with a comma (2009)

Italy Railways Sabotaged

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

Reinforcement Learning from Human Feedback

Selection rather than prediction

The AI boom is causing shortages everywhere else

72M Points of Interest

A Fresh Look at IBM 3270 Information Display System

Coding agents have replaced every framework I used

Unseen Footage of Atari Battlezone Arcade Cabinet Production

France's homegrown open source online office suite

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

Alterego: Thought to Text

Comments