Since I joined, we've gone from <1k hours to >10k hours, and I've been really excited by how much our whole setup has changed. I've been implementing lots of improvements across the data pipeline and the operations side. Now that we train lots of models on the data, the model results also inform how we collect it (e.g. we care a lot less about noise now that we have more data).
We're definitely still improving the whole system, but at this point we've learned a lot that I wish someone had told us when we started, so we thought we'd share it in case any of you are doing human data collection. We're also very curious to hear any feedback from the community!
“the room seemed colder” -> “there was a breeze, even a gentle gust”
That said, the way to 10-20x data collection would be to open a couple of other data collection centers outside SF, in high-population cities. Right now there's a big advantage to having data collection totally in-house: it's much easier to debug and improve while we're this small. But now that we've mostly worked out the process, it should be very straightforward to replicate the entire ops/data pipeline across 3-4 parallel data collection centers.
A couple of questions: What's the relationship between the number of hours of neurodata you collect and the quality of your predictions? Does it help to get less data from more people, or more data from fewer people?
For a given amount of data, is it better to have more people with less data per person or fewer people with more data per person?
For a given amount of data, whether you want more or less data per person really depends on what you're trying to do. What we want is zero-shot performance, i.e. for it to decode well on people who have zero hours in the train set, and for that we want more people with less data per person. If instead we wanted it to do as well as possible on one individual, we'd want way more data from that one person. (So, e.g., when we first turn this into a product, we'll probably fine-tune on each user for a while.)
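To be concrete about what zero-shot means here: the eval split is by person, not by time. A minimal sketch of that kind of split (shapes and numbers are made up, this isn't our actual pipeline):

    import numpy as np
    from sklearn.model_selection import GroupShuffleSplit

    # Hypothetical layout: X = per-window neural features, y = target
    # token ids, groups = which participant each window came from.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 256))
    y = rng.integers(0, 5_000, 10_000)
    groups = rng.integers(0, 500, 10_000)

    # Zero-shot split: every window from a held-out participant goes to
    # test, so the model is only scored on people it has never seen.
    splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
    train_idx, test_idx = next(splitter.split(X, y, groups=groups))
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])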
I wonder if there will be medical applications for this tech, for example identifying people with brain or neurological disorders based on how different their "neural imaging" looks from normal.
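A toy sketch of what I'm imagining, with every name and number made up: fit a density to embeddings from a known-healthy population, then flag people whose embeddings sit far from it.

    import numpy as np
    from sklearn.covariance import EmpiricalCovariance

    rng = np.random.default_rng(0)
    normal_embeddings = rng.standard_normal((500, 64))  # known-healthy cohort
    new_person = rng.standard_normal((1, 64))           # person being screened

    # Fit a Gaussian to the "normal" population, then score the new person
    # by squared Mahalanobis distance: big distance = unusual-looking data.
    cov = EmpiricalCovariance().fit(normal_embeddings)
    score = cov.mahalanobis(new_person)[0]
    print("flag for follow-up" if score > 120.0 else "within normal range")  # arbitrary cutoff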
If you mean the text quality scoring system, then when we added that, it improved the amount of text we got per hour of neural data by 30-35%. (That includes the effect of filtering which participants we invite back based on their text quality scores.)
[see https://news.ycombinator.com/item?id=45988611 for explanation]
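Mechanically, the filtering part is about as simple as it sounds; a sketch with made-up fields and an arbitrary threshold:

    # Hypothetical session records; "quality" is the text quality score.
    sessions = [
        {"participant": "p012", "hours": 3.5, "quality": 0.91},
        {"participant": "p044", "hours": 2.0, "quality": 0.48},
        {"participant": "p107", "hours": 4.0, "quality": 0.83},
    ]

    QUALITY_CUTOFF = 0.7  # arbitrary, for illustration only

    # Only invite back participants whose text quality clears the bar.
    invite_back = {s["participant"] for s in sessions if s["quality"] >= QUALITY_CUTOFF}
    print(invite_back)  # {'p012', 'p107'} (set order may vary)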
Though I suppose if the model had LLM-like context that kept track of brain data and speech/typing from earlier in the conversation, it could do in-context learning to adapt to the user.
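Roughly what I'm imagining, interface-wise (all hypothetical, I have no idea what their model actually looks like):

    from dataclasses import dataclass

    @dataclass
    class Turn:
        eeg_features: list  # hypothetical per-utterance neural features
        text: str           # what the user actually said/typed

    def build_context(history, current_features):
        """Pack earlier aligned (neural, text) pairs plus the current query,
        so an LLM-style decoder can adapt in-context to this user."""
        return {
            "exemplars": [(t.eeg_features, t.text) for t in history],
            "query": current_features,
        }

    # As the conversation goes on, the context accumulates calibration data.
    history = [Turn([0.1, 0.3], "hello there"), Turn([0.2, 0.1], "how are you")]
    ctx = build_context(history, current_features=[0.15, 0.25])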
We only got any generalization to new users after we had >500 individuals in the dataset, fwiw. There are some interesting MRI studies finding something similar: once you have enough individuals in the dataset, you start seeing generalization.
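The experiment behind that number is basically a sweep over the number of training individuals with a fixed held-out user set; schematically (zero_shot_accuracy is a stand-in for a full train+eval run, not our real code):

    import random

    def zero_shot_accuracy(train_subjects, test_subjects):
        # Stand-in: really this trains the decoder on train_subjects and
        # measures decoding accuracy on the never-seen test_subjects.
        return float("nan")

    all_subjects = [f"s{i:03d}" for i in range(700)]
    random.seed(0)
    random.shuffle(all_subjects)
    held_out = all_subjects[:100]  # never appear in any train set

    # Sweep the number of training individuals; the interesting part is
    # where the curve stops being flat (for us, somewhere past ~500).
    for n in [50, 100, 250, 500, 600]:
        print(n, zero_shot_accuracy(all_subjects[100:100 + n], held_out))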
Have you played at all with thought-to-voice? Intuitively I’d think EEG readout would be more reliable for spoken rather than typed words, especially if you’re not controlling for keyboard fluency.
We tried google/facebook/instagram ads, and we tried paying for some video placements. Basically none of the paid advertising worked, and it wasn't worth the money. Though for what it's worth, none of us are experts in advertising, so we might have been going about it wrong -- we didn't put much effort into iterating once we realized it wasn't working.