We looked for "group chat with AI" and it didn't exist. Everything was 1:1 assistant mode.
So we started building it. An AI that sits in your group chat as a participant, not a bot you @ to ask things. It has opinions. It decides when to talk and when to shut up. It reads the room.
The most interesting wall I hit was memory.
RAG, embeddings, and native KV all share one constraint I needed to break: "User X likes sushi" becomes accessible everywhere, forever, to everyone. That's not how knowing things works. If I tell you a secret at a bar, the people at the table know it; the guy who showed up 20 minutes later doesn't. If I told you something sensitive in private, you KNOW it forever, but you don't bring it up in front of everyone.
So I built witness-based memory. When the AI learns something, it tags everyone who was in the room as a witness. Memories are immutable, deeded property with joint ownership by everyone who was in the room to hear them.
A memory is only available when at least one witness is present. Alice was there when you mentioned your breakup? The AI knows in any room Alice is in. Just you and Bob? No idea. Alice joins? Now it remembers. Private conversations get tagged learned_privately and never surface in public context. The AI treats private knowledge like a real person treats a secret.
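Roughly, the visibility rule looks like this (an illustrative sketch, not the production code; the names are mine, and I'm reading "never surface in public context" as: a privately learned memory only shows up when everyone present was a witness):

```python
from dataclasses import dataclass

@dataclass(frozen=True)              # memories are immutable once written
class Memory:
    content: str
    witnesses: frozenset             # everyone in the room when this was learned
    learned_privately: bool = False  # learned in a private conversation

def visible_memories(memories, present):
    """Only memories with at least one witness present are available.
    Privately learned memories never surface in front of non-witnesses,
    i.e. they only show up when everyone present was a witness."""
    visible = []
    for m in memories:
        if m.learned_privately:
            if set(present) <= m.witnesses:   # no outsiders in the room
                visible.append(m)
        elif m.witnesses & set(present):      # at least one witness is here
            visible.append(m)
    return visible

# Alice witnessed the breakup mention; Bob didn't.
breakup = Memory("went through a breakup recently", frozenset({"you", "alice"}))
print(visible_memories([breakup], {"you", "bob"}))            # [] -- the AI has no idea
print(visible_memories([breakup], {"you", "bob", "alice"}))   # [breakup] -- Alice is here
```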
Decay varies by type: facts are permanent, preferences fade over months, emotions over days, intents over weeks. All memories compete for context slots on one salience score; recent access and a witness being in the current room both boost it. New memories are checked against existing ones via embedding similarity, so "Michael has a dog" + "The dog's name is Sam" becomes a chain that surfaces together rather than an overwrite.
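The salience math, very roughly (the half-lives and weights here are placeholder numbers to show the shape, not the real constants, and the chaining step is left out):

```python
import time

# Half-lives per memory type, in seconds. The post only gives orders of
# magnitude (days / weeks / months / permanent); these exact numbers are made up.
HALF_LIFE = {
    "fact": None,                  # permanent, no decay
    "preference": 90 * 24 * 3600,  # fades over months
    "intent": 14 * 24 * 3600,      # fades over weeks
    "emotion": 3 * 24 * 3600,      # fades over days
}

def salience(kind, created_at, last_access, witness_in_room, now=None):
    """One score every memory competes on for the fixed context slots."""
    now = now if now is not None else time.time()
    hl = HALF_LIFE[kind]
    decay = 1.0 if hl is None else 0.5 ** ((now - created_at) / hl)
    recency = 0.5 ** ((now - last_access) / (24 * 3600))  # recent access boosts
    presence = 1.0 if witness_in_room else 0.0            # current-room witness boosts
    return decay * (1.0 + recency + presence)

def pick_context(memories, k=8):
    """Fill the k context slots with the most salient memories."""
    return sorted(memories, key=lambda m: salience(**m), reverse=True)[:k]
```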
Memory was the first hard problem. The second was knowing when not to talk.
Every LLM's default mode is to respond. But participants in real conversations spend most of their time silent. When two of your friends are working through something, you don't interject. The hard problem isn't what to say; it's recognizing when the right move is nothing.
So before the expensive personality model runs, a cheap model (Haiku) reads the last N messages and decides whether the AI should talk. I tried rules first; they sucked. Now it's a prompt that describes the physics of conversation: open loops (someone's left hanging, fill the void) vs. closed loops (two humans going back and forth, shut up). It took several iterations before it stopped being dumb. Turns out Haiku has a systematic bias toward false on boolean tool outputs, so I had to invert the schema: ask should_stay_quiet instead of should_respond. That cut false positives dramatically.
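Sketched against the Anthropic tool-use API, the gate looks something like this (the prompt, model alias, and tool name are illustrative, not the production ones):

```python
import anthropic

client = anthropic.Anthropic()

GATE_TOOL = {
    "name": "conversation_gate",
    "description": "Decide whether the AI participant should stay quiet right now.",
    "input_schema": {
        "type": "object",
        "properties": {
            # Inverted from should_respond: Haiku leans toward answering false
            # on booleans, so false here means "go ahead and talk".
            "should_stay_quiet": {"type": "boolean"},
            "reason": {"type": "string"},
        },
        "required": ["should_stay_quiet"],
    },
}

def should_respond(recent_messages):
    resp = client.messages.create(
        model="claude-3-5-haiku-latest",   # the cheap gate model
        max_tokens=200,
        system=("You are watching a group chat and deciding whether an AI "
                "participant should stay quiet. Open loop (someone is left "
                "hanging): it should speak. Closed loop (two humans going "
                "back and forth): it should stay quiet."),
        tools=[GATE_TOOL],
        tool_choice={"type": "tool", "name": "conversation_gate"},
        messages=[{"role": "user", "content": "\n".join(recent_messages)}],
    )
    gate = next(block for block in resp.content if block.type == "tool_use")
    return not gate.input["should_stay_quiet"]
```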
This is live with real users. The conversations feel different. One of the most surprising emergent behaviors is how absolutely hilarious it is (screenshots in comments). No prompt for that whatsoever.
I keep coming back to: most of what feels flat or robotic about AI products isn't the model but the framing layer. When you stop building the AI as an assistant queried by one user and build it as a participant in a multi-human social context, the model has way more personality and competence than the assistant frame exposes. The architecture is the bottleneck, not the weights.
Takt is at takt.chat if you want to try it. This post is more about the architecture than the product.