I liked the idea and wanted to reverse engineer it purely from public content like YouTube videos and reviews.
Turns out, there were a bunch of limitations in their implementation:
- The LLM cannot generate any JavaScript, which means it cannot do any crazy animations
- They capture user interactions in their own code and invoke the LLM with that info, which means form data is never captured
I went ahead and implemented what they already had, plus a bit more:
- All windows go into iframes, which means they can contain JavaScript
- I let the LLM write code that sends iframe messages from any window back to the host. This way the LLM writes the logic to invoke itself
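To make the self-invocation idea concrete, here is a minimal sketch of the pattern, under my own assumptions about the message shape (the helper names and fields are mine, not from the app): LLM-generated code inside an iframe posts a message to the parent shell, and the shell forwards it to the LLM as the next prompt.

```javascript
// Pure helper: build the message an iframe sends to the parent shell.
// All names here are illustrative, not the app's actual protocol.
function buildInvokeMessage(windowId, action, payload) {
  return {
    type: "llm-invoke", // lets the shell ignore unrelated messages
    windowId,           // which window is asking for an update
    action,             // e.g. "button-clicked", "form-submitted"
    payload,            // any data the LLM-written code wants to pass along
  };
}

// Browser-only wiring (guarded so the snippet also loads outside a browser).
if (typeof window !== "undefined") {
  // Inside an iframe, LLM-generated code would call something like:
  //   window.parent.postMessage(
  //     buildInvokeMessage("win-1", "form-submitted", { name: "Ada" }), "*");

  // In the parent shell, listen for those messages and re-invoke the LLM.
  window.addEventListener("message", (event) => {
    const msg = event.data;
    if (msg && msg.type === "llm-invoke") {
      // callLLM is hypothetical: it would append msg to the conversation
      // and ask the model to regenerate the affected window.
      // callLLM(msg);
    }
  });
}
```

The nice property of this design is that the shell stays dumb: it only relays typed messages, while the LLM decides what is worth reporting and when to regenerate itself.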
You can use this with both OpenAI and Anthropic models. Gemini models did not work well with these instructions.
I have hosted the app and you can use it by bringing your own key (BYOK). There is no backend: LLM requests are made directly from your browser, and there is no tracking or analytics either.
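For anyone curious how a backend-free BYOK call can work, here is a rough sketch. The endpoints and headers follow the public OpenAI and Anthropic API docs (Anthropic requires an explicit opt-in header for direct browser calls); the helper name and model names are just illustrative, not what the app actually uses.

```javascript
// Build a fetch request for a chat completion, entirely client-side.
// The API key never leaves the user's browser except to go to the provider.
function buildChatRequest(provider, apiKey, messages) {
  if (provider === "openai") {
    return {
      url: "https://api.openai.com/v1/chat/completions",
      options: {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        // Model name is an example only.
        body: JSON.stringify({ model: "gpt-4o-mini", messages }),
      },
    };
  }
  // Anthropic supports direct browser calls via this opt-in CORS header.
  return {
    url: "https://api.anthropic.com/v1/messages",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        "anthropic-dangerous-direct-browser-access": "true",
      },
      // Model name is an example only.
      body: JSON.stringify({
        model: "claude-3-5-sonnet-latest",
        max_tokens: 1024,
        messages,
      }),
    },
  };
}

// In the browser you would then run:
//   const req = buildChatRequest("openai", userKey, [{ role: "user", content: "hi" }]);
//   const res = await fetch(req.url, req.options);
```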
This is a small experimental project that took me a couple of weekends, and I would love any feedback.