After your initial question, it always follows up with some clarifying questions, but it's completely up to the user to format their responses, and I always wonder whether the LLM gets confused when people answer sloppily. It would make much more sense for OpenAI to break out each question and give it a dedicated answer box. That way the user's responses stay consistent and there's less chance they make a mistake or forget to answer a question.
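Roughly, the model could return the follow-up as structured data instead of prose, and the client could render one labelled input per question. A minimal sketch in TypeScript (the shape and field names are my own invention, not anything OpenAI actually ships):

    // Hypothetical shape for structured clarifying questions, so the client
    // can render one dedicated input per question instead of free-form text.
    type ClarifyingQuestion =
      | { id: string; label: string; kind: "text" }
      | { id: string; label: string; kind: "choice"; options: string[] };

    interface FollowUp {
      questions: ClarifyingQuestion[];
    }

    // The model would emit something like this as JSON...
    const followUp: FollowUp = {
      questions: [
        { id: "budget", label: "What is your budget?", kind: "text" },
        { id: "tone", label: "Preferred tone?", kind: "choice", options: ["formal", "casual"] },
      ],
    };

    // ...and the client renders each question as its own labelled input,
    // then sends back answers keyed by id, e.g. { budget: "...", tone: "formal" }.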
AI or not, these sorts of UI won't change too much.
When you want to order a pizza, you won't have to click. Just browse and ask the AI assistant to place an order as you would in a restaurant. Better UX.
Isn't on-demand generation what chat LLMs already do nowadays, btw?
The point being that generating visual UI components is easy: ChatGPT does it, server-driven UI does it.
But multimodal interaction is something else that goes further.
You might say naming the color is enough, but in reality, a color picker is the more natural way to interact.
As humans, we don’t communicate only through words. Other forms of interaction matter too.
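To make the "generating components is easy" part concrete, here's a minimal server-driven-UI-style sketch in TypeScript (all names are hypothetical): the server or model emits a declarative description, and the client maps it to native widgets, including a color picker instead of a free-text color field.

    // Hypothetical declarative UI payload, in the spirit of server-driven UI:
    // the server (or model) describes widgets, the client renders them natively.
    type Widget =
      | { type: "text"; id: string; label: string }
      | { type: "colorPicker"; id: string; label: string; default?: string }
      | { type: "button"; id: string; label: string; action: string };

    const screen: Widget[] = [
      { type: "text", id: "name", label: "Project name" },
      // A color picker is a more natural interaction than typing "teal-ish blue".
      { type: "colorPicker", id: "accent", label: "Accent color", default: "#1e90ff" },
      { type: "button", id: "save", label: "Save", action: "submit" },
    ];

    // Trivial renderer stub: a real client would map each widget
    // to a native control instead of logging it.
    for (const w of screen) {
      console.log(`${w.type}: ${w.label}`);
    }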
It will probably have access to a list of components with their specifications, especially the types of data each component can represent, mutably or not.
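Something like the following catalog, say (purely illustrative, TypeScript just for concreteness):

    // Illustrative component catalog the model could be prompted with:
    // what each component represents, and whether the user can mutate the value.
    interface ComponentSpec {
      name: string;
      dataType: "string" | "number" | "boolean" | "date" | "enum" | "color";
      mutable: boolean;
      description: string;
    }

    const catalog: ComponentSpec[] = [
      { name: "Slider",      dataType: "number", mutable: true,  description: "Bounded numeric input" },
      { name: "DatePicker",  dataType: "date",   mutable: true,  description: "Calendar-based date input" },
      { name: "Dropdown",    dataType: "enum",   mutable: true,  description: "Pick one of a fixed set of options" },
      { name: "ColorPicker", dataType: "color",  mutable: true,  description: "Visual color selection" },
      { name: "Badge",       dataType: "string", mutable: false, description: "Read-only status label" },
    ];

    // Given a field's type and whether it should be editable,
    // picking a matching component is a lookup, not an inference problem.
    const pick = (t: ComponentSpec["dataType"], mutable: boolean) =>
      catalog.find(c => c.dataType === t && c.mutable === mutable);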
Or respond to a query from a database by presenting a graph automatically.
But in my opinion the hard part is turning natural language into a SQL query. It's not really the choice of data representation, which is heavily informed by the data itself (types and values) and doesn't require much inference.
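To illustrate why the representation is the easy half: once you know the column types of the result set, a simple lookup gets you most of the way. A toy heuristic, not a claim about any particular product:

    // Toy heuristic: pick a chart purely from the shape of the result set.
    // The hard part (turning the question into SQL) happens before this.
    type Column = { name: string; kind: "categorical" | "numeric" | "temporal" };

    function chooseChart(columns: Column[]): string {
      const kinds = columns.map(c => c.kind);
      if (kinds.includes("temporal") && kinds.includes("numeric")) return "line chart";
      if (kinds.includes("categorical") && kinds.includes("numeric")) return "bar chart";
      if (kinds.filter(k => k === "numeric").length >= 2) return "scatter plot";
      return "table";
    }

    // e.g. SELECT month, revenue FROM sales GROUP BY month
    console.log(chooseChart([
      { name: "month", kind: "temporal" },
      { name: "revenue", kind: "numeric" },
    ])); // "line chart"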
Conversations are error prone and noisy.
UI distills down the mode of interaction into something defined and well understood by both parties.
Humans have been able to speak to each other for a long time, but we fill out forms for anything formal.
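A tiny example of what the form buys you over conversation: every field has a name, a type, and a checkable constraint, so validation is mechanical rather than interpretive (field names and rules below are made up):

    // Free text: "ship it to my office sometime next week, the usual card"
    // Form: each field has a name, a type, and a constraint a machine can check.
    interface ShippingRequest {
      addressLine: string;
      postalCode: string;   // must match a known format
      deliveryDate: string; // ISO date, must be in the future
    }

    function validate(r: ShippingRequest): string[] {
      const errors: string[] = [];
      if (r.addressLine.trim() === "") errors.push("address is required");
      if (!/^\d{5}$/.test(r.postalCode)) errors.push("postal code must be 5 digits");
      if (new Date(r.deliveryDate).getTime() <= Date.now()) errors.push("delivery date must be in the future");
      return errors;
    }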
I thought you'd say not being able to reload the form at a later time from the same URL is bad. This would be a "quantum UI" slightly different every time you load it.
If you look at many of the current innovations around working with LLMs and agents, they are largely about constraining and tracking context in a structured way. Emergent patterns for this will likely appear over time; for now I'm implementing my own approach, hopefully with abstractions good enough to allow future portability.
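As one example of what "constraining context in a structured way" can mean in code (my own sketch, not the author's implementation): keep the agent's working context as a typed object with explicit slots rather than an ever-growing transcript.

    // Sketch: track agent context as explicit, typed slots instead of
    // appending everything to one free-form transcript.
    interface TaskContext {
      goal: string;
      constraints: string[];         // hard requirements the agent must respect
      facts: Record<string, string>; // verified facts, keyed so they can be updated
      openQuestions: string[];       // things still needing user input
    }

    function addFact(ctx: TaskContext, key: string, value: string): TaskContext {
      // Returning a new object keeps the history of context states inspectable.
      return { ...ctx, facts: { ...ctx.facts, [key]: value } };
    }

    let ctx: TaskContext = {
      goal: "book a flight",
      constraints: ["budget under $400", "no overnight layovers"],
      facts: {},
      openQuestions: ["departure city?"],
    };
    ctx = addFact(ctx, "departureCity", "Lisbon");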
For sure! UIs are also most of the past and present way to interact with a computer, off or online. Even Hacker News - which is mostly text - has some UI to vote, navigate, flag…
Imagine the mess of a text-field-only interface where you had to type "upvote the upper ActionHank message" or "open the third article's comments on the front page, the one that talks about On-demand UI generation…" then press enter.
Don’t get me wrong: LLMs are great and it’s fascinating to see experiments with them. Kudos to the author.
I'd love to see folks find the same sort of energy and innovation that drove early projects such as Momenta and PenPoint and so forth.
We found it lowered barriers to providing context to AI, improved user perception of control over AI, and provided users guidance for steering AI interactions.
Shipping forms usually need address verification; sometimes they even include a map.
Especially if, on the other end, the data entered into this form is going to be stored in a traditional DB.
A much better use case would be something that is dynamic by nature, for example an advanced prompt generator for image generation models (sliders for the size of objects in a scene; dropdown menus with variants of backgrounds or styles, instead of the usual lists).
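Something in this direction, say (all names invented, TypeScript just for concreteness):

    // Sketch of a generated control panel for building an image prompt:
    // sliders and dropdowns instead of hand-editing a comma-separated list.
    type Control =
      | { kind: "slider"; id: string; label: string; min: number; max: number; value: number }
      | { kind: "dropdown"; id: string; label: string; options: string[]; value: string };

    const controls: Control[] = [
      { kind: "slider",   id: "subjectSize", label: "Subject size in scene", min: 0, max: 100, value: 60 },
      { kind: "dropdown", id: "background",  label: "Background", options: ["forest", "city at night", "studio"], value: "forest" },
      { kind: "dropdown", id: "style",       label: "Style", options: ["watercolor", "photoreal", "pixel art"], value: "photoreal" },
    ];

    // Flatten the current control values into a prompt string.
    const prompt = controls
      .map(c => c.kind === "slider"
        ? `${c.label.toLowerCase()}: ${c.value}%`
        : `${c.label.toLowerCase()}: ${c.value}`)
      .join(", ");
    console.log(prompt);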
I don't know why you wouldn't develop whatever forms you wanted to support upfront and make them available to the agent (and hopefully provide old-fashioned search). You can still use AI to develop and maintain the forms. Since the output can be used as many times as you want, you can probably use more expensive/capable models to develop the forms rather than cheaper/faster but less capable models that you're probably limited to for customer service.
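Concretely, the pre-built forms could be exposed to the agent as a catalog it selects from rather than something it generates on the fly. A sketch, with made-up names:

    // Sketch: forms are authored (and reviewed) ahead of time; the agent
    // only picks which one to show, it never generates UI on the fly.
    interface FormDef {
      id: string;
      title: string;
      keywords: string[]; // usable for old-fashioned search as well as agent lookup
    }

    const forms: FormDef[] = [
      { id: "refund-request", title: "Request a refund",        keywords: ["refund", "money back", "return"] },
      { id: "address-change", title: "Change shipping address", keywords: ["address", "shipping", "move"] },
      { id: "cancel-order",   title: "Cancel an order",         keywords: ["cancel", "order"] },
    ];

    // The agent (or a plain search box) resolves the user's request to a form id.
    function findForm(query: string): FormDef | undefined {
      const q = query.toLowerCase();
      return forms.find(f => f.keywords.some(k => q.includes(k)));
    }

    console.log(findForm("I'd like my money back")?.id); // "refund-request"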