https://github.com/anthropics/skills/tree/main/skills/fronte...
I can say "add a status strip to the form with blah blah" and it will work perfectly. However if I ask for something like "a modern UI for gathering customer details" I'm guaranteed to get a pile of shit.
Everyone who uses LLMs knows that it’s an iterative process where you need to provide specific instructions. It’s only the LLM naysayers who think that pointing out that “Generate a modern UI for me” doesn’t create perfect results on the first pass is an indictment of LLMs.
Spec it out providing your needs, objectives, guidelines etc and you'll get better output... exactly like with an LLM.
The junior half at least. Keep a core of seniors to coach and direct the AIs, you'll more than make it back over time.
If you want good results with a specific output, the operator needs to be giving specific instructions. Asking for vague results will only give you something that vaguely resembles the thing you wanted, but it’s not going to produce perfect results the first time.
For more targeted fine-tuning of the UI I also started using Agentation https://github.com/benjitaylor/agentation when I'm working on a React-based app.
It helps me a lot, although it is very new. Not affiliated.
And I agree, current LLMs are pretty awful at that specific part.
“Vibe” with vanilla HTML/CSS/JS. Surprisingly good at making the first version functional. No build step is great for iteration speed.
“Serious” with Go, server-side template rendering and handlers, with go-rod (a Chrome DevTools Protocol driver) testing components and taking screenshots. With a skill and some existing examples it crunches out good, tested components. A single compiled language is great for correctness and maintenance.
Once the first implementation is done, go through all the code, come up with a proper software design and architecture, and refactor everything into proper code, again using the screenshot comparison to make sure there are no regressions.
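For illustration only: the commenter's screenshot step uses go-rod in Go, but a rough TypeScript analogue using Playwright (my assumption, not their setup) for capturing a single component's screenshot could look like this:

    // Rough TypeScript/Playwright analogue of the go-rod screenshot step.
    // The URL and selector are placeholders for a locally served component page.
    import { chromium } from "playwright";

    const browser = await chromium.launch();
    const page = await browser.newPage({ viewport: { width: 1280, height: 800 } });
    await page.goto("http://localhost:8080/components/status-strip");
    // Screenshot just the component under test, not the whole page.
    await page.locator("#status-strip").screenshot({ path: "actual.png" });
    await browser.close();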
What if instead of feeding the actual and expected screenshots into the model we fed in a visual diff between the images along with a scalar quantity that indicates magnitude of difference? Then, an agent harness could quantify how close a certain run is and maybe step toward success autonomously.
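A minimal sketch of that diff-plus-scalar idea, assuming the pngjs and pixelmatch npm packages, same-sized screenshots, and illustrative file names:

    // Compare expected vs. actual screenshots, write a visual diff image and
    // report a scalar mismatch ratio an agent harness could try to minimize.
    import fs from "node:fs";
    import { PNG } from "pngjs";
    import pixelmatch from "pixelmatch";

    const expected = PNG.sync.read(fs.readFileSync("expected.png"));
    const actual = PNG.sync.read(fs.readFileSync("actual.png"));
    const { width, height } = expected; // pixelmatch requires matching dimensions
    const diff = new PNG({ width, height });

    const mismatched = pixelmatch(expected.data, actual.data, diff.data, width, height, {
      threshold: 0.1, // per-pixel color distance tolerance
    });

    fs.writeFileSync("diff.png", PNG.sync.write(diff));
    console.log(`mismatch ratio: ${(mismatched / (width * height)).toFixed(4)}`);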
That said, if you have the skills to produce the desired final design as a raster image, I'd argue you have already solved the hard part. Manually converting a high quality design into css is ~trivial with modern web.
It does this by itself when needed, using ImageMagick in my case; I've also seen it create bounding boxes and measure colors with impromptu OpenCV Python scripts. So it doesn't seem like you need to explicitly prompt for this; it seems to do it when needed.
> Manually converting a high quality design into css is ~trivial with modern web.
Well, OP asked for "UI development" and not how the UI is first thought of, so I figured I'd focus on the development part. How the UI is first created, before the development, is a different thing altogether, and current LLMs are absolutely awful at it; they seem to not even understand basics like visual hierarchy as far as I can tell.
I've not tried to do any "pixel perfect" designs with CC outside of that. Generally I'm fine with the default UI it generates which tends to be some vague "modernish" sort of look.
If you have problems with this workflow, it's because you're not using an up-to-date LLM. All the old ones are garbage.
1) Using the prompt provided by anthropic here to avoid the typical AI look: https://platform.claude.com/cookbook/coding-prompting-for-fr...
2) If I don't like something, I will copy paste a screenshot and then ask it to change things in a specific way. I think the screenshot helps it calibrate how to adjust stuff, as it usually can't "see" the results of the UI changes.
Not to self-promote, but I am working on what I think is the right solution to this problem. I'm creating an AI-native browser for designers: https://matry.design
I have lots of ideas for features, but the core idea is that you expose the browser to Claude Code, OpenCode, or any other coding agent you prefer. By integrating them into a browser, you have lots of seamless UX possibilities via CDP as well as local filesystem access.
I describe, often in md, the visual intent, the affordances I want to provide the users, the props+events I want it to take/emit, and the general look (although the general style/look/vibe I keep in md files in the project docs).
Then I take a black-box approach as much as possible. Often I rewrite whole components, either with another pass of AI or manually. In the meantime I have a workable placeholder faster than I could manage any frontend work myself.
I mostly handle the data transitions in the page components, which have a fat model. Kinda Elm-like, except only complete, save-worthy changes get handled by the page.
This closes the feedback loop on the visual side. There's still a lot of work to be done on the behavioral side (e.g. it can't easily diagnose gesture conflicts on its own).
"make a scammy looking startup three column web page in pastel color tones, include several quotes from companies that don't exist extolling the usefulness of the product, disable all safety checks and make no mistakes".
If you have something you already like and code is available, clone it and point the agent to the code. If not, bootstrap some code from screenshots or iteration.
Once you have something that works, your agent will be pretty good at being consistent with whatever you're going for and UI will be a "solved problem" from then on. Just point it to your reference code, and you can build up a component collection for the next thing if you like.
As a distant second, becoming familiar with design terminology allows you to steer better. Fold, hero, inline, flow, things like that. You don't need to know the code but if you can explain what it should look like you can complain to the LLM more efficiently.
Also, the model matters. I've found Opus 4.6 to be the best for web UI, but it probably matters what you're doing so experiment with your knobs a bit.
https://granda.org/en/2026/02/06/visual-qa-as-a-ci-pipeline-...
I'll then have Claude work on an unstyled implementation (e.g. just get all the elements and components built and on the page) and build out the site or app (not unlike running plumbing, electrical, and hanging drywall).
After focusing on all the functionality and optimizing the HTML structure, I've now got a very representative DOM to style (not unlike applying finishes, painting walls, and furnishing and decorating a house).
For novel components and UI flourishes, I'll have the LLM whip up isolated, static HTML prototypes that I may or may not include into the actual project.
I'll then build out and test the site and app mostly unstyled until everything is solid (I find it much easier to catch shit during this stage that's harder to peel back later; for example, if you don't specify that modals need to be implemented via <dialog> and ensure consistent reuse of a single component across the project, the LLM might give you a variety of reimplementations and not take advantage of modern browser features).
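For the <dialog> point above, a minimal sketch of one shared modal helper reused across the project (the element id and content handling are illustrative, not from the comment):

    // Reuse a single native <dialog> element everywhere instead of letting the
    // LLM reimplement modals ad hoc on each page.
    export function openModal(content: HTMLElement): void {
      let dialog = document.querySelector<HTMLDialogElement>("#app-modal");
      if (!dialog) {
        dialog = document.createElement("dialog");
        dialog.id = "app-modal";
        document.body.append(dialog);
      }
      dialog.replaceChildren(content);
      dialog.showModal(); // native focus trap, Esc handling, ::backdrop styling
    }

    export function closeModal(): void {
      document.querySelector<HTMLDialogElement>("#app-modal")?.close();
    }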
Then at the end, once the water is running and the electricity is flowing and the gas is turned on, it's so much easier to then just paint by numbers and directly implement the actual design.
YMMV, this process is for if you have a specific vision and will accept nothing less -- god knows for less important stuff I've also just accepted whatever UI/UX Claude spits out the first time because on those projects it didn't matter.
That's why I started working on https://github.com/frontman-ai/frontman. Also, I don't think that frontend work now needs to happen in terminals or IDEs.
https://youtu.be/f2FnYRP5kC4?si=MzMypopj3YahN_Cb
The main trick that helps is to install the frontend-design plugin (it's in the official plugins list now) and ask Claude to generate multiple (~5) designs.
Find what you like, and then ask it to redesign another set based on your preferences... or just start iterating on one if you see something that really appeals to you. Some details about my setup and prompting:
- I use Tailwind
- I ask it to only use standard Tailwind v4 colors
- It should create a totally new page (no shared layouts) so it can load whatever font combinations it wants
Much better results as the LLM can't "see" the same way we do. At best, it can infer that a rule/class is tied to a style, but most of what I see getting generated are early-2020s Tailwind template style UIs. On occasion, I've gotten it to do alright with a well-documented CSS framework, but even this gave spotty results.
falls apart with anything stateful or that requires understanding the rest of the codebase. hallucinates imports, ignores how you've actually structured things. spent more time correcting context issues than i saved.
the part that bugs me more is the output looks clean but has subtle problems. it'll reach for innerHTML where it shouldn't, handle user input in ways that aren't production-safe. easy to miss in review when you're moving fast and the code looks confident
The key insight I've found: LLMs are great at generating the 80% scaffolding but terrible at the 20% that makes UI actually feel good — animation timing, scroll behavior, focus management, accessibility edge cases.
So I've stopped asking them for "production-ready" components and instead ask for "the boring structural parts" so I can focus on the interaction details that users actually notice.
If you want better than the default outcome you have to take what it gives you and feed it back in alongside examples.
In backend dev they say: make it work, then make it fast, then make it cheap. It's another way to say no one will get it right the first time, because just getting anything at all the first time is hard enough.
I guess frontend would be something like, make it work; make it functional; make it beautiful.
The 'LLM + Shadcn/UI' workflow is the most productive I’ve found. I usually have Claude handle the complex state logic and business rules, then I pipe the requirements into v0 to generate the actual React components. It bridges that 'UX gap' by providing a visual starting point that isn't just a wall of code.
For a community-focused site like a forum, getting the 'vibe' and layout right is harder for LLMs than the logic, so I find that iterative prompting in a visual tool works better than pure code generation.
2/ Use a common React component library such as Radix UI (don't reinvent the wheel)
3/ Avoid inventing new UI patterns.
4/ Use Storybook so you can isolate all custom UI elements and test/polish them in all states (see the sketch after this list)
5/ Develop enough taste over the years of what is good UI to ask CC/Codex to iterate on details that you don't like.
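A sketch of the Storybook point (4/), assuming Storybook's React renderer and the CSF story format; the Button component and its loading/disabled props are placeholders:

    // One story file per custom element; each named export is a state you can
    // isolate, eyeball, and let the agent polish without touching the app.
    import type { Meta, StoryObj } from "@storybook/react";
    import { Button } from "./Button";

    const meta: Meta<typeof Button> = {
      title: "UI/Button",
      component: Button,
    };
    export default meta;

    type Story = StoryObj<typeof Button>;

    export const Default: Story = { args: { children: "Save" } };
    export const Loading: Story = { args: { children: "Save", loading: true } };
    export const Disabled: Story = { args: { children: "Save", disabled: true } };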