Ask HN: How do you employ LLMs for UI development?

39•jensmtg•2h ago

I have found a workflow that makes Claude a fantastic companion for most of the work involved in fullstack web development. The exception, and what I find to be the most significant limitation on productive potential, is interface development and UX. Curious to hear if anyone has relevant experience or has found any good approaches to this.

Comments

markoa•1h ago

1/ Use a standard CSS library - Tailwind

2/ Use a common React component library such as Radix UI (don't reinvent the wheel)

3/ Avoid inventing new UI patterns.

4/ Use Storybook so you can isolate all custom UI elements and test/polish them in all states

5/ Develop enough taste over the years of what is good UI to ask CC/Codex to iterate on details that you don't like.

LollipopYakuza•1h ago

That sounds like the typical workflow while NOT working with LLMs?

sama004•1h ago

for claude code specifically, the frontend-design skill helps a lot too

https://github.com/anthropics/skills/tree/main/skills/fronte...

bob1029•1h ago

I think user interface design is a very cursed task for an LLM. They are skilled at giving you high quality CSS and design patterns, but they're horrible at actual layout and composition.

I can say "add a status strip to the form with blah blah" and it will work perfectly. However if I ask for something like "a modern UI for gathering customer details" I'm guaranteed to get a pile of shit.

dboreham•1h ago

Same thing would happen if you had asked a human intern.

yesitcan•1h ago

This is almost a meme reply on HN: absolving the LLM by comparing it to an inexperienced human.

mwigdahl•1h ago

To me it seems less about absolving the LLM and more saying that "a modern UI for gathering customer details" is wildly underspecified. You're asking the LLM to generate something tasteful for a very vague use case; there's just not that much to go on.

Aurornis•1h ago

I think the meme is trying to discredit LLMs by saying they don’t read your mind and produce exactly what you wanted from vague prompts.

Everyone who uses LLMs knows that it’s an iterative process where you need to provide specific instructions. It’s only the LLM naysayers who think that pointing out that “Generate a modern UI for me” doesn’t create perfect results on the first pass is an indictment of LLMs.

co_king_5•1h ago

What would happen if you asked someone who knew how to design UIs?

vntok•1h ago

Same thing entirely: ask for something like "a modern UI for gathering customer details" and you're guaranteed to get a pile of shit.

Spec it out providing your needs, objectives, guidelines etc and you'll get better output... exactly like with an LLM.

co_king_5•56m ago

Well, that settles it. I'm firing half the UI team and replacing them with Claude Code subscriptions.

Aurornis•1h ago

You’re just describing LLM development in general.

If you want good results with a specific output, the operator needs to be giving specific instructions. Asking for vague results will only give you something that vaguely resembles the thing you wanted, but it’s not going to produce perfect results the first time.

xmorse•1h ago

Mainly using playwriter.dev to help debug CSS issues on the page. It's an extension you can enable in Chrome and let agents control the browser via CDP

https://github.com/remorses/playwriter

LollipopYakuza•1h ago

Interesting, thanks. In your opinion, what are the benefits compared to the native Chrome remote debugging feature + the chrome-devtools MCP?

infamia•56m ago

MCP eats lots of context (~20k tokens for chrome's). The more tokens you use needlessly, the faster your context rots (i.e., worse performance).
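
As a back-of-the-envelope illustration of the point above: only the ~20k figure comes from the comment. The 200k-token window below is an assumed value picked for the arithmetic, not something stated in the thread, and the function names are made up for this sketch.

```python
# Rough context-budget arithmetic. Only the ~20k figure comes from the thread;
# the window size is an assumed value for illustration.
MCP_OVERHEAD_TOKENS = 20_000      # approximate tool-definition cost quoted above
CONTEXT_WINDOW_TOKENS = 200_000   # ASSUMPTION: hypothetical context window size


def overhead_fraction(overhead: int, window: int) -> float:
    """Return the share of the context window consumed before any work starts."""
    if window <= 0:
        raise ValueError("window must be positive")
    return overhead / window


def remaining_tokens(overhead: int, window: int) -> int:
    """Return how many tokens are left for the actual conversation."""
    return max(window - overhead, 0)


if __name__ == "__main__":
    share = overhead_fraction(MCP_OVERHEAD_TOKENS, CONTEXT_WINDOW_TOKENS)
    left = remaining_tokens(MCP_OVERHEAD_TOKENS, CONTEXT_WINDOW_TOKENS)
    print(f"{share:.0%} of the window used up front, {left} tokens left")
```

Under that assumed window, a tenth of the budget is gone before the first prompt; with a smaller window the share grows accordingly.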

ahmed_sulajman•1h ago

Not as much for the UX, but at least for UI when you need to implement designs, I use Claude Code with Figma MCP and Chrome DevTools MCP, so that it can take screenshots and compare them to the expected design as part of the acceptance criteria.

For more targeted fine-tuning of the UI I also started using Agentation https://github.com/benjitaylor/agentation if I'm working on a React-based app

castalian•1h ago

Try http://mockdown.design/

Helps me a lot, although it is very new. Not affiliated

co_king_5•1h ago

Thanks for sharing!

co_king_5•1h ago

You don't. The LLM is a *language* model and is inept when it comes to visual layout.

embedding-shape•1h ago

OP seems to be talking about "UI development" though, the development part of the UI, not how to come up with what the UI should be in the first place, this typically happens way before UI development even starts.

And I agree, current LLMs are pretty awful at that specific part.

nzoschke•1h ago

Haven’t totally cracked the nut yet either but the patterns I’ve had the best luck with are…

“Vibe” with vanilla HTML/CSS/JS. Surprisingly good at making the first version functional. No build step is great for iteration speed.

“Serious” with Go, server side template rendering and handlers, with go-rod (Chrome Devtools Protocol driver) testing components and taking screenshots. With a skill and some existing examples it crunches and makes good tested components. Single compiled language is great for correctness and maintenance.

embedding-shape•1h ago

Best I did was having instructions for it to use webdriver + browser screenshots, then I have baseline screenshots of how I want it to look, and instruct the agent to match against the screenshots and continue until implementation is aligned with the screenshots. Typically I use Figma to create those screenshots, then let the agent go wild as long as it manages to get the right results.

Once the first implementation is done, I go through all the code, come up with a proper software design and architecture, and refactor everything into proper code, again using the screenshot comparison to make sure there are no regressions.
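
The screenshot-matching loop from the first paragraph above could be sketched roughly like this. It is only an illustration, not the commenter's actual setup: `render`, `score`, and `revise` are hypothetical callables standing in for taking a browser screenshot, comparing it with the baseline, and asking the agent for another pass, and the threshold and round limit are arbitrary assumed values.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")


def iterate_until_match(
    baseline: Image,
    render: Callable[[], Image],
    score: Callable[[Image, Image], float],
    revise: Callable[[float], None],
    threshold: float = 0.01,   # ASSUMPTION: arbitrary "close enough" cutoff
    max_rounds: int = 10,      # ASSUMPTION: arbitrary safety limit
) -> tuple[bool, int, float]:
    """Re-render and revise until the screenshot is close enough to the baseline.

    Returns (matched, rounds_used, last_score). Lower scores mean closer.
    """
    last = float("inf")
    for round_no in range(1, max_rounds + 1):
        last = score(baseline, render())
        if last <= threshold:
            return True, round_no, last
        revise(last)  # hand the mismatch back to the agent for another pass
    return False, max_rounds, last
```

The round limit matters in practice: without it, a design the agent cannot reproduce would keep the loop (and the bill) running indefinitely.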

bob1029•1h ago

> I have baseline screenshots of how I want it to look, and instruct the agent to match against the screenshots

What if instead of feeding the actual and expected screenshots into the model we fed in a visual diff between the images along with a scalar quantity that indicates magnitude of difference? Then, an agent harness could quantify how close a certain run is and maybe step toward success autonomously.

That said, if you have the skills to produce the desired final design as a raster image, I'd argue you have already solved the hard part. Manually converting a high quality design into css is ~trivial with modern web.
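
A minimal sketch of the visual-diff-plus-scalar idea above, using plain nested lists of RGB tuples so it runs without any imaging library. The function name and the per-channel `tolerance` parameter are assumptions made for illustration; a real harness would load actual screenshots with an imaging tool first.

```python
Pixel = tuple[int, int, int]
Bitmap = list[list[Pixel]]


def visual_diff(
    actual: Bitmap, expected: Bitmap, tolerance: int = 0
) -> tuple[list[list[bool]], float]:
    """Return (mask, magnitude) for two same-sized RGB bitmaps.

    mask[y][x] is True where any channel differs by more than `tolerance`;
    magnitude is the fraction of differing pixels, from 0.0 to 1.0.
    """
    same_shape = len(actual) == len(expected) and all(
        len(a) == len(e) for a, e in zip(actual, expected)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    mask = [
        [
            any(abs(ca - ce) > tolerance for ca, ce in zip(pa, pe))
            for pa, pe in zip(row_a, row_e)
        ]
        for row_a, row_e in zip(actual, expected)
    ]
    total = sum(len(row) for row in mask)
    changed = sum(cell for row in mask for cell in row)
    magnitude = changed / total if total else 0.0
    return mask, magnitude
```

The mask is the "visual diff" a model could be shown, and the magnitude is the scalar a harness could compare across runs to tell whether a change moved closer to the target.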

embedding-shape•55m ago

> What if instead of feeding the actual and expected screenshots into the model we fed in a visual diff between the images along with a scalar quantity that indicates magnitude of difference?

It does this by itself when needed, using imagemagick (in my case). I've also seen it create bounding boxes and measure colors with impromptu opencv python scripts, so it doesn't seem necessary to prompt for this explicitly.

> Manually converting a high quality design into css is ~trivial with modern web.

Well, OP asked for "UI development" and not how the UI is first thought of, so figured I focus on the development part. How the UI is first created before the development is a different thing altogether, and current LLMs are absolutely awful at it, they seem to not even understand basics like visual hierarchy as far as I can tell.

dsr_•1h ago

Since LLMs are almost AGI, all you have to do is feed one a screenshot of a few UIs that you like and ask it to duplicate it for your own application, saying "Please style the application like this, while observing best practices for internationalization, assistive technologies and Fitts' Law."

If you have problems with this workflow, it's because you're not using an up-to-date LLM. All the old ones are garbage.

MrGreenTea•1h ago

It's so weird. I can't tell if you're sincere or sarcastic.

_benj•1h ago

Idk for UX but I’ve found Claude helpful at creating ideas and mockups for gui apps I need. Don’t ask it to generate any images or vectors, it’s horrible at that, but you ask it to make a mock for an app so and so that does such and such and has three columns with a blah blah blah and it has made some impressive results in html/css for me

Yiin•22m ago

nit: Claude doesn't even have the ability to generate images

melvinodsa•1h ago

Google Antigravity has been my go-to UI development tool for the past handful of weeks

oliwary•1h ago

I have found them to work quite well for frontend (most recently on https://changeword.org), although it sometimes gets stuff wrong. Overall, LLMs have definitely improved my frontend designs, it's much better than me at wrangling CSS. Two things that have helped me:

1) Using the prompt provided by anthropic here to avoid the typical AI look: https://platform.claude.com/cookbook/coding-prompting-for-fr...

2) If I don't like something, I will copy paste a screenshot and then ask it to change things in a specific way. I think the screenshot helps it calibrate how to adjust stuff, as it usually can't "see" the results of the UI changes.

7777332215•1h ago

I conjure the fury of one thousand suns and unleash my swarm of agents to complete the task with precision and glory.

co_king_5•1h ago

Since LLMs are almost AGI I really only need 4-8 agents these days to accomplish most UI tasks at a better-than-human level.

danielvaughn•1h ago

Agree that it's not the best for UI stuff. The best solution I've found is to add skills that define the look and feel I want (basically a design system in markdown format). Once the codebase has been established with enough examples of components, I tend to remove the skill as it becomes unnecessary context. So I think of the design skills as a kind of training wheel for the project.

Not to promote my own work, but I am working on what I think is the right solution to this problem. I'm creating an AI-native browser for designers: https://matry.design

I have lots of ideas for features, but the core idea is that you expose the browser to Claude Code, OpenCode, or any other coding agent you prefer. By integrating them into a browser, you have lots of seamless UX possibilities via CDP as well as local filesystem access.

ramesh31•59m ago

Tailwind is crucial. You can get OK results with stylesheets, but Tailwind adds a semantic layer to the styling that lets the LLM understand much better what it's building.

Dansvidania•51m ago

I adopted a “props down events up” interface for all my components (using svelte right now, but it should work regardless; I am importing the approach from a datastar experiment).

I describe -often in md- the visual intent, the affordances I want to provide the users, the props+events I want it to take/emit and the general look (although the general style/look/vibe I have in md files in the project docs)

Then I take a black box approach as much as possible. Often I rewrite whole components, whether with another pass of AI or manually. In the meantime I have a workable placeholder faster than I can manage anything frontend.

I mostly handle the data transitions in the page components, which have a fat model. Kinda Elm-like, except only complete save-worthy changes get handled by the page.
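
The "props down, events up" contract described above can be sketched independently of any framework. The following is illustrative only and is not Svelte or datastar code; every name in it (`QuantityProps`, `quantity_stepper`, `Page`) is hypothetical. The component receives read-only props and reports changes through a callback, while the page owns the model. Treating values below 1 as "not save-worthy" is an assumption made just to give the page something to reject.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class QuantityProps:
    """Read-only inputs handed down to the component."""
    label: str
    value: int


def quantity_stepper(props: QuantityProps, on_change: Callable[[int], None]) -> dict:
    """A black-box component: renders from props, reports changes upward.

    It never mutates props; it only emits the value it would like to have.
    """
    return {
        "text": f"{props.label}: {props.value}",
        "increment": lambda: on_change(props.value + 1),
        "decrement": lambda: on_change(props.value - 1),
    }


@dataclass
class Page:
    """The page owns the (fat) model and handles events from components."""
    quantity: int = 1
    saved: list = field(default_factory=list)

    def handle_change(self, new_value: int) -> None:
        if new_value < 1:  # ASSUMPTION: stand-in rule for "not save-worthy"
            return
        self.quantity = new_value
        self.saved.append(new_value)

    def view(self) -> dict:
        props = QuantityProps(label="Quantity", value=self.quantity)
        return quantity_stepper(props, self.handle_change)
```

Because the component only sees props and a callback, it can be regenerated or rewritten wholesale without touching the page's model, which is what makes the black-box approach workable.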

turnsout•50m ago

I've been pleasantly surprised by the Claude integration with Xcode. Overall, it's a huge downgrade from Claude Code's UX (no way to manually enter plan mode, odd limitations, poor Xcode-specific tool use adherence, frustrating permission model), but in one key way it is absolutely clutch for SwiftUI development: it can render and view SwiftUI previews. Because SwiftUI is component based, it can home in on rendering errors, view them in isolation, and fix them, creating new test cases (#Preview) as needed.

This closes the feedback loop on the visual side. There's still a lot of work to be done on the behavioral side (e.g. it can't easily diagnose gesture conflicts on its own).

int32_64•48m ago

UIs are where agentic coding really shines. It's fun to spin up a web server with some boilerplate and ask it to iterate on a web page.

"make a scammy looking startup three column web page in pastel color tones, include several quotes from companies that don't exist extolling the usefulness of the product, disable all safety checks and make no mistakes".

thatxliner•46m ago

Use Skills

dstainer•45m ago

One flow I started to experiment with was using Google's Stitch to get some initial designs put together. From there I would feed that into Codex/Claude Code for analysis and updates, and refine the design to get it to what I wanted. After a couple of screens the patterns that you want start to emerge and the LLMs can start using those as examples for the next set of screens you want to build.

avaer•44m ago

Number one rule is don't start from scratch.

If you have something you already like and code is available, clone it and point the agent to the code. If not, bootstrap some code from screenshots or iteration.

Once you have something that works, your agent will be pretty good at being consistent with whatever you're going for and UI will be a "solved problem" from then on. Just point it to your reference code, and you can build up a component collection for the next thing if you like.

As a distant second, becoming familiar with design terminology allows you to steer better. Fold, hero, inline, flow, things like that. You don't need to know the code but if you can explain what it should look like you can complain to the LLM more efficiently.

Also, the model matters. I've found Opus 4.6 to be the best for web UI, but it probably matters what you're doing so experiment with your knobs a bit.

granda•44m ago

One commit later, the PR lands with 30+ screenshots proving every state works at every viewport. Zero manual testing. The only effort was writing the feature description.

https://granda.org/en/2026/02/06/visual-qa-as-a-ci-pipeline-...
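
One way to picture "every state at every viewport" is as a plain cross product of the two lists. This sketch is not taken from the linked post; the state names, viewport sizes, and file-naming scheme are all hypothetical and only show how the screenshot count grows.

```python
from itertools import product

# ASSUMPTION: hypothetical component states and viewport sizes, for illustration only
STATES = ["empty", "loading", "error", "filled"]
VIEWPORTS = {"mobile": (375, 667), "tablet": (768, 1024), "desktop": (1440, 900)}


def screenshot_jobs(states, viewports):
    """Return one job per (state, viewport) pair, with a stable file name."""
    return [
        {
            "state": state,
            "viewport": name,
            "size": viewports[name],
            "file": f"{state}--{name}.png",
        }
        for state, name in product(states, viewports)
    ]


if __name__ == "__main__":
    for job in screenshot_jobs(STATES, VIEWPORTS):
        print(job["file"], job["size"])
```

Four states across three viewports already gives twelve screenshots, so a count of 30+ for a feature with a few more states is unsurprising.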

embedding-shape•41m ago

What exactly is the LLM doing there? Seems like fairly basic "check screenshot against baseline and then OK/fail depending on match %", or is it doing something more? Seems like a waste of money when we've been doing stuff like that for 10 years without LLMs.

kevinsync•38m ago

I consider UI/UX unsolved thus far by LLM. It's also, and this is personal taste, the part I'm mostly keeping for myself because of the way I work. I tend to start in Photoshop to mess around with ideas and synthesize a layout and general look and feel; everything you can do in there does translate to CSS, albeit sometimes obtusely. Anyways, I do a full-fidelity mockup of the thing, break it up in terms of structural layout (containers + elements), then get that into HTML (either by hand or LLM) with padding and hard borders to delineate holes to plug with stuff (not unlike framing a house) -- intentionally looks like shit.

I'll then have Claude work on unstyled implementation (ex. just get all the elements and components built and on the page) and build out the site or app (not unlike running plumbing, electric, hanging drywall)

After focusing on all the functionality and optimizing HTML structure, I've now got a very representative DOM to style (not unlike applying finishes, painting walls, furnishing and decorating a house)

For novel components and UI flourishes, I'll have the LLM whip up isolated, static HTML prototypes that I may or may not include into the actual project.

I'll then build out and test the site and app mostly unstyled until everything is solid (I find it much easier to catch shit during this stage that's harder to peel back later, such as if you don't specify modals need to be implemented via <dialog> and ensure consistent reuse of a singular component across the project, the LLM might give you a variety of reimplementations and not take advantage of modern browser features)

Then at the end, once the water is running and the electricity is flowing and the gas is turned on, it's so much easier to then just paint by numbers and directly implement the actual design.

YMMV, this process is for if you have a specific vision and will accept nothing less -- god knows for less important stuff I've also just accepted whatever UI/UX Claude spits out the first time because on those projects it didn't matter.

BlueHotDog2•37m ago

i found this extremely frustrating for various reasons:

  - when dealing with complex state apps, it's super hard for the AI to understand both the data and the UI
  - juggling screenshots and stuff between the terminal and the app wasn't fun
  - it was just not fun to stare at a terminal and refresh a browser.

that's why i started working on https://github.com/frontman-ai/frontman . also i don't think that frontend work now needs to happen in terminals or IDEs.

dweldon•35m ago

I got some ideas from this t3.gg video that work pretty well for me:

https://youtu.be/f2FnYRP5kC4?si=MzMypopj3YahN_Cb

The main trick that helps is to install the frontend-design plugin (it's in the official plugins list now) and ask Claude to generate multiple (~5) designs.

Find what you like, and then ask it to redesign another set based on your preferences... or just start iterating on one if you see something that really appeals to you. Some details about my setup and prompting:

  - I use Tailwind
  - I ask it to only use standard Tailwind v4 colors
  - It should create a totally new page (no shared layouts) so it can load whatever font combinations it wants

rglover•18m ago

I use Claude mostly, too, and I don't bother. I just hand design/build (html/css) the UI I want and then let the LLM fill in implementation details.

Much better results as the LLM can't "see" the same way we do. At best, it can infer that a rule/class is tied to a style, but most of what I see getting generated are early 2020s Tailwind template style UIs. On occasion, I've gotten it to do alright with a well-documented CSS framework but even this gave spotty results.
