
I failed to recreate the 1996 Space Jam Website with Claude

https://j0nah.com/i-failed-to-recreate-the-1996-space-jam-website-with-claude/
229•thecr0w•6h ago•193 comments

The C++ standard for the F-35 Fighter Jet [video]

https://www.youtube.com/watch?v=Gv4sDL9Ljww
149•AareyBaba•5h ago•141 comments

Evidence from the One Laptop per Child Program in Rural Peru

https://www.nber.org/papers/w34495
52•danso•3h ago•19 comments

Mechanical power generation using Earth's ambient radiation

https://www.science.org/doi/10.1126/sciadv.adw6833
10•defrost•1h ago•4 comments

Google Titans architecture, helping AI have long-term memory

https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/
344•Alifatisk•11h ago•109 comments

Dollar-stores overcharge cash-strapped customers while promising low prices

https://www.theguardian.com/us-news/2025/dec/03/customers-pay-more-rising-dollar-store-costs
185•bookofjoe•8h ago•263 comments

An Interactive Guide to the Fourier Transform

https://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/
116•pykello•5d ago•14 comments

A two-person method to simulate die rolls

https://blog.42yeah.is/algorithm/2023/08/05/two-person-die.html
36•Fraterkes•2d ago•19 comments

XKeyscore

https://en.wikipedia.org/wiki/XKeyscore
75•belter•2h ago•57 comments

Build a DIY magnetometer with a couple of seasoning bottles

https://spectrum.ieee.org/listen-to-protons-diy-magnetometer
53•nullbyte808•1w ago•13 comments

Bag of words, have mercy on us

https://www.experimental-history.com/p/bag-of-words-have-mercy-on-us
4•ntnbr•57m ago•1 comment

The Anatomy of a macOS App

https://eclecticlight.co/2025/12/04/the-anatomy-of-a-macos-app/
168•elashri•10h ago•41 comments

The state of Schleswig-Holstein is consistently relying on open source

https://www.heise.de/en/news/Goodbye-Microsoft-Schleswig-Holstein-relies-on-Open-Source-and-saves...
494•doener•10h ago•234 comments

Scala 3 slowed us down?

https://kmaliszewski9.github.io/scala/2025/12/07/scala3-slowdown.html
154•kmaliszewski•8h ago•87 comments

Java Hello World, LLVM Edition

https://www.javaadvent.com/2025/12/java-hello-world-llvm-edition.html
159•ingve•11h ago•54 comments

Proxmox delivers its software-defined datacenter contender and VMware escape

https://www.theregister.com/2025/12/05/proxmox_datacenter_manager_1_stable/
28•Bender•2h ago•1 comment

Nested Learning: A new ML paradigm for continual learning

https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
55•themgt•8h ago•2 comments

Estimates are difficult for developers and product owners

https://thorsell.io/2025/12/07/estimates.html
128•todsacerdoti•4h ago•148 comments

Minimum Viable Arduino Project: Aeropress Timer

https://netninja.com/2025/12/01/minimum-viable-arduino-project-aeropress-timer/
3•surprisetalk•5d ago•0 comments

iced 0.14 has been released (Rust GUI library)

https://github.com/iced-rs/iced/releases/tag/0.14.0
40•airstrike•2h ago•21 comments

Semantic Compression (2014)

https://caseymuratori.com/blog_0015
46•tosh•6h ago•5 comments

Over fifty new hallucinations in ICLR 2026 submissions

https://gptzero.me/news/iclr-2026/
434•puttycat•10h ago•338 comments

Syncthing-Android has had a change of owner/maintainer

https://github.com/researchxxl/syncthing-android/issues/16
100•embedding-shape•3h ago•23 comments

Z2 – Lithographically fabricated IC in a garage fab

https://sam.zeloof.xyz/second-ic/
327•embedding-shape•20h ago•73 comments

Context Plumbing (Interconnected)

https://interconnected.org/home/2025/11/28/plumbing
5•gmays•5d ago•0 comments

Building a Toast Component

https://emilkowal.ski/ui/building-a-toast-component
77•FragrantRiver•4d ago•28 comments

The programmers who live in Flatland

https://blog.redplanetlabs.com/2025/11/24/the-programmers-who-live-in-flatland/
69•winkywooster•1w ago•86 comments

The past was not that cute

https://juliawise.net/the-past-was-not-that-cute/
388•mhb•1d ago•476 comments

Screenshots from developers: 2002 vs. 2015 (2015)

https://anders.unix.se/2015/12/10/screenshots-from-developers--2002-vs.-2015/
435•turrini•1d ago•215 comments

How the Disappearance of Flight 19 Fueled the Legend of the Bermuda Triangle

https://www.smithsonianmag.com/history/how-the-disappearance-of-flight-19-a-navy-squadron-lost-in...
45•pseudolus•11h ago•12 comments

I failed to recreate the 1996 Space Jam Website with Claude

https://j0nah.com/i-failed-to-recreate-the-1996-space-jam-website-with-claude/
227•thecr0w•6h ago

Comments

stonecharioteer•5h ago
I'm curious. Did you ask it to use tables and no CSS?

In 1996, we only had CSS1. Ask it to use tables to do this, perhaps.

lagniappe•5h ago
We actually had a myriad of competing specs. CSS proper wasn't released until December 1996.
thecr0w•5h ago
Any names for the competing specs? Maybe I could try re-prompting with that direction.
lagniappe•5h ago
Models are trained with content scraped from the net, for the most part. The availability of content pertaining to those specs is almost nil, and of no SEO value. Ergo, models for the most part will only have a cursory knowledge of a spec that your browser will never be able to parse because that isn't the spec that won.

Nonetheless, here is a link to a list of the specs you asked for: https://www.w3.org/Style/History/Overview.en.html

boie0025•3h ago
Thanks for sharing that. I read through a lot of this. Interesting to read those perspectives in the context of today.
lagniappe•3h ago
Much obliged. Have a good weekend. Your new gray hairs are en route :)
wanderingstan•4h ago
There were specs competing for adoption, but only tables (the old way) and CSS were actually adopted by browsers. So no point trying to use some other positioning technique.
thecr0w•5h ago
Yes yes great question!

I tried your suggestion and also tried giving it various more general versions of the limitations presented by earlier generations.

Claude's instinct initially was actually to limit itself to less modern web standards.

Unfortunately, nothing got those planets to be in the right place.

Wowfunhappy•5h ago
Claude is not very good at using screenshots. The model may technically be multi-modal, but its strength is clearly in reading text. I'm not surprised it failed here.
fnordpiglet•4h ago
Especially since it decomposes the image into a semantic vector space rather than the actual grid of pixels. Once the image is transformed into patch embeddings, all sense of pixels is entirely destroyed. The author demonstrates a profound lack of understanding of how multimodal LLMs function that a simple query of one would elucidate immediately.

The right way to handle this is not to build it grids and whatnot, which all get blown away by the embedding encoding but to instruct it to build image processing tools of its own and to mandate their use in constructing the coordinates required and computing the eccentricity of the pattern etc in code and language space. Doing it this way you can even get it to write assertive tests comparing the original layout to the final among various image processing metrics. This would assuredly work better, take far less time, be more stable on iteration, and fits neatly into how a multimodal agentic programming tool actually functions.
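For what it's worth, the "compute it in code and language space" idea can be sketched in a few lines. This is a minimal illustration, not anything from the post; the function name `ring_fit_error` is hypothetical, and it assumes the planet centers have already been detected and are roughly evenly spaced around the ring (so their centroid approximates the ring's center):

```python
import math

def ring_fit_error(centers):
    """Estimate the shared orbital circle from detected planet centers
    and report the worst radial deviation, in pixels.

    Assumes the planets are roughly evenly spaced around the ring, so
    the centroid of their centers approximates the ring's center."""
    cx = sum(x for x, _ in centers) / len(centers)
    cy = sum(y for _, y in centers) / len(centers)
    # Distance of each planet from the estimated center.
    radii = [math.hypot(x - cx, y - cy) for x, y in centers]
    radius = sum(radii) / len(radii)
    max_dev = max(abs(r - radius) for r in radii)
    return radius, max_dev
```

An agent can iterate until `max_dev` drops below a pixel or two, which gives it an objective target instead of eyeballing screenshots.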

mcbuilder•3h ago
Yeah, this is exactly what I was thinking. LLMs don't have precise geometrical reasoning from images. Having an intuition of how the models work is actually a defining skill in "prompt engineering".
dcanelhas•4h ago
Even with text, parsing content in 2D seems to be a challenge for every LLM I have interacted with. Try getting a chatbot to make an ascii-art circle with a specific radius and you'll see what I mean.
Wowfunhappy•2h ago
I don't really consider ASCII art to be text. It requires a completely different type of reasoning. A blind person can understand text if it's read out loud. A blind person really can't understand ASCII art if it's read out loud.
throwaway314155•5h ago
Somehow I suspect Claude Code (in an interactive session with trial, error, probing, critiquing, perusing, and all the other benefits you get) would do better. This example seems to assume Claude can do things in "one shot" (even the later attempts all seem to conceal information like it's a homework assignment).

That's not how to successfully use LLMs for coding in my experience. It is, however, perhaps a good demonstration of Claude's poor spatial reasoning skills. Another good demonstration of this is the twitch.tv/ClaudePlaysPokemon stream, where Claude has been failing to beat Pokémon for months now.

thecr0w•5h ago
Not a homework assignment, and no deliberate attempt to conceal information, just very long and repetitive logs. A lot of the same "insights" so I just didn't provide them here.

> That's not how to successfully use LLMs for coding in my experience.

Yeah agree. I think I was just a little surprised it couldn't one-shot given the simplicity.

CharlesW•5h ago
Using https://github.com/anthropics/claude-code/tree/main/plugins/... with style-supporting instructions and context would've improved the outcome as well.
thecr0w•5h ago
thank you! I'll try this
aidos•4h ago
Is the skill effectively just adding a little extra context here though? Doesn’t strike me as the sort of context that would improve the outcome.

https://github.com/anthropics/claude-code/blob/main/plugins/...

999900000999•5h ago
Space Jam website design as an LLM benchmark.

This article is a bit negative. Claude gets close; it just can't get the order right, which is something OP can manually fix.

I prefer GitHub Copilot because it's cheaper and integrates with GitHub directly. I'll have times where it'll get it right, and times when I have to try 3 or 4 times.

thecr0w•5h ago
Ya, this is true. Another commenter also pointed out that my intention was to one-shot. I didn't really go too deeply into trying multiple iterations.

This is also fairly contrived, you know? It's not a realistic limitation to rebuild HTML from a screenshot because of course if I have the website loaded I can just download the HTML.

Retric•4h ago
It’s not unrealistic to want to revert to an early version of something you only have a screenshot of.
swatcoder•4h ago
> rebuild HTML from a screenshot

???

This is precisely the workflow when a traditional graphic designer mocks up a web/app design, which still happens all the time.

They sketch a design in something like Photoshop or Illustrator, because they're fluent in these tools and many have been using them for decades, and somebody else is tasked with figuring out how to slice and encode that design in the target interactive tech (HTML+CSS, SwiftUI, QT, etc).

Large companies, design agencies, and consultancies with tech-first design teams have a different workflow, because they intentionally staff graphic designers with a tighter specialization/preparedness, but that's a much smaller share of the web and software development space than you may think.

There's nothing contrived at all about this test and it's a really great demonstration of how tools like Claude don't take naturally to this important task yet.

thecr0w•4h ago
You know, you're totally right and I didn't even think about that.
smallnix•5h ago
That's not the point of the article. It's about Claude/LLM being overconfident in recreating pixel perfect.
jacquesm•1h ago
All AIs are overconfident. It's impressive what they can do, but it is at the same time extremely unimpressive what they can't do while passing it off as the best thing since sliced bread. 'Perfect! Now I see the problem.' 'Thank you for correcting that, here is a perfect recreation of problem 'x' that will work with your hardware.' (never mind the 10 glaring mistakes).

I've tried these tools a number of times and spent a good bit of effort on learning to maximize the return. By the time you know what prompt to write you've solved the problem yourself.

GeoAtreides•4h ago
>which is something OP can manually fix

what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong? that's the main issue: if it fails here, it will fail with other things, in not such obvious ways.

godelski•1h ago
I think that's the main problem with them. It is hard to figure out when they're wrong.

As the post shows, you can't trust them when they think they solved something but you also can't trust them when they think they haven't[0]. The things are optimized for human preference, which ultimately results in this being optimized to hide mistakes. After all, we can't penalize mistakes in training when we don't know the mistakes are mistakes. The de facto bias is that we prefer mistakes that we don't know are mistakes than mistakes that we do[1].

Personally I think a well-designed tool makes errors obvious. As a tool user, that's what I want, and it's what makes tool use effective. But LLMs flip this on its head, making errors difficult to detect. Which is incredibly problematic.

[0] I frequently see this in a thing it thinks is a problem but actually isn't, which makes steering more difficult.

[1] Yes, conceptually unknown unknowns are worse. But you can't measure unknown unknowns, they are indistinguishable from knowns. So you always optimize deception (along with other things) when you don't have clear objective truths (most situations).

bigstrat2003•4h ago
> it just can't get the order right which is something OP can manually fix.

If the tool needs you to check up on it and fix its work, it's a bad tool.

markbao•3h ago
“Bad” seems extreme. The only way to pass the litmus test you’ve described is for a tool to be 100% perfect, so then the graph looks like 99.99% “bad tool” until it reaches 100% perfection.

It’s not that binary imo. It can still be extremely useful and save a ton of time if it does 90% of the work and you fix the last 10%. Hardly a bad tool.

It’s only a bad tool if you spent more time fixing the results than building it yourself, which sometimes used to be the case for LLMs but is happening less and less as they get more capable.

a4isms•3h ago
If you show me a tool that does a thing perfectly 99% of the time, I will stop checking it eventually. Now let me ask you: How do you feel about the people who manage the security for your bank using that tool? And eventually overlooking a security exploit?

I agree that there are domains for which 90% good is very, very useful. But 99% isn't always better. In some limited domains, it's actually worse.

999900000999•1h ago
Counterpoint.

Humans don't get it right 100% of the time.

wvenable•3h ago
Perfection is the enemy of good.
mrweasel•3h ago
That's my thinking. If I need to check up on the work, then I'm equally capable of writing the code myself. It might go faster with an LLM assisting me, and that feels perfectly fine. My issue is when people use the AI tools to generate something far beyond their own capabilities. In those cases, who checks the result?
godelski•25m ago
I wouldn't go that far, but I do believe good tool design tries to make its failure modes obvious. I like to think of it similar to encryption: hard to do, easy to verify.

All tools have failure modes and truthfully you always have to check the tool's work (which is your work). But being a master craftsman is knowing all the nuances behind your tools, where they work, and more importantly where they don't work.

That said, I think that also highlights the issue with LLMs and most AI. Their failure modes are inconsistent and difficult to verify. Even with agents and unit tests you still have to verify and it isn't easy. Most software bugs are created from subtle things, often which compound. Which both those things are the greatest weaknesses of LLMs: nuance and compounding effects.

So I still think they aren't great tools, but I do think they can be useful. But that also doesn't mean it isn't common for people to use them well outside the bounds of where they are generally useful. It'll be fine a lot of times, but the problem is that it is like an alcohol fire[0]; you don't know what's on fire because it is invisible. Which, after all, isn't that the hardest part of programming? Figuring out where the fire is?

[0] https://www.youtube.com/watch?v=5zpLOn-KJSE

supern0va•5h ago
Honestly, if you had shown this article to me even eighteen months ago, I would have been blown away at how good of a job Claude did.

It's remarkable how high our expectations have been steadily creeping.

WhyOhWhyQ•5h ago
This comment is missing the point. The real goal of all this is not to amaze. It's to create better software. Let's graduate past the amazement phase into the realism phase as soon as possible. What parts of my project is the LLM for? That is the real question worth asking.
supern0va•5h ago
Oh, to be clear, this isn't a criticism. I think it's super cool that we're moving onto the nitpick/refinement phase of this tech. :)
dreadnip•5h ago
Why involve an LLM in this? Just download the site?
ChrisArchitect•4h ago
Yeah, Internet Archive has lots of copies https://web.archive.org/web/20250000000000*/https://www.spac... also

What's with the panicked pleas and need to preserve the site, assuming locally...?

GeoAtreides•4h ago
The post is clearly about something else than preserving https://www.spacejam.com/1996/

It seems to me the post is about how Claude fails to recreate a very simple website from 1996.

sigseg1v•5h ago
Curious if you've tested something such as:

- "First, calculate the orbital radius. To do this accurately, measure the average diameter of each planet, p, and the average distance from the center of the image to the outer edge of the planets, x, and calculate the orbital radius r = x - p"

- "Next, write a unit test script that we will run that reads the rendered page and confirms that each planet is on the orbital radius. If a planet is not, output the difference you must shift it by to make the test pass. Use this feedback until all planets are perfectly aligned."
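A minimal sketch of what that test's feedback could look like, assuming the planet centers and ring center have already been measured; the name `planet_shift_report` is hypothetical:

```python
import math

def planet_shift_report(centers, ring_center, target_radius):
    """For each planet center, report the radial shift (in pixels)
    needed to land exactly on the target orbital radius.
    Positive means move outward, negative means move inward."""
    cx, cy = ring_center
    return {i: target_radius - math.hypot(x - cx, y - cy)
            for i, (x, y) in enumerate(centers)}
```

Feeding these deltas back to the agent gives it the "output the difference you must shift it by" signal described above.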

turnsout•4h ago
Yes, this is a key step when working with an agent—if they're able to check their work, they can iterate pretty quickly. If you're in the loop, something is wrong.

That said, I love this project. haha

monsieurbanana•4h ago
I'm trying to understand why this comment got downvoted. My best guess is that "if you're in the loop, something is wrong" is interpreted as there should be no human involvement at all.

The loop here, imo, refers to the feedback loop. And it's true that ideally there should be no human involvement there. A tight feedback loop is as important for llms as it is for humans. The more automated you make it, the better.

turnsout•2h ago
Yes, maybe I goofed on the phrasing. If you're in the feedback loop, something is wrong. Obviously a human should be "in the loop" in the sense that they're aware of and reviewing what the agent is doing.
Aurornis•4h ago
This is my experience with using LLMs for complex tasks: If you're lucky they'll figure it out from a simple description, but to get most things done the way you expect requires a lot of explicit direction, test creation, iteration, and tokens.

One of the keys to being productive with LLMs is learning how to recognize when it's going to take much more effort to babysit the LLM into getting the right result as opposed to simply doing the work yourself.

jacquesm•1h ago
It would not be the first time that an IT services provider makes more money the worse their products perform.
jazzyjackson•24m ago
Re: tokens, there is a point where you have to decide what's worth it to you. I'd been unimpressed with what I could get out of chat apps, but when I wanted to do a Rails app that would cost me thousands in developer time and some weeks between communication, Zoom meetings, and iteration... I bit the bullet and kept topping up the Claude API and spent about $500 on Opus over the course of a weekend, but the site is done and works great.
thecr0w•4h ago
Hm, I didn't try exactly this, but I probably should!

Wrt the unit test script: let's take Claude out of the equation; how would you design the unit test? I kept running into either Claude or some library not being capable of consistently identifying planet vs. non-planet, which was hindering Claude's ability to make decisions based on fine detail or "pixel coordinates", if that makes sense.

cfbradford•2h ago
Do you give Claude the screenshot as a file? If so I’d just ask it to write a tool to diff each asset to every possible location in the source image to find the most likely position of each asset. You don’t really need recognition if you can brute force the search. As a human this is roughly what I would do if you told me I needed to recreate something like that with pixel perfect precision.
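The brute-force diff described here is straightforward to sketch. A toy version over plain 2-D lists of grayscale values (a real implementation would use numpy or OpenCV template matching for speed; `best_match` is a hypothetical name):

```python
def best_match(image, asset):
    """Slide `asset` over `image` (both 2-D lists of grayscale values)
    and return the (row, col) offset with the smallest sum of absolute
    differences -- i.e. the most likely position of the asset."""
    H, W = len(image), len(image[0])
    h, w = len(asset), len(asset[0])

    def sad(r, c):  # sum of absolute differences at one offset
        return sum(abs(image[r + i][c + j] - asset[i][j])
                   for i in range(h) for j in range(w))

    # Exhaustive search over every valid placement.
    return min(((r, c) for r in range(H - h + 1) for c in range(W - w + 1)),
               key=lambda rc: sad(*rc))
```

Since the original assets are known exactly, this sidesteps the "is it a planet?" recognition problem entirely.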
bluedino•1h ago
Congratulations, we finally created 'plain English' programming languages. It only took 1/10th of the worlds electricity and 40% of the semiconductor production.
zitterbewegung•5h ago
In actual workflows someone would accept a very close reproduction and fix the small issues. Generally I use systems to get close enough to a scaffolding and / or make small incremental improvements and direct its design
micromacrofoot•5h ago
I wouldn't call it entirely defeated, it got maybe 90% of the way there. Before LLMs you couldn't get 50% of the way there in an automated way.

> What he produces

I feel like personifying LLMs more than they currently are is a mistake people make (though humans always do this), they're not entities, they don't know anything. If you treat them too human you might eventually fool yourself a little too much.

thecr0w•3h ago
As a couple other comments pointed out, it's also not fair to judge Claude based on a one shot like this. I sort of assume these limitations will remain even if we went back and forth but to be fair, I didn't try that more than a few times in this investigation. Maybe on try three it totally nails it.
bdcravens•4h ago
A comparison with Codex would be good. I haven't done it with Codex, but when working through problems using ChatGPT, it does a great job when given screenshots.
daemonologist•4h ago
Interesting - these models are all trained to do pixel-level(ish) measurement now, for bounding boxes and such. I wonder if you could railroad it into being accurate with the right prompt.
Lerc•4h ago
What models are good at this? I have tried passing images to models and asking them for coordinates for specific features, then overlaid dots on those points and passed that image back to the model so it has a perception of how far out it was. It had a tendency to be consistently off by a fixed amount without getting closer.

I don't doubt that it is possible eventually, but I haven't had much luck.

Something that seemed to assist was drawing a multi coloured transparent chequerboard, if the AI knows the position of the grid colours it can pick out some relative information from the grid.

daemonologist•4h ago
I've found Qwen3-VL to be fairly accurate at detection (though it doesn't always catch every instance). Note that it gives answers as per-mille-ages, as if the image was 1000x1000 regardless of actual resolution or aspect ratio.

I have also not had luck with any kind of iterative/guess-and-check approach. I assume the models are all trained to one-shot this kind of thing and struggle to generalize to what are effectively relative measurements.
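If the per-mille convention trips you up, the conversion back to real pixels is a one-liner; `permille_box_to_pixels` is a hypothetical helper illustrating the convention described above:

```python
def permille_box_to_pixels(box, width, height):
    """Convert a bounding box reported on a virtual 1000x1000 canvas
    (the per-mille convention described above) into pixel coordinates
    for an image of the given real width and height."""
    x0, y0, x1, y1 = box
    return (round(x0 * width / 1000), round(y0 * height / 1000),
            round(x1 * width / 1000), round(y1 * height / 1000))
```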

sdenton4•4h ago
Feels like the "right" approach would be to have it write some code to measure how far off the elements are in the original vs recreated image, and then iterate using the numerical output of the program...
johncoatesdev•4h ago
You last-minute cancelled coffee with your friends to work on this? I'm not sure how I would feel if a friend did that to me.
fishtoaster•4h ago
Based on the later life updates, I suspect this was being humorous.

> After these zoom attempts, I didn't have any new moves left. I was being evicted. The bank repo'd my car. So I wrapped it there.

syassami•4h ago
We've lost the capability to build such marvels.

https://knowyourmeme.com/memes/my-father-in-law-is-a-builder...

barfoure•4h ago
Lost it at wooden structures at the playground.
a-dub•4h ago
maybe ask it to use 1990s table based layout approaches?
al_borland•3h ago
Interesting. I just looked at the page source and it is in fact using a table layout. I always assumed it was an image map, which I assume would be even more obscure for the LLM.
thecr0w•3h ago
We should check the Wayback Machine, but in my memory this was built with an image map. Maybe like, 10 years ago or something. I was googling around when writing this post and saw that there are folks still tasked with making sure it's up and running. I wonder if they migrated it to tables at some point in the last decade.
smoghat•4h ago
Ok, so here is an interesting case where Claude was almost good enough, but not quite. But I’ve been amusing myself by taking abandoned Mac OS programs from 20 years ago that I find on GitHub and bringing them up to date to work on Apple silicon. For example, jpegview, which was a very fast and simple slideshow viewer. It took about three iterations with Claude code before I had it working. Then it was time to fix some problems, add some features like playing videos, a new layout, and so on. I may be the only person in the world left who wants this app, but well, that was fine for a day long project that cooked in a window with some prompts from me while I did other stuff. I’ll probably tackle scantailor advanced next to clean up some terrible book scans. Again, I have real things to do with my time, but each of these mini projects just requires me to have a browser window open to a Claude code instance while I work on more attention demanding tasks.
egeozcan•4h ago
Side note: as a person who started using a Mac in March, I found Phoenix Slides really good.
smoghat•2h ago
It is! I was really just curious if I could update this old codebase without getting my hands dirty.
skrebbel•3h ago
> Ok, so here is an interesting case where Claude was almost good enough, but not quite.

You say that as if that’s uncommon.

jonplackett•2h ago
This should be the strap line for all AI (so far)
smoghat•2h ago
That's fair. But I always think of it as an intern I am paying $20 a month for or $200 a month. I would be kind of shocked if they could do everything as well as I'd hoped for that price point. It's fascinating for me and worth the money.

I am lucky that I don't depend on this for work at a corporation. I'd be pulling my hair out if some boss said "You are going to be doing 8 times as much work using our corporate AI from now on."

jonplackett•56m ago
Don't get me wrong, doing 80% of my work for me is still great. And I'm actually quite glad I'm still needed for the other 20%.
pluc•4h ago
I like how the author calls a script on the internet "him".
thecr0w•4h ago
lol
NooneAtAll3•3h ago
better than using a plural for a single entity
shwaj•3h ago
“it”
lillesvin•20m ago
Come on, just stop. "They" have been used to refer to singular antecedents since the 14th century. (Source: https://www.oed.com/discover/a-brief-history-of-singular-the...)
docheinestages•4h ago
> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.

Why not use wget to mirror the website? Unless you're being sarcastic.

$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

Source: https://superuser.com/questions/970323/using-wget-to-copy-we...

thecr0w•4h ago
The stuff about not being able to download it is a bit of a joke and I don't think the tone landed with everybody haha. This was just an experiment to see if Claude could recreate a simple website from a screenshot, of course to your point you could download it if you wanted.
malfist•4h ago
Because that wasn't the goal of this exercise
hestefisk•3h ago
Would be interesting to see whether Gemini could crack this problem.
th0ma5•3h ago
I personally don't understand why asking these things to do things we know they can't do is supposed to be productive. Maybe for getting around restrictions or fuzzing. I don't see it as an effective benchmark unless it can link directly to the ways the models are being improved. But to look at random results that are sometimes valid, and to think more iterations of randomness will eventually give way to control, is a maddening perspective to me; perhaps I need better language to describe this.
thecr0w•3h ago
I think this is a reasonable take. I think for me, I like to investigate limitations like this in order to understand where the boundaries are. Claude isn't impossibly bad at analyzing images. It's just pixel perfect corrections that seem to be a limitation. Maybe for some folks it's enough to just read that but for me, I like to feel like I have some good experiential knowledge about the limitations that I can keep in my brain and apply appropriately in the future.
iwontberude•3h ago
Apropos given Warner Brothers Discovery just sold to Netflix
soared•3h ago
I got quite close with Gemini 3 Pro in AI Studio. I uploaded a screenshot (no assets) and the results were similar to OP's. It failed to follow my fix initially, but I told it to follow my directions (lol) and it came quite close (though portrait mode distorted it; landscape was close to perfect).

“Reference the original uploaded image. Between each image in the clock face, create lines to each other image. Measure each line. Now follow that same process on the app we’ve created, and adjust the locations of each image until all measurements align exactly.”

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...
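The "create lines between each image and measure them" prompt amounts to comparing pairwise distance matrices. A minimal sketch (the name `layout_distance_error` is hypothetical; note that pairwise distances ignore translation of the whole layout, so a full check would also compare absolute positions):

```python
import math

def layout_distance_error(original, recreated):
    """Compare two layouts by the pairwise distances between their
    element centers and return the largest mismatch in pixels.
    Assumes both lists give the elements in the same order."""
    n = len(original)
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = math.dist(original[i], original[j])
            d_new = math.dist(recreated[i], recreated[j])
            worst = max(worst, abs(d_orig - d_new))
    return worst
```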

jacobsenscott•3h ago
> there's no other way to do it besides getting Claude to recreate it from a screenshot

And

> I'm an engineering manager

I can't tell if this is an intentional or unintentional satire of the current state of AI mandates from management.

thecr0w•3h ago
lololol
dmd•3h ago
i can’t tell if your comment is satire or not
chilmers•3h ago
You really can’t tell? Perhaps the bar for AGI is lower than I thought.
master_crab•3h ago
Honest question: does he know about F5? Or was it intentional to use screenshots when source is available?
thuttinger•3h ago
Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things. There are a lot of problems that are easy to get right for a junior web dev but impossible for an LLM. On the other hand, I was able to write a C program that added gamma color profile support to linux compositors that don't support it (in my case Hyprland) within a few minutes! A - for me - seemingly hard task, which would have taken me at least a day or more if I didn't let Claude write the code. With one prompt Claude generated C code that compiled on first try that:

- Read an .icc file from disk

- parsed the file and extracted the VCGT (video card gamma table)

- wrote the VCGT to the video card for a specified display via amdgpu driver APIs

The only thing I had to fix was the ICC parsing, where it would parse header strings in the wrong byte-order (they are big-endian).
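The byte-order point is easy to get right with explicit struct format strings. A minimal sketch of reading two ICC header fields (this is not the commenter's actual C code, just an illustration; ICC data is big-endian throughout, so every format string needs `>`):

```python
import struct

def parse_icc_header(data):
    """Read two fields from a 128-byte ICC profile header. ICC files
    are big-endian throughout, so every struct format string must use
    '>' -- leaving it off gives the byte-order bug described above."""
    size, = struct.unpack_from(">I", data, 0)   # profile size, bytes 0-3
    if data[36:40] != b"acsp":                  # mandatory file signature
        raise ValueError("not an ICC profile")
    return size
```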

littlecranky67•2h ago
> Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things

Because the rendered output (pixels, not HTML/CSS) is not fed as data in the training. You will find tons of UI snippets and questions, but they rarely include screenshots. And when they do, they are not scraped.

Wowfunhappy•2h ago
Interesting thought. I wonder if Anthropic et al could include some sort of render-html-to-screenshot as part of the training routine, such that the rendered output would get included as training data.
KaiserPro•41m ago
That's basically a VLM, but the problem is that describing the world requires a better understanding of the world. Hence why LeCun is talking about world models. (It's also cutting-edge for teaching robots to manipulate and plan manipulations.)
btown•26m ago
Even better, a tool that can tell the rendered bounding box of any set of elements, and what the distances between pairs of elements are, so it can make adjustments if relative positioning doesn't match its expectation. This would be incredible for SVG generation for diagrams, too.
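A sketch of what such a tool's core could look like, with boxes in the (x, y, width, height) form that e.g. Playwright's bounding_box() reports (the function name here is made up):

```python
def pairwise_gaps(boxes):
    """Given element-name -> (x, y, width, height) rectangles, as a
    browser tool like Playwright's bounding_box() reports them, return
    the (dx, dy) offset between the centers of every pair of elements,
    so relative positioning can be checked numerically."""
    centers = {name: (x + w / 2, y + h / 2) for name, (x, y, w, h) in boxes.items()}
    names = sorted(centers)
    return {
        (a, b): (centers[b][0] - centers[a][0], centers[b][1] - centers[a][1])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

The model could then compare these offsets against the same measurement on the reference screenshot and nudge elements until the numbers agree.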
jacquesm•1h ago
Claude didn't write that code. Someone else did and Claude took that code without credit to the original author(s), adapted it to your use case and then presented it as its own creation to you and you accepted this. If a human did this we probably would have a word for them.
Mtinie•1h ago
> If a human did this we probably would have a word for them.

I don’t think it’s fair to call someone who used Stack Overflow to find a similar answer, with samples of code to copy into their project, an asshole.

jacquesm•1h ago
Who brought Stack Overflow up? Stack Overflow does not magically generate code, someone has to actually provide it first.
Mtinie•1h ago
I generally agree with your underlying point concerning attribution and intellectual property ownership but your follow-up comment reframes your initial statement: LLMs generate recombinations of code from code created by humans, without giving credit.

Stack Overflow offers access to other peoples’ work, and developers combine those snippets and patterns into their own projects. I suspect attribution is low.

jacquesm•1h ago
Stack Overflow deals with that issue by having a license agreement.
sublinear•1h ago
Using stack overflow recklessly is definitely asshole behavior.
bluedino•1h ago
It has been for the last 15 years.
giancarlostoro•1h ago
You mean like copying and pasting code from Stack Overflow?
idiotsecant•1h ago
Yes, the word for that is software developer.
FanaHOVA•1h ago
Are you saying that every piece of code you have ever written contains a full source list of every piece of code you previously read to learn specific languages, patterns, etc?

Or are you saying that every piece of code you ever wrote was 100% original and not adapted from any previous codebase you ever worked in or any book / reference you ever read?

jacquesm•1h ago
What's with the bad takes in this thread. That's two strawmen in one comment, it's getting a bit crowded.
DangitBobby•1h ago
Or the original point doesn't actually hold up to basic scrutiny and is indistinguishable from straw itself.
jacquesm•34m ago
HN has guidelines for a reason.
incr_me•22m ago
You're adhering to an excess of rules, methinks!
bsaul•1h ago
That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.

Do you have papers to back this up? That was also my reaction when I saw some really crazy accurate comments on some vibe-coded piece of code, but I couldn't prove it, and thinking about it now I think my intuition was wrong (i.e., LLMs do produce original complex code).

jacquesm•1h ago
We can settle that question in an intuitive way: if human input is not what is driving the output, then it would be sufficient to present it with a fraction of the current inputs - say, everything up to 1970 - and have it generate all of the input data from 1970 onwards as output.

If that does not work then the moment you introduce AI you cap their capabilities unless humans continue to create original works to feed the AI. The conclusion - to me, at least - is that these pieces of software regurgitate their inputs, they are effectively whitewashing plagiarism, or, alternatively, their ability to generate new content is capped by some arbitrary limit relative to the inputs.

andrepd•1h ago
Excellent observation.
andsoitis•59m ago
I like your test. Should we also apply it to specific humans?

We all stand on the shoulders of giants and learn by looking at others’ solutions.

jacquesm•36m ago
That's true. But if we take your implied rebuttal, then current-level AI would be able to learn from current AI as well as it would learn from humans, just like humans learn from other humans. But so far that does not seem to be the case; in fact, AI companies do everything they can to avoid eating their own tail. They'd love eating their own tail if it was worth it.

To me that's proof positive they know their output is mangled inputs, they need that originality otherwise they will sooner or later drown in nonsense and noise. It's essentially a very complex game of Chinese whispers.

andsoitis•12m ago
I share that perspective.
fpoling•1h ago
Pick up a book about programming from the seventies or eighties that was unlikely to be scanned and fed into an LLM. Take a task from it and ask the LLM to write a program for it that even a student can solve within 10 minutes. If the problem was not really published before, the LLM fails spectacularly.
crawshaw•1h ago
This does not appear to be true. Six months ago I created a small programming language. I had LLMs write hundreds of small programs in the language, using the parser, interpreter, and my spec as a guide for the language. The vast majority of these programs were either very close or exactly what I wanted. No prior source existed for the programming language because I created it whole cloth days earlier.
jazzyjackson•30m ago
Obviously you accidentally recreated a language from the 70s :P

(I created a template language for JSON and added branching and conditionals and realized I had a whole programming language. I was really proud of my originality until I was reading Ted Nelson's Computer Lib/Dream Machines and found out I had reinvented TRAC, and to some extent, XSLT. Anyway, LLMs are very good at reasoning about it because it can be constrained by a JSON schema. People who think LLMs only regurgitate haven't given them a fair shot.)

zahlman•16m ago
FWIW, I think a JSON-based XSLT-like thing sounds far more enjoyable to use than actual XSLT, so I'd encourage you to show it off.
fpoling•30m ago
Languages with reasonable semantics are rather similar and LLMs are good at detecting that and adapting from other languages.
anjel•56m ago
Sometimes it's generated, and many times it's not. Trivial to denote, but it's been deemed none of your business.
mlinsey•1h ago
Certainly if a human wrote code that solved this problem, and a second human copied and tweaked it slightly for their use case, we would have a word for them.

Would we use the same word if two different humans wrote code that solved two different problems, but one part of each problem was somewhat analogous to a different aspect of a third human's problem, and the third human took inspiration from those parts of both solutions to create code that solved a third problem?

What if it were ten different humans writing ten different-but-related pieces of code, and an eleventh human piecing them together? What if it were 1,000 different humans?

I think "plagiarism", "inspiration", and just "learning from" fall on some continuous spectrum. There are clear differences when you zoom out, but they are in degree, and it's hard to set a hard boundary. The key is just to make sure we have laws and norms that provide sufficient incentive for new ideas to continue to be created.

whatshisface•38m ago
The key difference between plagiarism and building on someone's work is whether you say, "this is based on code by linsey at github.com/socialnorms" or "here, let me write that for you."
CognitiveLens•22m ago
but as mlinsey suggests, what if it's influenced in small, indirect ways by 1000 different people, kind of like the way every 'original' idea from trained professionals is? There's a spectrum, and it's inaccurate to claim that Claude's responses are comparable to adapting one individual's work for another use case - that's not how LLMs operate on open-ended tasks, although they can be instructed to do that and produce reasonable-looking output.

Programmers are not expected to add an addendum to every file listing all the books, articles, and conversations they've had that have influenced the particular code solution. LLMs are trained on far more sources that influence their code suggestions, but it seems like we actually want a higher standard of attribution because they (arguably) are incapable of original thought.

sarchertech•13m ago
If the problem you ask it to solve has only one or a few examples, or if there are many cases of people copy pasting the solution, LLMs can and will produce code that would be called plagiarism if a human did it.
nextos•31m ago
In the case of LLMs, due to RAG, very often it's not just learning but almost direct real-time plagiarism.
ekropotin•1h ago
> If a human did this we probably would have a word for them.

What do you mean? A programmer's work is literally combining existing patterns into solutions for problems.

Aeolun•1h ago
Software engineer? You think I cite all the code I’ve ever seen before when I reproduce it? That I even remember where it comes from?
fooker•42m ago
> If a human did this we probably would have a word for them.

Humans do this all the time.

goneskiiiing•19m ago
Please read anything about ml before commenting obvious nonsense like this.
nvllsvm•8m ago
> Someone else did

Who?

chongli•1h ago
Why is this something a Wayland compositor (a glorified window manager) needs to worry about? Apple figured this out back in the 1990s with ColorSync and they did it once for the Mac OS and any application that wanted colour management could use the ColorSync APIs.
hedgehog•1h ago
Color management infrastructure is intricate. To grossly simplify: somehow you need to connect together the profile and LUT for each display, upload the LUTs to the display controller, and provide appropriate profile data for each window to their respective processes. During compositing, convert any buffers that don't already match the output (unmanaged applications will probably be treated as sRGB; color-managed graphics apps will opt out of conversion and do whatever is correct for their purpose).
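To make the LUT step concrete, a minimal sketch of building a gamma ramp of the kind a compositor might upload to the display controller (the table size and 16-bit scaling are typical, not universal):

```python
def gamma_lut(gamma, size=256, max_out=65535):
    """Build a VCGT-style ramp: output[i] = (i / (size-1)) ** (1/gamma),
    scaled to 16 bits, roughly what gets uploaded to a display
    controller's gamma registers. Sizes and scaling vary by hardware."""
    return [round(((i / (size - 1)) ** (1.0 / gamma)) * max_out) for i in range(size)]
```

A gamma above 1.0 lifts the midtones while keeping the endpoints pinned at 0 and full scale.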
chongli•29m ago
Yes, but why is the compositor dealing with this? Shouldn't the compositor simply be deciding which windows go where (X, Y, and Z positions) and leave the rendering to another API? Why does every different take on a window manager need to re-do all this work?
zahlman•13m ago
I mean, when I hear the word "compositing" I definitely imagine something that involves "alpha" blending, and doing that nicely (instead of a literal alpha calculation) is going to involve colour management.
pfix•3h ago
I checked the source of the original (like maybe many of you) to see how they actually did it, and it was... simpler than expected. I drilled myself so hard to forget tables-as-layout... and here it is. So simple it's a marvel.
COAGULOPATH•2h ago
And they do hacky things like space elements vertically using <br> tags.
fortyseven•3h ago
Look at that stupid dog. It's reading a book, but it's really trashy YA. It's not even Shakespeare. Dogs are stupid.
thecr0w•3h ago
lol, that is fair criticism
satisfice•3h ago
But the dog's owner keeps saying that it ONLY reads Shakespeare. The dog's owner wants millions of dollars for the dog on that basis.

I'm not mad at the dogs, I'm mad at the stupid investors and the lying owners.

vmg12•3h ago
We don't know how to build it anymore
stwsk•3h ago
>Look, I still need this Space Jam website recreated.

Now that's a novel sentence if I've ever read one.

nickdothutton•3h ago
I have recently been working on something "fun" in the terminal that mingles plain ASCII, ANSI "graphics", actual bitmaps (Sixel), and Nerdfonts in a TUI framework (Charm etc). After a week of smashing Claude's head against a wall, which is better than smashing my own, I've had to significantly alter my hopes and expectations.
isoprophlex•3h ago
I have a very weird tangential nit to pick: gendering LLMs. I swear I'm not pushing any sort of gender agenda/discussion that can be had anytime anywhere else in the current age, but to me there is something quintessentially a-gendered about the output of a computer program.

Calling Claude (or GPT-5 or Gemini or my bash terminal for that matter) a "he" seems absurd to the point of hilarity.

In my mind, they've always firmly been "it"s.

bojan•2h ago
This felt quirky to me as well, possibly because my native language is strictly gendered.
DocTomoe•2h ago
Hm, Claude is a common male first name, especially in Europe. That plays into it. Also many people - including me - have personalised their AI chats, given them names, even something resembling a personality (it's easy with prefix prompts). Why others do it, who knows; I do it because I find it a lot less frustrating when ChatGPT fucks up while pretending to be a young adult female klutz.
fragmede•17m ago
Sounds like the setup for a sexist comedian's routine. "Y'know, ChatGPT is totally a woman because she reminds me of my wife. She thinks it knows everything and is convinced she's right, when she's totally full of shit! And what's the deal with airline food?" Swap the gender depending on your target audience.

In other languages, chairs have a gender, along with other everyday items like scissors, and it doesn't especially make logical sense, although you can squint and tell a story as to why something is the gender that's been assigned. Thus making the gender of AI simply a matter of "that's just how things are".

johnfn•2h ago
Context is king. The problem is that you are the one currently telling Claude how close it is and what to do next. But if you give it the tools to do that itself, it will make a world of difference.

Give Claude a way to iteratively poke at what it created (such as a Playwright harness), a screenshot of what you want, and maybe a way to take a screenshot in Playwright, and I think you will get much closer. You might even be able to one-shot it.

I’ve always wondered what would happen if I gave it a screenshot and told it to iterate until the Playwright screenshot matched the mock screenshot, pixel perfect. I imagine it would go nuts, but after a few hours I think it would likely get it. (Either that or minor font discrepancies and rounding errors would cause it to give up…)
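The scoring half of that loop is simple; a toy version over 2D pixel grids (real screenshots would need decoding first, plus some tolerance for font antialiasing):

```python
def mismatch_fraction(img_a, img_b):
    """Fraction of pixels that differ between two same-sized images,
    represented as 2D grids of pixel values. An agent loop can minimize
    this number instead of eyeballing the layout."""
    total = sum(len(row) for row in img_a)
    diff = sum(
        1
        for row_a, row_b in zip(img_a, img_b)
        for pa, pb in zip(row_a, row_b)
        if pa != pb
    )
    return diff / total
```

Iterating until this drops below some threshold is exactly the "pixel perfect" stopping condition, minus the rendering discrepancies that would keep it from ever hitting zero.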

sqircles•2h ago
> The Space Jam website is simple: a single HTML page, absolute positioning for every element...

Absolute positioning wasn't available until CSS2 in 1998. This is just a table with crafty use of align, valign, colspan, and rowspan.

DocTomoe•2h ago
Which would also render differently on every machine, based on browser settings, screen sizes, and available fonts.

Like the web was meant to be. An interpreted hypertext format, not a pixel-perfect brochure for marketing execs.

jeanlucas•1h ago
>Like the web was meant to be.

what?

masswerk•1h ago
Hum, table cells provide the max-width and images a min-width; heights are absolute (with table cells spilling over, as with CSS "overflow-y: visible"); aligns and maybe HSPACE and VSPACE attributes do the rest. As long as image heights exceed the effective line-height and there's no visible text, this should render pixel perfect on any browser then in use. In this case, there's also an absolute width set for the entire table, adding further constraints. Table layouts can be elastic, with constraints or without, but this one should be pretty stable.

(Fun fact, the most amazing layout foot-guns back then: effective font sizes and line-heights are subject to platform and configuration (e.g., Win vs Mac); Netscape does paragraph spacing at 1.2em, IE at 1em (if this matters, prefer `<br>` over paragraphs); frame dimensions in Netscape are always calculated as integer percentages of window dimensions, even if you provide absolute dimensions in pixels, while IE does what it says on the tin (a rare example), so they will be the same only by chance and effective rounding errors. And, of course, screen gamma is different on Win and Mac, so your colors will always be messed up – aim for a happy medium.)

shortformblog•2h ago
Claude can't properly count the number of characters in a sentence. It's asking a lot to assume it can get pixel perfect.
simonw•2h ago
I wonder if Gemini 3 Pro would do better at this particular test? They're very proud of its spatial awareness and vision abilities.
thenumpaduser•2h ago
We are actually spoiled at this point.
echelon•2h ago
[flagged]
dang•2h ago
Could you please stop posting this sort of indignant-sensational comment? It's not what this site is for, as you know (or should know).

https://news.ycombinator.com/newsguidelines.html

echelon•1h ago
Dang, can you explain how this is indignant or sensational?

Anthropic's leadership and researchers continue to this day to post messages saying engineering will be fully automated. I can go find recent messages on X if you'd like.

This forum is composed mostly of engineers, who will be the most impacted if their vision of the world pans out.

YC depends on innovation capital to make money. If the means of production are centralized, how does YC make any money at all from engineers? Such a world will be vertically and horizontally integrated, not democratically spread for others to take advantage of.

Now I don't think that's what's going to happen, but that's what the messaging has been and continues to be from Anthropic's leadership, researchers, and ICs.

Why should we support companies like this?

Shouldn't we advocate for open models, where any market participant can fully utilize and explore the competitive gradients?

I don't think I'm saying anything controversial here.

Furthermore, if this pans out like it seems it will - a set of three or four AI hyperscalers - we'll also be in the same situation we have today with the big tech hyperscalers.

Due to a lax regulatory environment, these companies put a ceiling on startup exits by funding internal competition, buying competitors, etc. I don't see how the situation will improve in an AI world.

If you're a capitalist, you want competition to be fierce and fair. You don't want concentration of power.

I can see how an Anthropic IC might not like this post, but this should be fairly reasonable for everyone else who would like to see more distribution of power.

DocTomoe•2h ago
I keep wondering ... is this a good benchmark? What is a practical use case for the skills Claude is supposed to present here? And if the author needs that particular website re-created with pixel-perfect accuracy, wouldn't it be simpler to just do it yourself?

Sure, you can argue this is some sort of modern ACID test - but the ACID tests checked for real-world use cases. This feels more like "I have this one very specific request, the machine doesn't perfectly fulfill it, so the machine is at fault." Complaining from a high pedestal.

I'm more surprised at how close Claude got with its reimagined Space Jam site.

sema4hacker•2h ago
> The total payload is under 200KB.

Just out of curiosity, how big was what you considered Claude's best attempt to be?

tehjoker•2h ago
Hmm, you note that the problem is the LLM doesn't have enough image context, but then zoom the image in more?

Why not downscale the image and feed it as a second input, so that entire planets fit into a patch, and instruct it to use the downsampled image for coarse coordinate estimation?
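For that coarse pass, plain block averaging is enough; a stdlib-only sketch (assumes the image is a 2D grid of grayscale values, truncating any leftover edge pixels):

```python
def downscale(img, factor):
    """Downscale a 2D grayscale grid by averaging factor x factor
    blocks, truncating any leftover edge pixels. Large features (whole
    planets) then fit inside a single patch for a coarse location pass."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            // (factor * factor)
            for x in range(0, w - factor + 1, factor)
        ]
        for y in range(0, h - factor + 1, factor)
    ]
```

Coordinates found in the small image just multiply back up by the factor to seed a fine-grained search in the original.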

960design•2h ago
Claude argued with me about the quadratic equation the other day. It vehemently insisted a -c was required, whereas a c was the correct answer. I pointed this out, showing it step by step, and it finally agreed. I tried Grok to see if it could get it right. Nope: the exact same response as Claude, but Grok never backed down, even after the step-by-step explanation of the maths.
kristofferc•2h ago
Can you link to the conversation log?
pbhjpbhj•2h ago
In the generalised solution there is a '-c' term with coefficient '4a'...?

I'm not well at the moment, perhaps your ML model has flu?!!
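For reference, the generalised solution in question is

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```

so the discriminant contains a $-4ac$ term; whether a lone $c$ or $-c$ shows up in an intermediate step depends on which side of the equation terms were moved to, which is exactly where sign slips creep in, for humans and LLMs alike.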

rickcarlino•1h ago
I look forward to an alternative reality where AI vendors race to have the model with the best Space Jam Bench scores.
jacquesm•1h ago
It beats pelicans by a considerable margin.
computersuck•1h ago
Why not just host a copy from waybackmachine?
Madmallard•1h ago
I wrote a 20,000 line multiplayer battle-arena game in XNA back in 2015 with manually coded physics (so everything is there in the code) and have tried several times with Claude, Gemini, Grok, DeepSeek, and GPT to translate it to JavaScript.

They all fail massively 100% of the time. Even if I break it down into chunks once they get to the chunks that matter the most (i.e. physics, collision detection and resolution, event handling and game logic) they all break down horribly and no amount of prompting back and forth will fix it.

RagnarD•1h ago
Why not just feed it the actual instructions that create the site - the page source code, the HTML, CSS, JS if any?
masswerk•1h ago
This may be an even greater challenge: analysing a table layout and recreating it in CSS with absolute positioning.
computersuck•1h ago
this is just AI brainrot disease

Help, I can't recreate a website with AI! There's no other way, no way I could fix up some HTML code! Believe me, I'm an engineering manager with a computer science degree!

Absolutely disgusting.

derektank•1h ago
I’m pretty confident the author was being sarcastic
victorbuilds•1h ago
Building something similar - using Claude API to generate mini games from text descriptions (https://codorex.com, still pretty rough).

Can confirm: Claude is weirdly good at generating functional game logic from vague prompts, but spatial precision is a constant battle. Anything involving exact pixel positions needs validation/correction layers on top.

The suggestion upthread about having it write its own measurement tools seems promising - haven't tried that approach yet.

victorbuilds•1h ago
Actually, here's a relevant example - had Claude generate an interactive solar system with circular orbits and moons: https://codorex.com/shared/yeABdJWvRHAKqHs2kxpRnZNZPWmqWSu9

Interesting that it handles this fine (functional orbital mechanics, animation) but would probably struggle to recreate the exact pixel positions of the Space Jam layout. Confirms the pattern: good at "make something like X" but bad at "recreate X exactly."

anorwell•1h ago
The article does not say at any point which model was used. This is the most basic important information when talking about the capabilities of a model, and probably belongs in the title.
ErrantX•1h ago
I just feel this is a great example of someone falling into the common trap of treating an LLM like a human.

They are vastly less intelligent than a human and logical leaps that make sense to you make no sense to Claude. It has no concept of aesthetics or of course any vision.

All that said; it got pretty close even with those impediments! (It got worse because the writer tried to force it to act more like a human would)

I think a better approach would be to write a tool to compare screenshots, identify misplaced items, and output that as a text finding/failure state. Claude will work much better because you're dodging the bits that are too interpretive (the bits humans rock at and LLMs don't).

sallveburrpi•51m ago
> vastly less intelligent than a human

I would phrase it more as: they are a completely alien “intelligence” that can't really be compared to human intelligence.

oncallthrow•36m ago
Yes, currently they are both smarter and dumber than humans.
manmal•1h ago
I would put Claude into a loop and let it take screenshots itself, diffing them against the original screenshot, until it has found the right arrangement for the planets' starting positions (a pixel-perfect match).
epgui•1h ago
At that point you may as well just do the work yourself.
hooo•1h ago
Why? Just give it access to the playwright mcp server.
manmal•30m ago
I'm using this; it works really well and doesn't pollute context as much:

https://github.com/steipete/agent-scripts/blob/main/scripts/...

manmal•31m ago
What I'm describing is probably a few minutes of exploring and writing a good prompt, vs. what, 4h of CSS wrangling?
fluidcruft•1h ago
I would try giving it a tool to work with projections and process axis by axis, to see if it works better in 1D than 2D. I dunno how clever Claude is about signal processing though. There's no noise here, so really it's just template matching, without rotation and without noise. But I doubt Claude can do or reason about basic image processing.
neuroelectron•1h ago
My web-dev friend saw the original Space Jam site. I asked him what it would cost to build something like that today. He paused and said:

We can’t. We don’t know how to do it.

Aeolun•1h ago
I think claude could have easily used a script to calculate the positions of the planets exactly here, instead of trying to use the frankly horrible image recognition.
fluidcruft•53m ago
Claude could probably have written a script to do it. I'm hunting assets to confirm.

Edit: yup very easy for claude. Most of the wall time was spent installing cv2 in a venv (uv dropped python 3.14 on claude, which seemed to get stuck compiling numpy or something so I interrupted and told claude to just use python 3.13 instead).

ME: In this directory there are two images: @p-pressbox.gif and @screenshot.png. @p-pressbox.gif is an image that exists somewhere inside @screenshot.png Write a script that finds the precise location

CLAUDE: Looking at the images, I can see that p-pressbox.gif (the spaceship with "PRESS BOX SHUTTLE" text) appears in the upper-left area of screenshot.png. I'll write a Python script using OpenCV to find the precise location.

[ some venv stuff/telling claude to use uv and python 3.13 ]

CLAUDE: The script found the precise location of p-pressbox.gif within screenshot.png:

  | Measurement         | Value           |
  |---------------------|-----------------|
  | Top-left corner     | (87, 110)       |
  | Bottom-right corner | (218, 166)      |
  | Center              | (152, 138)      |
  | Template size       | 131 × 56 pixels |
  | Match confidence    | 97.2%           |

  The script is at /tmp/search/find_image.py and uses OpenCV's template matching with normalized cross-correlation.
ME: As part of the output create a visualization of the @p-pressbox.gif positioned on top of the original screenshot

CLAUDE: Done! The visualization shows a green rectangle highlighting the exact location where p-pressbox.gif was found within the screenshot. The match is at coordinates (87, 110) to (218, 166).

I confirmed the result was correct.
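For reference, the heart of that script is just template matching; a dependency-free toy version of the exact-match case (OpenCV's matchTemplate does the robust, normalized cross-correlation version that tolerates compression artifacts):

```python
def find_template(image, template):
    """Brute-force exact template match over 2D pixel grids: slide the
    template across the image and return the (x, y) of the first
    perfect overlap, or None if it never matches."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            if all(
                image[y + dy][x + dx] == template[dy][dx]
                for dy in range(th)
                for dx in range(tw)
            ):
                return (x, y)
    return None
```

Swap the equality test for a correlation score and you get the 97.2% confidence figure Claude reported.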

BiteCode_dev•27m ago
Tell Claude to put the screenshot as a centered image, with the body having the starry background on repeat. Then define the links as boxes over each icon with an old little tech trick called an image map.

Common at the time, before Flash took over.
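A minimal sketch of that trick; the hrefs and the circle coords here are made up, and the rectangle reuses the press-box coordinates measured elsewhere in this thread:

```html
<!-- 90s-style layout: one big image, clickable rectangles mapped over it.
     coords are x1,y1,x2,y2 for rects and cx,cy,radius for circles. -->
<body background="stars.gif">
  <img src="spacejam-screenshot.png" usemap="#jam" border="0">
  <map name="jam">
    <area shape="rect" coords="87,110,218,166" href="pressbox.html" alt="Press Box Shuttle">
    <area shape="circle" coords="320,240,45" href="jumpstation.html" alt="Jump Station">
  </map>
</body>
```

No positioning logic at all; the browser just hit-tests clicks against the regions, so pixel-perfection comes for free from the screenshot itself.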

wilsmex•26m ago
Well, this was interesting. As someone who was actually building similar websites in the late 90s, I threw this into Opus 4.5. Note that the original author is wrong about the original site, however:

"The Space Jam website is simple: a single HTML page, absolute positioning for every element, and a tiling starfield GIF background.".

This is not true: the site is built using tables, with no positioning at all; CSS wasn't a thing back then...

Here was its one-shot attempt at building the same type of layout (table based) with a screenshot and assets as input: https://i.imgur.com/fhdOLwP.png

manbash•12m ago
Ah, those days, where you would slice your designs and export them to tables.
manlymuppet•15m ago
Couldn’t you just feed Claude all the raw HTML from the website and have it “decrypt” that?

The entire website is fairly small so this seems feasible.

Usually there’s a big difference between a website’s final code and its source code because of post processing but that seems like a totally solvable Claude problem.

Sure, LLMs aren't great with images, but it's not like the person who originally wrote the Space Jam website was meticulously messing around with positioning from a reference image to create a circular orbit - they just used the tools they had to create an acceptable result. Claude can do the same.

Perhaps the best method is to re-create, rather than replicate the design.