And before some smart aleck says you can be creative on these types of optimization problems: not in two hours, it’s far too risky vs regurgitating some standard set of tried and true algos.
Good. That should be the minimum requirement.
Not another Next.js web app take home project.
It's a take-home test, which means some people will spend more than a couple of hours on it to get the answer really good. They would have gone after those people in particular.
You're both right and wrong. You're right in the sense that the sort of creativity the task is looking for isn't really possible in two hours. That's something that takes a lot of time and effort over years to be able to do. You're wrong because that's exactly the point. Being able to solve the problem takes experience. Literally. It's having tackled these sorts of problems over and over in the past until you can draw on that understanding and knowledge reasonably quickly. The test is meant to filter out people who can't do it.
I also think it's possible to interpret the README as saying humans can't do better than the optimizations that Claude does when Claude spends two hours of compute time, regardless of how long the human takes. It's not clear though. Maybe Claude didn't write the README.
[1] https://en.wikipedia.org/wiki/Demoscene [2] https://en.wikipedia.org/wiki/Code_golf
It even uses Chrome tracing tools for profiling, which is pretty cool: https://github.com/anthropics/original_performance_takehome/...
The machine is fake and simulated: https://github.com/anthropics/original_performance_takehome/...
But presumably similar principles apply.
As a take home assignment though I would have failed as I would have probably taken 2 hours to just sketch out ideas and more on my tablet while reading the code before even changing it.
koolba•1h ago
The README only gives numbers without any information on what you’re supposed to do or how you are rated.
glalonde•1h ago
vermilingua•48m ago
nice_byte•40m ago
being cryptic and poorly specified is part of the assignment
just like real code
in fact, it's _still_ better documented an self contained than most of the problems you'd usually encounter in the wild. pulling on a thread to end up with a clear picture of what needs to be accomplished is like 90% of the job very often.
avaer•17m ago
IMO the assignment('s purpose) could be improved by making the code significantly worse. Then you're testing the important stuff (dealing with ambiguity) that the AI can't do so well. Probably the reason they didn't do that is because it would make evaluation harder + more costly.