Show HN: We made a hiring challenge because Claude can 1-shot our interviews

https://www.atomsnotelectrons.com
3•jgru•10h ago
We're Tutor Intelligence, a robotics company building generally capable robot workers for American industry. We've been thinking about what technical evaluation should look like in a world where AI agents can 1-shot our hardest hour-long coding interviews, and this is one of our first experiments.

The challenge: command 5 robots in a 60x40 warehouse to fulfill 1000 orders. Your score is the number of timesteps to complete everything. Robots can move, pick items from pallets, dock to pallets (so they move together), and fulfill orders at the edge of the warehouse. Simple rules, but the optimization problem has surprising depth (it's NP-hard in about 10 different ways) with lots of room for strategy and creativity.
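To make the scoring concrete, here is a rough baseline sketch rather than the actual challenge simulator. It assumes each order is a single pick at one pallet cell, that picking and fulfilling each cost one timestep, that any boundary cell works as a drop-off, and that robots never collide or dock. None of those details are confirmed by the rules above, and `nearest_edge` and `greedy_makespan` are hypothetical names.

```python
import random

WIDTH, HEIGHT = 60, 40  # warehouse size from the post
NUM_ROBOTS = 5          # robot count from the post


def manhattan(a, b):
    """Grid distance between two (x, y) cells, assuming 4-way movement."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nearest_edge(cell):
    """Closest boundary cell to `cell` (assumed to be a valid drop-off)."""
    x, y = cell
    candidates = [(0, y), (WIDTH - 1, y), (x, 0), (x, HEIGHT - 1)]
    return min(candidates, key=lambda e: manhattan(cell, e))


def greedy_makespan(robot_starts, pallets):
    """Give each order, in sequence, to the robot that would finish it earliest.

    Returns the timestep at which the last robot finishes (the score).
    Assumes 1 timestep per pick and 1 per fulfill; ignores collisions and docking.
    """
    robots = [(0, pos) for pos in robot_starts]  # (time when free, position)
    for pallet in pallets:
        drop = nearest_edge(pallet)
        best_i, best_finish = None, None
        for i, (free_at, pos) in enumerate(robots):
            finish = free_at + manhattan(pos, pallet) + 1 + manhattan(pallet, drop) + 1
            if best_finish is None or finish < best_finish:
                best_i, best_finish = i, finish
        robots[best_i] = (best_finish, drop)
    return max(free_at for free_at, _ in robots)


if __name__ == "__main__":
    rng = random.Random(0)
    starts = [(0, 0)] * NUM_ROBOTS
    orders = [(rng.randrange(WIDTH), rng.randrange(HEIGHT)) for _ in range(1000)]
    print("greedy baseline score:", greedy_makespan(starts, orders))
```

A baseline like this mostly shows where the depth comes from: it never reorders orders, batches picks, or moves pallets, which is where better strategies would presumably win.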

This is actually a simplified version of a real problem we work on. Warehouse coordination is one of those domains where the gap between a naive solution and a good one is enormous, and there are many valid approaches.

We built a web visualizer so you can see your solution play out, and a leaderboard if you want to submit. AI agent use is encouraged (probably necessary). So far only my cofounder and I have submitted, so we genuinely have no idea how good solutions can get.

Sharing this early because we'd love feedback on the problem design. And yes, we're hiring (that's why we made it): 70 people, Series A, based in Boston, founded out of MIT. But mostly just curious if others find this problem as interesting as we do.

Comments

dnw•10h ago
When I read the title I thought you made a challenge Claude couldn't solve but that's not what you are doing. You are taking a pragmatic approach to the world we live in. I like it.

- It would be good to put what you are planning to learn from this interview process.
- Looks like submission is only a text file. Why not ask for chat transcript?
- Also, would be useful to let people know what happens after submission/selection.

jgru•9h ago
Submission actually takes you to a full-featured visualizer! So you can go back and forth on it and see your robots run. You can then choose to submit to the leaderboard if you'd like (otherwise nothing goes to our servers), which will collect your email. No plans to do anything with the emails yet besides serve the leaderboard.

Hoping to get a better understanding of whether success on this correlates with things that we think are important for AI-enabled software engineering success. I think this is largely a question of the problem's depth, and of how much a solution still needs to be driven by the person's creativity vs. the model suggesting the next obvious idea.

theamk•10h ago
Sounds like a take-home challenge, but with a twist that you actually expect people to use AI, so that AI is not "cheating"?

Are you worried about the usual take-home challenge problem of the user getting outside help? Either friends solving the problem for them, or paid help doing the same?

jgru•9h ago
Our hope is that beyond being a take-home challenge people can get competitive and it can turn into more of a passion project. I think the problem has sufficient depth that this could be the case! Hence the leaderboard and visualization tooling.

As for cheating, we've never been too worried about it. The real cost is wasted time later in the pipeline, but at that point we can always tell whether the person lines up with the work in the take-home.
