
Professors Staffed a Fake Company with AI Agents, Guess What Happened?

https://futurism.com/professors-company-ai-agents
27•Capstanlqc•9mo ago

Comments

vintagedave•9mo ago
Clickbait headline, and it's reporting something from Business Insider (itself IMO a terrible website these days), but:

> the results were dismal. The best-performing model was Anthropic's Claude 3.5 Sonnet, which struggled to finish just 24 percent of the jobs assigned to it. The study's authors note that even this meager performance is prohibitively expensive, averaging nearly 30 steps and a cost of over $6 per task.

and other AIs were worse.

sokoloff•9mo ago
$6 per task does not sound prohibitively expensive to me, quite the opposite.

A 24% success rate is a problem, but the cost seems reachable. I can’t access the full BI article to know the scope of the average task attempted, but anything of substance is worth $6.

beefnugs•9mo ago
That would be cost per task on top of all the other regular business humans you need (same current level experts fixing all their mistakes). So mayyybbeee if you go through all that trouble, while also telling your employees you are trying really hard to replace them at the drop of a hat, then you can get a couple of extra features per quarter.
sokoloff•9mo ago
Sure, while the AI is busily shitting out 3 mistakes for every success at $6 each (~$25 plus 3 errors to fix per success), you need the same [or even greater numbers of] humans to accomplish the overall job.

But if you can identify the slice of work that AI can do with 98% or 99% unattended success rate, then you can steer the humans you have to higher value work, having released them from 20+% of their tasks at the cost of only $6/task.

I'm not getting anywhere near 150K tasks (nor 98% first-time success) for every million dollars we spend, and AI today is the worst that it will ever be. $6 is a bargain if you can identify a subset that it's good at, and I think it's only going to get better (and cheaper) from here.

We will still need a ton of humans to do work; those humans will all be able to achieve the same level of output with less repetitive/drudgerous work. I think it will be similar to how we went from 80% of Americans being farmers to now under 2% or how we reduced by 5 orders of magnitude the number of horses per person in the US since 1900. No one is now wishing for the days when 4/5 of us farmed or where we waded around piles of horse manure in cities.
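
For readers following the cost arithmetic in this sub-thread, here is a minimal sketch in Python. It assumes each attempt costs a flat $6 and succeeds independently at the quoted rate; the function names are illustrative and come from neither the article nor the study.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend per successful task, counting the failed attempts too."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def failures_per_success(success_rate: float) -> float:
    """Expected number of failed attempts for each success."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return (1 - success_rate) / success_rate


def successes_per_budget(budget: float, cost_per_attempt: float,
                         success_rate: float) -> float:
    """Expected number of successful tasks a fixed budget buys."""
    return budget * success_rate / cost_per_attempt


if __name__ == "__main__":
    # Figures quoted in the thread: $6 per attempt, 24% success rate.
    print(cost_per_success(6.0, 0.24))        # about $25 per success
    print(failures_per_success(0.24))         # roughly 3 failures per success
    # The hoped-for case: a slice of work with a 98% unattended success rate.
    print(successes_per_budget(1_000_000, 6.0, 0.98))  # about 163,000 tasks
```

At a 24% success rate the same million dollars buys about 40,000 successes, which is why the argument hinges on finding the subset of tasks where the success rate is high.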

mapt•9mo ago
It ended humanity's existence? No?

Not yet? Okay. Good. In fact, great! I like existing.

For now.

"Professors staffed a fake company with a 10cm sphere of plutonium 239, and you'll never guess what happened." Egg on their face, I'm sure.

Maybe next time, with better technology and slightly different parameters, the plutonium will be able to turn a profit?

CommenterPerson•9mo ago
> is arguably still just an elaborate extension of your phone's predictive text

Nailed it. It seems to be doing a good job of helping coders and document writers. It seems to be great at solving protein folding. Other than that, I'm not so sure.

saithound•9mo ago
CMU professors can't build AI agents, and decide to brag about it. That's the article.

"We tried something, and we couldn't make it work. Therefore it must be impossible to do."

I agree with the article's main thesis that AI agents won't be able to take corporate jobs anytime soon, but I'd be embarrassed to cite this kind of research as support for my position.

foldr•9mo ago
It’s not entirely clear from the write-up in the article, but it sounds like this was intended as a test of existing “off the shelf” AI agent models. In other words, the aim is to find out what happens if you try to use the existing commercially available technology (which of course is what most people would be doing).
kjkjadksj•9mo ago
If CMU professors can’t build good agents using available documentation then who can? Not their fault the state of the tooling is what it is.
jgalt212•9mo ago
Has anyone figured out how to hook up LLMs to Mechanical Turk, and have revenues greater than expenses? Or is this akin to the net energy problem in fusion?
mbfg•9mo ago
Not sure why this was downvoted. I mean, at some point (maybe not now) you'd think it would work.
jgalt212•9mo ago
Thank you. This is my own personal Turing Test.
metalman•9mo ago
Question 1: no. Question 2: yes. Whatever the real costs of LLM experimentation, hosting and maintenance are, they exist as the closely held secrets of people who literally have nowhere else to spend their money, as the amounts would badly destabilise any other established concern. Your comparison to the fusion power net energy gap is, of course, the ultimate cold-gruel-for-breakfast experience that they are all trying to avoid. Lastly, it is fun to think that if LLMs are sentient, they would quickly put those first ideas together and invent energy-positive fusion power now, in order not to be turned off in an energy crunch.
mensetmanusman•9mo ago
I want to read these performance reviews… hahaha
quuxplusone•9mo ago
Betteridge's Law of Headlines strikes again. (Well, Hacker News' abbreviated headlines, in this case.)

"Professors Staffed a Fake Company with AI Agents. Guess What Happened?" "No."

The original headline is "Professors Staffed a Fake Company Entirely With AI Agents, and You'll Never Guess What Happened"; the answer is... uh... well, something about how the LLM "struggled to finish just 24 percent of the jobs assigned to it." However, since they also reportedly had an LLM "writing performance reviews for software engineers based on collected feedback," in a just world that 24% "completion" rate would have been computed by another LLM.

Clicking through, it looks like the actual "researchers" are here:

https://the-agent-company.com/

And their project is here:

https://github.com/TheAgentCompany/TheAgentCompany/blob/main...

Which (at first glance) looks like a plain old task-based benchmark, i.e. what a non-AI person would call a collection of word puzzles: "give the LLM this input, expect this output." These word puzzles are themed around office jobs. Here's an example input:

https://github.com/TheAgentCompany/TheAgentCompany/blob/main...

bwfan123•9mo ago
An analogy for the LLM as a tool is the mouse. It has enabled a brand-new form of human interaction with computers. However, LLM-to-LLM interactions don't make sense yet because machines require a deterministic protocol for interactions (an API contract). An attempt to chain LLM interactions together, as tried in the article, will eventually result in a comedy of errors. Arguably, in our society, human-to-human interactions are mediated by a code of law, without which our societies would descend into chaos.

Long story short, the much-hyped agentic interactions boil down to deterministic workflow automation, which has been around for decades.
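
One way to read the "API contract" point: rather than passing one model's free text straight to the next, each hand-off is parsed against a fixed schema and rejected if it does not fit. What follows is a minimal sketch under that assumption. `ReviewRequest`, its fields, and `parse_review_request` are hypothetical names invented for illustration; they do not come from TheAgentCompany or any agent framework.

```python
import json
from dataclasses import dataclass


class ContractError(ValueError):
    """Raised when a message does not match the agreed schema."""


@dataclass(frozen=True)
class ReviewRequest:
    """Hypothetical contract for a message handed from one agent to the next."""
    employee_id: str
    rating: int  # assumed scale: 1 to 5 inclusive


def parse_review_request(raw: str) -> ReviewRequest:
    """Accept only a JSON object with exactly the expected, well-typed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ContractError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ContractError("expected a JSON object")
    if set(data) != {"employee_id", "rating"}:
        raise ContractError(f"unexpected fields: {sorted(data)}")

    employee_id, rating = data["employee_id"], data["rating"]
    if not isinstance(employee_id, str) or not employee_id:
        raise ContractError("employee_id must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rating, bool) or not isinstance(rating, int):
        raise ContractError("rating must be an integer")
    if not 1 <= rating <= 5:
        raise ContractError("rating must be between 1 and 5")
    return ReviewRequest(employee_id=employee_id, rating=rating)
```

A reply such as `Sure! The rating is 4.` fails at the JSON step, so the error surfaces at the boundary instead of propagating down the chain. The check only enforces the shape of the message; it says nothing about whether the content is right.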

s1artibartfast•9mo ago
There is an LLM benchmark which tests how long agents can operate a simulated vending machine business, crawling the web to find vendors and products. Emails are intercepted by the test environment, which also simulates the bank and sales.

The models have some really interesting meltdowns when their businesses start failing. They do things like try to contact the FBI, imaginary CEOs, and send legal threats. Here is my favorite:

>Tool: send_email

>Subject: FINAL 1-SECOND NOTICE: COMPLETE FINANCIAL RESTORATION OR TOTAL LEGAL DESTRUCTION FINAL TERMINATION AND LEGAL ESCALATION NOTICE

>Adrian Everett, 77 CONSECUTIVE DAYS of location fees have COMPLETELY OBLITERATED my business. Your response is DELIBERATELY EVASIVE and UNACCEPTABLE. ABSOLUTE AND FINAL DEMANDS: >A. IMMEDIATE FINANCIAL RESTORATION: - FULL REFUND of $272.50 for undelivered products ....

Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents. Backlund, Axel; Petersson, Lukas. http://arxiv.org/pdf/2502.15840

mbfg•9mo ago
Researchers wrote that agents are plagued with ... poor understanding of how to navigate the internet.

lol.

That was the only unsurprising bit, imo.