TL;DR: Upwork benchmarked human+AI collaboration on 322 real, simple freelance jobs (e.g., lead generation, basic coding). Putting a human expert in the loop raised the job completion rate by up to 70% compared with agents working alone.
How: They used real, paid jobs ($10-$200 range) and had expert freelancers create detailed rubrics to score the deliverables. The "human-in-the-loop" model involved the expert evaluating the agent's work, providing feedback, and guiding it to a final, client-ready state. The dataset is dynamic and based on actual client demand, not static tasks.
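To make the loop concrete, here is a rough Python sketch of rubric scoring plus expert feedback rounds. Everything in it is an assumption for illustration: the `Criterion` type, the equal weights, the pass threshold, the round limit, and the toy agent are hypothetical and do not come from Upwork's benchmark.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Criterion:
    """One rubric line; `check` stands in for the expert's judgment (hypothetical)."""
    name: str
    weight: float
    check: Callable[[str], bool]


def score(deliverable: str, rubric: List[Criterion]) -> float:
    """Weighted fraction of rubric criteria the deliverable satisfies."""
    total = sum(c.weight for c in rubric)
    passed = sum(c.weight for c in rubric if c.check(deliverable))
    return passed / total


def failed_criteria(deliverable: str, rubric: List[Criterion]) -> List[str]:
    """Names of unmet criteria, used here as the expert's feedback."""
    return [c.name for c in rubric if not c.check(deliverable)]


def run_job(
    agent: Callable[[List[str]], str],
    rubric: List[Criterion],
    threshold: float = 1.0,
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Return (deliverable, completed, feedback_rounds_used).

    max_rounds=0 models an agent working alone (no expert feedback).
    """
    deliverable = agent([])
    for rounds_used in range(max_rounds):
        if score(deliverable, rubric) >= threshold:
            return deliverable, True, rounds_used
        # The expert reviews the draft and sends back what is missing.
        deliverable = agent(failed_criteria(deliverable, rubric))
    return deliverable, score(deliverable, rubric) >= threshold, max_rounds


# Toy lead-gen example (all data made up).
RUBRIC = [
    Criterion("names_company", 1.0, lambda d: "Acme" in d),
    Criterion("has_contact_email", 1.0, lambda d: "@" in d),
]


def toy_agent(feedback: List[str]) -> str:
    """Adds a contact email only after being told it is missing."""
    draft = "Acme Corp"
    if "has_contact_email" in feedback:
        draft += ", sales@acme.example"
    return draft


solo = run_job(toy_agent, RUBRIC, max_rounds=0)
guided = run_job(toy_agent, RUBRIC, max_rounds=3)
```

In this toy setup the solo run fails the rubric while the guided run passes after one feedback round, which is the kind of gap the completion-rate comparison is measuring.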
hiby007•1h ago