AI’s Limits Exposed: New Study Finds Machines Struggle With Real Remote Work

Research shows AI agents fail most remote tasks, with the top performer automating just 2.5% of freelance work.

The study, called the Remote Labor Index (RLI), represents one of the most detailed attempts so far to measure AI’s performance on practical digital work. It focuses on tasks that mirror real online freelancing jobs rather than theoretical tests or benchmark problems.

Researchers collected 240 completed projects from professional freelancers working through platforms such as Upwork. Each project included the original brief, all input materials, and the final deliverable that a client had accepted.

The projects span 23 categories of work, including product design, animation, architecture, game development, and data analysis. Together they represent more than 6,000 hours of paid labor valued at about $140,000.

Six advanced AI agents were then tested on the same projects. The systems included Manus, Grok 4, Sonnet 4.5, GPT-5, ChatGPT agent, and Gemini 2.5 Pro.

Author's summary: AI agents struggle with remote work, with the top performer automating only 2.5% of tasks.

Digital Information World — 2025-11-01
