AI tools increase seasoned developers’ task times, study finds


Contrary to popular belief, cutting-edge artificial intelligence tools slowed down experienced software developers working in codebases familiar to them, rather than supercharging their work, a new study found.

AI research nonprofit METR conducted the in-depth study on a group of seasoned developers earlier this year while they used Cursor, a popular AI coding assistant, to help them complete tasks in open-source projects they were familiar with.

Before the study, the open-source developers believed using AI would speed them up, estimating it would decrease task completion time by 24%. Even after completing the tasks with AI, the developers believed that they had decreased task times by 20%. But the study found that using AI did the opposite: it increased task completion time by 19%.


The study’s lead authors, Joel Becker and Nate Rush, said they were shocked by the results: prior to the study, Rush had written down that he expected “a 2x speed up, somewhat obviously.”

The findings challenge the belief that AI always makes expensive human engineers much more productive, a belief that has attracted substantial investment in companies selling AI products to aid software development.

AI is also expected to replace entry-level coding positions. Dario Amodei, CEO of Anthropic, recently told Axios that AI could wipe out half of all entry-level white-collar jobs in the next one to five years.

Prior literature on productivity improvements has found significant gains: one study found that using AI sped up coders by 56%, while another found that developers were able to complete 26% more tasks in a given time.


But the new METR study shows that those gains don’t apply to all software development scenarios. In particular, experienced developers intimately familiar with the quirks and requirements of large, established open-source codebases were slowed down.

Other studies often rely on software development benchmarks for AI, which sometimes misrepresent real-world tasks, the study’s authors said.

The slowdown stemmed from developers needing to spend time going over and correcting what the AI models suggested.

“When we watched the videos, we found that the AIs made some suggestions about their work, and the suggestions were often directionally correct, but not exactly what’s needed,” Becker said.


The authors cautioned that they do not expect the slowdown to apply in other scenarios, such as for junior engineers or engineers working in codebases they aren’t familiar with.

Still, the majority of the study’s participants, as well as the study’s authors, continue to use Cursor today. The authors believe it is because AI makes the development experience easier, and in turn, more pleasant, akin to editing an essay instead of staring at a blank page.

“Developers have goals other than completing the task as soon as possible,” Becker said. “So they’re going with this less effortful route.” 


