Has anyone had any success evaluating the impact of Generative AI tools such as GitHub Copilot on developer productivity or performance? I see a lot of qualitative discussion in which developers say they are more productive, but how are you measuring that impact?
Research: quantifying GitHub Copilot’s impact on developer productivity and happiness - The GitHub Blog
Result:
The group that used GitHub Copilot had a higher rate of completing the task (78%, compared to 70% in the group without Copilot).
The striking difference was that developers who used GitHub Copilot completed the task significantly faster–55% faster than the developers who didn’t use GitHub Copilot. Specifically, the developers using GitHub Copilot took on average 1 hour and 11 minutes to complete the task, while the developers who didn’t use GitHub Copilot took on average 2 hours and 41 minutes. These results are statistically significant (P=.0017) and the 95% confidence interval for the percentage speed gain is [21%, 89%].
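As a quick sanity check on the quoted figures, here is a minimal sketch of the arithmetic, assuming "faster" means the reduction in average completion time relative to the group without Copilot (the excerpt above does not spell out the formula). The function name is just for illustration.

```python
# Average completion times quoted in the study, converted to minutes.
WITH_COPILOT_MIN = 1 * 60 + 11      # 1 hour 11 minutes
WITHOUT_COPILOT_MIN = 2 * 60 + 41   # 2 hours 41 minutes


def time_reduction_pct(baseline_min: float, treated_min: float) -> float:
    """Percentage reduction in time relative to the baseline group."""
    return (baseline_min - treated_min) / baseline_min * 100


if __name__ == "__main__":
    pct = time_reduction_pct(WITHOUT_COPILOT_MIN, WITH_COPILOT_MIN)
    print(f"Reduction in average completion time: {pct:.1f}%")
```

Under that assumption the two averages give a reduction of a little under 56%, which is in line with the 55% headline number.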
Thanks for sharing. Interesting study 🤔
🤗 you are welcome.
Thanks Romano. Yes, I had seen that study (really the only one I found that had actual metrics). It's a start, but it's a fairly artificial example, since in real life we would never have a group of our developers all code the same thing. I was hoping that someone had done a live before-and-after measurement of developer productivity. The search continues . . .
Thanks Matthew. How was the 5% gain calculated? We're really looking to see if there is a way to actually measure the impact, short of doing a survey and asking the devs whether they thought they were more productive.
We measure the cycle time (from feature start until merge) across all of our development teams.
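For anyone wanting to try the same kind of before-and-after comparison, a rough sketch of a cycle-time calculation might look like the following. Everything here is an assumption for illustration: the sample records, the rollout date, the choice of median, and the function names are hypothetical and are not the poster's actual method. Real data would come from your issue tracker or Git hosting tool.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (feature start, merge time) for each feature.
FEATURES = [
    (datetime(2023, 1, 2), datetime(2023, 1, 12)),
    (datetime(2023, 1, 9), datetime(2023, 1, 17)),
    (datetime(2023, 2, 1), datetime(2023, 2, 13)),
    (datetime(2023, 4, 3), datetime(2023, 4, 10)),
    (datetime(2023, 4, 11), datetime(2023, 4, 20)),
    (datetime(2023, 5, 2), datetime(2023, 5, 10)),
]

# Assumed date on which the AI tool was rolled out to the teams.
ROLLOUT = datetime(2023, 3, 1)


def cycle_days(start: datetime, merged: datetime) -> float:
    """Cycle time in days, from feature start until merge."""
    return (merged - start).total_seconds() / 86400


def median_cycle_time(features) -> float:
    """Median cycle time in days over a list of (start, merged) pairs."""
    return median(cycle_days(s, m) for s, m in features)


def compare(features, rollout: datetime):
    """Return (median before, median after, percent reduction).

    Features are assigned to a period by their start date, so work that
    started before the rollout but merged after it counts as "before".
    """
    before = [f for f in features if f[0] < rollout]
    after = [f for f in features if f[0] >= rollout]
    b = median_cycle_time(before)
    a = median_cycle_time(after)
    return b, a, (b - a) / b * 100


if __name__ == "__main__":
    b, a, pct = compare(FEATURES, ROLLOUT)
    print(f"before: {b:.1f} days, after: {a:.1f} days, change: {pct:.1f}%")
```

A comparison like this only shows that cycle time changed between the two periods. Team changes, a different mix of work, or process changes over the same window could explain the difference just as well as the tool.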