Has anyone had any success evaluating the impact of Generative AI tools such as GitHub's Copilot on developer productivity or performance? I see a lot of qualitative discussion about developers saying they are more productive, but how are you measuring that impact?
There are multiple ways to measure the productivity improvement:

- Baseline the velocity/throughput of the past X months without any code companion, then track velocity/throughput after adoption. To get a steady-state view, run for at least 3 to 6 sprints, and make sure developers are encouraged to use the tool during this period.
- Use the telemetry reports, which show how many developers are actively using the tool, how many prompts are being made and accepted, and so on. Make corrections based on this data: if teams need more training, provide it; if they need additional time to get used to the tool, allow it.
- We have also devised mechanisms to check lines of code generated, human- vs. machine-generated (check the latest announcements from GHCP for these aspects).
Once the tool is in regular use, you should see the trend move upward, and the usual quantitative and qualitative metrics of regular development will show the outcomes: code quality, velocity, time to market, and so on. A rough sketch of this kind of analysis is below.
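To make the baseline comparison concrete, here is a minimal sketch, assuming you can export per-sprint story points and suggestion telemetry to CSV. The file names and column names (`phase`, `story_points`, `suggestions_shown`, `suggestions_accepted`) are hypothetical placeholders; adapt them to whatever your real telemetry export provides.

```python
# Minimal sketch: compare pre- vs post-adoption sprint velocity and compute
# a suggestion acceptance rate from a telemetry export.
# Assumes two CSVs with hypothetical column names; adapt to your real data.
import csv
import statistics

def sprint_velocities(path):
    """Read completed story points per sprint from a CSV with columns
    sprint, story_points, phase (phase is 'baseline' or 'copilot')."""
    baseline, adopted = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            target = baseline if row["phase"] == "baseline" else adopted
            target.append(float(row["story_points"]))
    return baseline, adopted

def acceptance_rate(path):
    """Compute the overall prompt acceptance rate from a telemetry CSV with
    columns user, suggestions_shown, suggestions_accepted."""
    shown = accepted = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            shown += int(row["suggestions_shown"])
            accepted += int(row["suggestions_accepted"])
    return accepted / shown if shown else 0.0

baseline, adopted = sprint_velocities("velocity.csv")
print(f"baseline velocity : {statistics.mean(baseline):.1f} pts/sprint")
print(f"with Copilot      : {statistics.mean(adopted):.1f} pts/sprint")
print(f"acceptance rate   : {acceptance_rate('telemetry.csv'):.1%}")
```

The point of keeping baseline and post-adoption sprints in the same export is that you compare like with like: same team, same estimation scale, only the tool changed.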
I use GitHub Copilot almost daily, and I am an experienced Java developer. It genuinely makes me more productive when creating common patterns, refactoring, and writing unit tests; I am still exploring beyond that. What I can say is that it has probably roughly doubled my output. The caveat is that I still check the generated code for validity. I'd bet senior devs use the tool more efficiently, simply because their experience tells them what to ask for.
Are there any specific use cases you might be able to share from using GitHub Copilot?
Yes, these tools are very effective and efficient.
From my perspective, we proved the value of Generative AI (Development co-pilots) by focusing on two key areas. First, we measured our team's velocity, establishing a clear baseline before introducing the tool and seeing a sustained increase in story points completed per sprint afterward.
Second, we went beyond just counting pull requests. We tracked PR cycle time—the time from creation to merge—and saw a significant drop. For us, that was the key insight: we weren't just writing more code, we were delivering and merging it much faster without compromising quality.
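If you want to reproduce that PR cycle-time measurement yourself, here is a minimal sketch against the GitHub REST API (this is my own illustration, not necessarily how the poster measured it). It assumes a personal access token in the `GITHUB_TOKEN` environment variable and uses the `requests` library; `"owner"` and `"repo"` are placeholders for your repository.

```python
# Minimal sketch: median PR cycle time (creation -> merge) for one repo,
# using the GitHub REST API list-pulls endpoint.
# Assumes a token in the GITHUB_TOKEN env var; owner/repo are placeholders.
import os
import statistics
from datetime import datetime

import requests

def pr_cycle_times(owner, repo, max_pages=5):
    """Yield cycle time in hours for each merged PR (creation to merge)."""
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    for page in range(1, max_pages + 1):
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}/pulls",
            params={"state": "closed", "per_page": 100, "page": page},
            headers=headers,
        )
        resp.raise_for_status()
        for pr in resp.json():
            if pr.get("merged_at"):  # skip PRs that were closed unmerged
                created = datetime.fromisoformat(pr["created_at"].rstrip("Z"))
                merged = datetime.fromisoformat(pr["merged_at"].rstrip("Z"))
                yield (merged - created).total_seconds() / 3600

times = list(pr_cycle_times("owner", "repo"))
print(f"median PR cycle time: {statistics.median(times):.1f} hours")
```

Comparing the median of this number for a window before and after tool rollout gives you the "creation to merge" trend described above, without relying on self-reported productivity.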