Innovation Insight for AI Software Engineering Agents

1 September 2025 - ID G00830388 - 17 min read

By Philip Walsh, Manjunath Bhat, and 3 more

AI software engineering agents are driving a shift from using interactive code assistants to systems capable of planning and executing tasks semiautonomously. Software engineering leaders who fail to prepare risk creating a costly, fragmented and harder-to-manage development environment that erodes rather than improves productivity.

Overview

Key Findings

The productivity gains from software engineering (SWE) agents risk being offset by significant shifts in organizational burden, particularly around governance, security, platform engineering and operations.
SWE agents are a nascent technology, but they have reached a level of practical utility sufficient to drive growing enterprise interest and early-stage experimentation.
The most effective early use cases for SWE agents tend to be tightly scoped tasks with clear, measurable outcomes such as bug fixes, unit test suites, refactoring and API endpoint creation. Higher-complexity scenarios currently achieve less consistent outcomes.
SWE agents are a catalyst for the broader shift toward AI-native software engineering, requiring changes in roles, skills, processes and workflows alongside technology adoption.

Recommendations

Strengthen platform and enablement teams to manage the governance, security and operational demands of SWE agent integration, including the curation of code, documentation and processes that underpin effective context engineering.
Prioritize integration flexibility to enable easier adoption of new agent capabilities. Develop your team’s proficiency in using interoperability standards such as Model Context Protocol (MCP) and agent-to-agent (A2A) interfaces to guide platform and vendor choices.
Start SWE agent adoption with tasks where the required information is readily accessible and the agent has proven capability. Scale to more complex cases gradually, using pilots to manage expectations and build organizational confidence.
Evaluate SWE agents within the broader framework of AI-native software engineering. This should include people, process and workflow impacts, alongside evaluation of the technology itself.

Strategic Planning Assumptions

By 2028, asynchronous SWE agent workflows will improve software engineering team productivity by 30% to 50%, surpassing the 0% to 20% gains from AI code assistants in 2025.

By 2028, engineering organizations without dedicated agent governance and platform teams will find their productivity gains from SWE agents offset by coordination overhead and operational complexity.

Introduction

AI developer tools are evolving from first-generation code assistants toward more autonomous software engineering agents. Unlike integrated development environment (IDE)-based assistants focused on code completion, SWE agents plan and execute multistep workflows, maintain persistent context and interact with repositories, documentation and runtime environments. This shift expands their role from supporting isolated tasks to orchestrating broader portions of the software development life cycle (SDLC).

The 2025 Gartner AI in Software Engineering Survey reflects this evolution, showing steady adoption of agentic capabilities within IDEs: 16% report using Cursor and 4% report using Windsurf. Furthermore, 13% report using Anthropic’s Claude Code (a command line interface [CLI]-based agent) and 4% report using Cognition’s Devin (a remote cloud-based agent), which enable asynchronous, programmable workloads beyond IDE boundaries and support a one-to-many relationship where individual developers can deploy multiple agents.

Yet this evolution brings new challenges. Organizations face a fragmented tool landscape and must weigh productivity gains against rising burdens in governance, security and platform operations. Early use cases such as bug fixes, test generation, refactoring and API creation show the most consistent value, while more complex scenarios remain less reliable. The key question for enterprises is how to adopt SWE agents in ways that maximize utility while containing risk. The answer lies in starting with high-confidence, narrow tasks, strengthening platform and enablement teams and investing in interoperability standards to ensure integration flexibility and future scalability. Ultimately, software engineering leaders should evaluate SWE agents not in isolation, but as part of a broader strategy for moving toward AI-native software engineering, which encompasses changes in roles, skills, processes, teams and workflows (see Innovation Insight on AI-Native Software Engineering).

Description

Definition

AI software engineering agents (SWE agents) are autonomous or semiautonomous software systems designed to support the software development process, from requirements engineering through deployment. They use AI techniques to perceive, plan, take action and achieve goals in the software engineering environment.

Software engineering leaders must understand the core capabilities that define SWE agents to deploy them effectively (see Figure 1).

Figure 1: AI Software Engineering Agent Capabilities

AI software engineering agents have three core capabilities: goal-oriented planning, autonomous task execution and contextual perception. These are surrounded by seven software-engineering-specific capabilities, including tool orchestration, auditability and ecosystem support.

SWE agents are, first and foremost, AI agents, and thus follow the same general pattern as any agent: plan, perceive and act. Foundational capabilities include:

Goal-oriented planning — AI agents accept high-level objectives, devise execution strategies and evaluate relevant trade-offs.
Contextual perception — AI agents observe, interpret and maintain awareness of relevant inputs, system states and environmental signals, including natural-language instructions, workflow context and system artifacts.
Autonomous task execution — AI agents independently perform multistep tasks, engaging in self-directed problem decomposition, solution implementation and continuous task refinement.

Building on these foundational capabilities, SWE agents possess six distinct capabilities tailored explicitly to software engineering contexts:

Persistent context across multitask workflows — SWE agents maintain persistent context across multiple interconnected (sub)tasks required to achieve a goal. Rather than being limited to specific isolated tasks, such as code reviews or test coverage, these agents track and sustain context across multistep workflows.
SDLC tool orchestration — SWE agents are capable of coordinating activities across repositories, CI/CD systems, build environments and deployment platforms. They are capable of invoking and chaining external tools, allowing cohesive workflows across multiple phases of the SDLC.
Diverse ecosystem support — SWE agents support a wide range of programming languages, surfaces, frameworks and domains, primarily driven by natural language interaction.
Support for multiple product team personas — SWE agents support use cases for product owners, quality engineers, software developers, user experience (UX) designers, architects and DevSecOps engineers. This may include coordinating tasks across members in a team by seeking approvals or sending notifications.
Variety of user interaction modalities — Emerging implementation patterns for SWE agents include:
- Terminal/CLI-based agents focusing on composability, automation and DevOps integration.
- Cloud-based agents running in isolated environments, facilitating asynchronous task delegation.
- IDE-integrated agents enabling synchronous, low-latency interactions with developers.
- API-driven agents offering flexible integration into diverse development toolchains.
Auditability and guardrails — SWE agents provide audit trails enabling interpretation of their reasoning processes. They ensure safe operations through human-in-the-loop controls and scoped permissions.

Architecturally, AI agents typically incorporate perception systems (consuming information from application life cycle management tools, code repositories, CI/CD pipelines, runtime environments and observability tools), reasoning engines (planning solutions and evaluating trade-offs) and execution systems (modifying code and executing commands). Advanced implementations support context injection and tool calling via MCP. Future implementations will support multiagent coordination through standards like A2A, facilitating interoperability across platforms. Agents also vary by autonomy: from approval-required agents ensuring transparency at each step, through semiautonomous modes with periodic human guidance, to fully autonomous operations with minimal intervention.
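The perceive, plan and act pattern and the autonomy levels described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any vendor's implementation: the `Autonomy` enum, the `run_agent` function, the `approve` callback and the hard-coded plan are all hypothetical stand-ins for perception systems, a reasoning engine and an execution system.

```python
from enum import Enum
from typing import Callable, Dict, List


class Autonomy(Enum):
    """Hypothetical autonomy modes mirroring the three levels in the text."""
    APPROVAL_REQUIRED = "approval_required"  # a human approves every step
    SEMIAUTONOMOUS = "semiautonomous"        # a human approves only risky steps
    AUTONOMOUS = "autonomous"                # no approval is requested


def perceive(environment: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for perception: snapshot the signals the agent can see."""
    return dict(environment)


def plan(goal: str, context: Dict[str, str]) -> List[dict]:
    """Stand-in for a reasoning engine: return an ordered list of steps.

    A real agent would derive these steps from the goal and context;
    here they are hard-coded for illustration.
    """
    return [
        {"action": "edit_file", "target": "src/app.py", "risky": False},
        {"action": "run_tests", "target": "tests/", "risky": False},
        {"action": "open_pull_request", "target": "main", "risky": True},
    ]


def needs_approval(step: dict, mode: Autonomy) -> bool:
    """Decide whether a human must sign off on this step in this mode."""
    if mode is Autonomy.APPROVAL_REQUIRED:
        return True
    if mode is Autonomy.SEMIAUTONOMOUS:
        return step["risky"]
    return False


def run_agent(goal: str, environment: Dict[str, str], mode: Autonomy,
              approve: Callable[[dict], bool]) -> List[dict]:
    """Perceive, plan, then act step by step while recording an audit trail."""
    context = perceive(environment)
    audit_trail: List[dict] = []
    for step in plan(goal, context):
        if needs_approval(step, mode) and not approve(step):
            audit_trail.append({**step, "status": "rejected"})
            break  # stop the workflow when a human declines a step
        # Stand-in for an execution system: a real agent would act here.
        audit_trail.append({**step, "status": "executed"})
    return audit_trail


# Example: a reviewer who declines anything flagged as risky.
trail = run_agent(
    goal="Fix failing login test",
    environment={"ci_status": "failing"},
    mode=Autonomy.SEMIAUTONOMOUS,
    approve=lambda step: not step["risky"],
)
for entry in trail:
    print(entry["action"], entry["status"])
```

In this sketch the audit trail records every step, including the one a human rejected, which is the kind of record that auditability and human-in-the-loop controls depend on.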

Benefits and Uses

The potential of SWE agents can be understood through three distinct levels of organizational benefit, each building upon the previous to create compounding improvements in engineering effectiveness (see Figure 2).

Figure 2: Three Levels of Software Engineering Agent Benefits

A three-level pyramid shows the benefits of software engineering agents, ranging from more feasible to more valuable. The base is accelerated task completion, the middle is workflow parallelization, and the peak is an autonomous software development life cycle (SDLC).

Level 1: Accelerated Task Completion

At the foundational level, SWE agents extend beyond traditional code assistants by maintaining context across complex, multifile tasks while applying systematic, goal-oriented approaches to engineering challenges. This represents the most mature and immediately accessible category of benefits for organizations beginning their agent adoption journey. Key use cases include:

Technical debt remediation — SWE agents can systematically traverse codebases to identify and resolve deprecated API usage, update dependency chains and assist with modernizing legacy code patterns. Unlike traditional developer tools that work at a local project level, file-by-file, agents can maintain awareness of cross-system dependencies and can ensure consistent application of updates across related components. This capability proves especially valuable for the routine maintenance work that typically gets deprioritized in favor of feature development.
Quality improvements — SWE agents can assist teams in enforcing coding standards and best practices, including adding error handling, proper logging practices or maintaining documentation standards. This assistance with quality improvement reduces the cognitive load on senior engineers while ensuring junior developers benefit from established best practices.
Test coverage enhancement — SWE agents can identify existing testing patterns and generate comprehensive test suites that include edge cases, proper mocking strategies and integration scenarios. Rather than generating isolated test cases, agents can analyze the broader testing architecture and create tests that align with established patterns while improving overall coverage metrics.
Documentation generation and maintenance — SWE agents can create and maintain technical specifications, API documentation, release notes and architectural overviews while keeping them synchronized with code changes. This capability proves particularly valuable for maintaining consistency across multiple related documentation sources and ensuring that user-facing documentation reflects actual system behavior.
Planning and discovery — SWE agents can traverse and understand codebases, enabling faster project initiation and more informed technical decision making. Agents can analyze system dependencies, identify potential integration points and surface architectural considerations that inform project planning and estimation processes.

Level 2: Asynchronous Workflow Parallelization

The middle tier represents where SWE agents deliver their most significant near-term value proposition: enabling engineering teams to delegate work asynchronously across multiple parallel workstreams.

Leverage SWE Agents to Target Cycle Time Efficiency

AI code assistants only solve part of the productivity equation. While they accelerate individual task completion, they leave untouched the idle time between tasks — when work sits waiting for review, context switching delays progress or engineers get pulled into unplanned urgent issues. Asynchronous agent workflows attack this hidden productivity drain directly by maintaining continuous progress across multiple workstreams. This approach to “cycle time efficiency” — the percentage of cycle time an item of work is active — can dramatically improve overall engineering velocity, moving organizations closer to continuous delivery ideals without requiring them to restructure their existing development workflows or deployment pipelines.
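As a rough illustration of the metric, the sketch below computes cycle time efficiency for a single work item as active time divided by total elapsed cycle time. The `WorkItem` structure, its field names and the hour figures are hypothetical and are not drawn from the survey or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """A hypothetical work item; all fields are illustrative assumptions."""
    name: str
    active_hours: float   # time a person or agent was actively working on it
    waiting_hours: float  # idle time: queued for review, blocked, interrupted


def cycle_time_efficiency(item: WorkItem) -> float:
    """Return the percentage of total cycle time during which the item was active."""
    total = item.active_hours + item.waiting_hours
    if total <= 0:
        raise ValueError("cycle time must be positive")
    return 100.0 * item.active_hours / total


# Illustrative numbers only: 6 active hours inside a 40-hour cycle.
item = WorkItem(name="fix-login-bug", active_hours=6.0, waiting_hours=34.0)
efficiency = cycle_time_efficiency(item)
print(f"{item.name}: {efficiency:.1f}% cycle time efficiency")
```

With these illustrative figures, cutting waiting time from 34 to 14 hours would lift efficiency from 15% to 30% without making any individual task faster, which is the effect asynchronous workflows aim for.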

Key benefits include:

Parallel task delegation — SWE agents can impact how teams handle unplanned interruptions and context switching. When urgent bug reports or customer issues arise, engineering leaders can delegate initial investigation and resolution attempts to agents without disrupting ongoing feature development. This capability proves particularly valuable for initial root cause analysis that would otherwise require pulling senior engineers away from strategic work.
Continuous background work — SWE agents enable organizations to maintain momentum on lower-priority but necessary work without diverting human attention from critical features. Agents can work continuously on code review, technical debt remediation, dependency updates and routine maintenance tasks, while human engineers focus on high-value architectural decisions and complex problem solving. This parallel processing capability addresses the chronic challenge of important but nonurgent work falling through the cracks.
Capacity multiplication — SWE agents enable individual engineers to maintain progress across multiple features, branches or projects simultaneously. Agents can work on refactoring tasks in one codebase while the engineer focuses on feature development in another, effectively multiplying individual capacity.

Level 3: Autonomous SDLC

At the apex, SWE agents could enable systemwide coordination and capacity multiplication that approaches an autonomous software development life cycle. To be clear: this remains a largely hypothetical future state. Nevertheless, organizations should view this as a “north star” or ideal strategic endpoint for SWE agents. Key benefits include:

Event-driven software development and delivery — SWE agents create the potential for responsive, automated workflows that react to system events, deployment triggers and operational incidents. Agents can automatically investigate performance regressions, coordinate rollback procedures and maintain system health without human intervention, freeing engineering teams to focus on innovation rather than operational firefighting.
Resource optimization — The ability of SWE agents to handle routine engineering operations at scale allows human engineers to focus exclusively on high-value architectural decisions, product strategy and complex problem solving. This represents a fundamental shift in how organizations think about engineering capacity — not just faster developers, but entirely new categories of work that can operate autonomously.
Coordination overhead reduction — SWE agents can manage routine communication, status updates and cross-team synchronization, enabling organizations to reduce coordination overhead. Agents can maintain project documentation, coordinate dependency management across teams and ensure that routine engineering processes continue without constant human oversight.

The capabilities of SWE agents point toward a future state of autonomous development choreography, in which agents coordinate complex, multisystem changes through event-driven workflows with minimal human oversight. While nascent, this vision of self-orchestrating engineering represents the ultimate strategic value proposition: engineering organizations that can maintain continuous delivery and innovation velocity while dramatically reducing coordination overhead and operational burden on human engineers.

The strategic value of these benefit levels lies not in choosing one approach, but in understanding how they build upon each other to create compound improvements in engineering effectiveness. Organizations that successfully implement base-level capabilities create the foundation for more sophisticated workflow parallelization, which in turn enables the organizational transformation capabilities that represent the ultimate strategic value of SWE agents.

Risks

While SWE agents offer compelling productivity benefits, their implementation introduces a significant organizational burden that can offset these gains if not carefully managed (see Figure 3). The 2025 Gartner AI in Software Engineering Survey shows that the top three challenges in implementing and adopting AI tools in the SDLC are regulatory or compliance risks, validation of AI tool outputs for accuracy, and undesirable results. Engineering leaders must understand that realizing value requires strategic resource allocation to address emerging risks across technical, operational and human dimensions.

Figure 3: Software Engineering Agent Risks May Offset Productivity Gains

Two charts compare software engineering agent implementations. In a poorly managed one, rising organizational burdens eventually surpass productivity gains, creating a "risk zone." With a well-managed implementation, gains consistently outpace burdens, creating a "net benefit zone."

Technical Risks

Agents operating at increased velocity amplify existing development risks while creating new patterns of technical debt accumulation:

New “agent debt” — SWE agents can generate “agent debt” (agent-created technical debt), functional but suboptimal code solutions that prioritize working implementations over optimal design. This technical debt can proliferate systematically across codebases before human review catches patterns.
Systematic security blind spots — SWE agents may consistently apply flawed security patterns or miss context-specific requirements, creating vulnerabilities that replicate across multiple implementations faster than traditional code review processes can identify.
Architectural weakness at scale — Beyond security, agents can replicate design flaws in performance, reliability, scalability or maintainability, amplifying risks that emerge from inconsistent human oversight or inadequate architectural guardrails.
Cross-context information synthesis — Agents can inadvertently combine nonsensitive data from multiple repositories, logs and documentation to infer sensitive insights, creating unintended information disclosure risks that exceed traditional data access controls.
AI-on-AI validation loops — As agents generate code at volumes beyond feasible human review, organizations will increasingly rely on AI-based tools for quality and security validation. This creates an AI-on-AI loop where systematic flaws or biases may go undetected, amplifying risks compared to traditional review processes (see Innovation Insight: AI Code Review Tools).

Mitigation strategy: Implement robust automated testing and security scanning pipelines specifically designed for agent-generated code, while establishing clear boundaries for agent data access and cross-system information synthesis.
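One simple way to picture a data access boundary is a path policy that is checked before an agent reads or writes a file. The sketch below is an assumption-laden illustration: the `POLICY` dictionary, its glob patterns and the `is_allowed` function are hypothetical and do not correspond to any particular agent product's permission model.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: glob patterns describing what an agent may touch.
# Note that "*" in fnmatch patterns also matches path separators.
POLICY = {
    "read": ["src/*", "docs/*", "tests/*"],
    "write": ["src/*", "tests/*"],
    "deny": ["*.env", "*.pem", "secrets/*"],
}


def is_allowed(path: str, operation: str, policy: dict = POLICY) -> bool:
    """Return True if the operation on the path is inside the agent's scope.

    Deny patterns take precedence over allow patterns, and anything not
    explicitly allowed is refused.
    """
    if any(fnmatchcase(path, pattern) for pattern in policy["deny"]):
        return False
    return any(fnmatchcase(path, pattern) for pattern in policy.get(operation, []))


# Illustrative checks against the hypothetical policy.
for path, operation in [
    ("src/app.py", "write"),
    ("docs/readme.md", "write"),
    ("src/.env", "read"),
]:
    print(path, operation, is_allowed(path, operation))
```

Defaulting to refusal keeps the boundary conservative: a new repository area stays out of reach until someone deliberately adds it to the policy.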

Disruption to Developer Experience

The shift to agent orchestration fundamentally changes daily engineering work from linear deep focus to continuous multitasking across parallel agent workflows.

Developers orchestrating SWE agents across asynchronous workflows must constantly zigzag between delegating tasks, reviewing agent outputs and providing feedback. This pattern can create cognitive overhead and context switching fatigue that undermines promised productivity gains.

This new collaboration model requires developers to become effective conductors of autonomous systems while maintaining technical oversight and quality control. Many engineers struggle with this transition, particularly those who thrive in traditional deep-work environments.

Mitigation strategy: Implement small-scale proofs of concept with mid- to senior-level engineers experienced in task delegation, starting with one to two engineers orchestrating multiple agents on well-defined maintenance work before expanding to broader team structures and more complex use cases.

Increased Platform Engineering and Ops Burdens

Successfully scaling SWE agent workflows will require extensive preparation and ongoing governance that creates new operational overhead. Writing the application has never been the hard part of software delivery — keeping it running for the next 15 years is — and faster code generation only adds to that long-term challenge. Organizations must continuously curate and maintain environments optimized for agent consumption and action across codebases, documentation, tooling and system integrations:

Knowledge management — Organizations must systematically transform their knowledge assets and development environments to be agent-ready. This involves structuring codebases, documentation and tooling so agents can effectively interpret and act upon them (see Innovation Insight for Context Engineering).
Token cost management — Token consumption costs can escalate rapidly, with individual developers potentially consuming hundreds or thousands of dollars in API costs during intensive agent usage. While token costs are steadily decreasing as model architectures and inference techniques advance, developers tend to gravitate toward frontier models with advanced reasoning capabilities that remain expensive. See Optimize AI Cost and Reliability Using AI Gateways and Model Routers.
Agent operations and platform teams — Successfully scaling SWE agent usage will require establishing “AgentOps,” new operational roles dedicated to managing agent environments, permissions, monitoring and cost controls. These platform teams represent new organizational capabilities focused on agent life cycle management, performance optimization and integration maintenance. See Software Engineering Foundations for the AI-Native Era.
Governance and control — Organizations must establish comprehensive frameworks for managing distributed agent activities, including audit trails, risk management protocols and quality assurance processes. Unlike traditional development governance, agent governance must account for autonomous decision making and the potential for rapid, systematic changes across multiple systems. See A Checklist for Managing Risk Across Six Pillars of AI Governance.

Mitigation strategy: Establish dedicated agent operations teams with clear governance frameworks, implement comprehensive cost monitoring and controls and systematically optimize organizational knowledge assets for agent consumption before scaling deployment.
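As one way to picture the cost monitoring called for above, the following sketch tracks estimated token spend per developer against a monthly budget. Everything in it is an assumption for illustration: the `TokenBudget` class, the model names, the per-million-token prices and the budget figure are hypothetical and do not reflect any provider's actual pricing.

```python
from collections import defaultdict
from typing import List

# Hypothetical prices in US dollars per million tokens: (input, output).
PRICE_PER_MILLION = {
    "frontier-model": (15.00, 75.00),
    "small-model": (0.50, 2.00),
}


class TokenBudget:
    """Accumulates estimated spend per developer and flags budget overruns."""

    def __init__(self, monthly_limit_usd: float) -> None:
        self.monthly_limit_usd = monthly_limit_usd
        self.spend_usd = defaultdict(float)

    def record(self, developer: str, model: str,
               input_tokens: int, output_tokens: int) -> float:
        """Add the estimated cost of one agent run and return that cost."""
        price_in, price_out = PRICE_PER_MILLION[model]
        cost = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
        self.spend_usd[developer] += cost
        return cost

    def over_budget(self) -> List[str]:
        """Return the developers whose accumulated spend exceeds the limit."""
        return sorted(dev for dev, spent in self.spend_usd.items()
                      if spent > self.monthly_limit_usd)


# Illustrative usage with made-up token counts.
budget = TokenBudget(monthly_limit_usd=200.0)
budget.record("dev-a", "frontier-model",
              input_tokens=10_000_000, output_tokens=1_000_000)
budget.record("dev-b", "small-model",
              input_tokens=10_000_000, output_tokens=1_000_000)
print(budget.over_budget())
```

With these made-up prices, the same token volume costs far more on the frontier model than on the smaller one, which shows why model choice and routing can matter as much as usage volume when setting cost controls.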

Adoption Rate

SWE agents remain in the early adoption phase. The 2025 Gartner AI in Software Engineering Survey shows a clear hierarchy in adoption patterns that reflects the maturity and accessibility of different agent technologies. Thirteen percent of respondents report they regularly use Anthropic Claude Code within their software engineering functions, while 4% report having adopted Cognition’s Devin.

However, these figures mask rapid growth trajectories. Since launching Claude 4 models in May 2025, Anthropic reports 300% growth in Claude Code’s active user base, with run-rate revenue expanding 5.5x, indicating that early adopters are significantly scaling their usage.¹

IDE-integrated agents demonstrate substantially higher adoption rates, benefiting from their natural evolution from established AI code assistants. These platforms enable both synchronous agentic workflows within the IDE and background agents for asynchronous task execution. Organizations should expect adoption to accelerate as IDE integrations mature and stand-alone agents improve their workflow integration capabilities.

According to the 2025 Gartner AI in Software Engineering Survey:

87% report using GitHub Copilot (including its “agent mode”)
16% report using Cursor
4% report using Windsurf

Note that developers can use more than one tool; hence, the percentages do not add up to 100.

Recommendations

Evaluate SWE agents within the broader framework of AI-native software engineering. This should include people, process and workflow impacts, alongside evaluation of the technology itself.
Prioritize integration flexibility to enable composable development environments and easier adoption of new agent capabilities. Track emerging interoperability standards such as Model Context Protocol and agent-to-agent interfaces to guide platform and vendor choices.
Start SWE agent adoption with tasks where the required information is readily accessible and the agent has proven capability. Scale to more complex cases gradually, using pilots to manage expectations and build organizational confidence.
Strengthen platform and enablement teams to manage the governance, security and operational demands of SWE agent integration, including the curation of code, documentation and processes that underpin effective context engineering.

Representative Providers

Aider
Amazon Kiro
Anthropic Claude Code
Atlassian Rovo Dev
Cline
CodeGPT
Cognition
Cursor
GitHub Copilot Coding Agent
Google Gemini CLI
Goose
Ona (formerly Gitpod)
OpenAI Codex
Sourcegraph Amp
Warp

Evidence

2025 Gartner AI in Software Engineering Survey. This study was conducted to explore the adoption of AI within software engineering functions, focusing on two key areas: the use of AI tools (e.g., AI code assistants, AI code agents) throughout the software development life cycle (SDLC); and the development of AI-powered solutions (or AI engineering) within software engineering functions, along with their contribution to business outcomes. The research was conducted online from 29 April through 25 June 2025 among 299 respondents from North America (n = 150), EMEA (n = 104) and Asia/Pacific (n = 45). Quotas were established for company sizes and for industries to ensure a good representation across the sample. Organizations were required to be either piloting or using AI tools in the SDLC for less than four years, and either piloting or having built AI solutions in their software engineering functions. Respondents included both leaders and individual contributors from software engineering functions, each with at least one year of tenure at their current organization. All respondents were involved in decision making or directly engaged in using AI tools or building AI solutions within their software engineering functions. Disclaimer: The results of this survey do not represent global findings or the market as a whole, but reflect the sentiments of the respondents and companies surveyed.

¹ Claude Code User Base Grows as Anthropic Launches Enterprise Analytics Dashboard, The New Stack.
