22 Dec 2025 · 7 minute read

How widely are AI coding tools being used within teams, and what kinds of things are developers doing with them? As coding agents move beyond experimentation and into day-to-day development work, those questions are becoming increasingly relevant for organizations trying to understand how these systems fit into their workflows.
That shift also helps explain why vendors behind some of the most widely adopted coding agents are beginning to add clearer visibility into how their systems operate at scale.
In early December, GitHub introduced new dashboards that aggregate metrics related to Copilot’s code generation activity, giving organizations a clearer view into how the tool is being used across teams. The dashboards surface high-level signals such as the volume of code changed with AI assistance, distinguish between changes initiated directly by developers and those made automatically by Copilot’s agent features, and break that activity down by model and programming language.

The company followed up last week with new APIs that allow organizations to track Copilot usage programmatically, making it easier to integrate adoption and usage data into internal reporting, governance, or compliance workflows.
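As a rough sketch of what that programmatic access looks like, the example below pulls organization-level Copilot metrics from GitHub’s REST API in Python. It uses the existing GET /orgs/{org}/copilot/metrics endpoint as a stand-in; the endpoints and response fields of the newly announced APIs may differ, and the organization name and token here are placeholders.

```python
import os

import requests

# Placeholders: substitute your own organization and a token with
# permission to read Copilot metrics for that organization.
ORG = "your-org"
TOKEN = os.environ["GITHUB_TOKEN"]

# Existing organization-level Copilot metrics endpoint, used here as a
# stand-in; the newer APIs may expose different paths and fields.
url = f"https://api.github.com/orgs/{ORG}/copilot/metrics"
headers = {
    "Accept": "application/vnd.github+json",
    "Authorization": f"Bearer {TOKEN}",
    "X-GitHub-Api-Version": "2022-11-28",
}

resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()

# The endpoint returns one entry per day; print a simple adoption summary.
for day in resp.json():
    print(day.get("date"), "total active users:", day.get("total_active_users"))
```

From there, the same response can feed whatever internal reporting, governance, or compliance tooling a team already maintains.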
Elsewhere, Continue, the startup behind the eponymous open source coding assistant, took a similar step by introducing metrics for its cloud-based agents inside its Mission Control interface. While Continue had previously exposed basic usage data, the upgrade centralized those signals into a dedicated control plane for teams running agents at scale.
These updates reflect a growing effort to make cloud-based coding agents easier for teams to evaluate, particularly as they move into production use. As more engineering work is delegated to automated systems, organizations want clearer ways to understand what those agents are doing and how their output fits into day-to-day development.
“Engineering teams adopt automation when we trust the output, and trust requires visibility,” Continue’s senior developer advocate Bekah Hawrot Weigel said. “As more organizations move toward cloud-hosted AI agents and Continuous AI, the same questions kept coming up.”
These questions centered on what cloud-based agents are actually accomplishing over time, which automated workflows are producing tangible engineering output, and how teams should evaluate the return on investment as these systems become part of the development pipeline.
“Metrics makes all of this visible,” she continued.
It’s worth noting that GitHub’s dashboards and APIs surface activity generated by Copilot inside GitHub’s own environment, while Continue’s metrics reflect the behavior of agents running through its cloud platform. No surprises there.
But there is historical precedent for what could come next: this kind of first-party visibility typically appears as platforms reach a certain level of operational maturity. In cloud infrastructure, providers such as Amazon Web Services, Microsoft Azure, and Google Cloud began exposing usage, reliability, and cost data through tools like CloudWatch, Azure Monitor, and Google Cloud Monitoring once their services became core production dependencies for enterprises. Managed machine learning platforms followed a similar trajectory with SageMaker, Vertex AI, and Azure Machine Learning. In each case, metrics emerged alongside enterprise expectations around governance, reporting, and accountability.
Agent metrics fit naturally into that same arc. As AI-driven systems become more embedded in day-to-day development, organizations want clearer insight into how those systems are behaving and how widely they are being used. Dashboards and APIs provide a practical way to surface that information and connect it to existing administrative and reporting processes.
Over time, cloud observability expanded beyond single-provider views as organizations adopted multi-cloud and hybrid environments. Tools such as Datadog, Prometheus, and OpenTelemetry emerged to aggregate metrics, logs, and traces across clouds and on-prem infrastructure, rather than tying visibility to a single provider’s runtime.
A similar dynamic may emerge around AI agents as teams deploy multiple agents and automation tools in parallel. Rather than instrumenting execution inside a specific agent runtime, higher-level approaches focus on the points where agent activity becomes visible to the rest of the engineering system.
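To make that concrete, one hypothetical pattern is to emit counters at the boundaries where agent activity surfaces, for example when an agent-authored pull request is opened or merged, and ship them through OpenTelemetry to whatever backend the team already runs. The sketch below uses the OpenTelemetry Python SDK with a console exporter; the metric name, attribute names, and the record_agent_pr helper are illustrative assumptions, not part of any product mentioned here.

```python
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)

# Export to the console for the sketch; a real setup would swap in an OTLP
# exporter pointed at Prometheus, Datadog, or whichever backend is in use.
reader = PeriodicExportingMetricReader(ConsoleMetricExporter(), export_interval_millis=5000)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
meter = metrics.get_meter("agent-activity")

# One shared counter for agent-driven pull requests, tagged by which agent
# produced the change and what happened to it. Names are illustrative.
agent_prs = meter.create_counter(
    "agent_pull_requests",
    description="Pull requests opened or merged by coding agents",
)

def record_agent_pr(agent: str, outcome: str) -> None:
    """Hypothetical hook, called wherever agent PR events become visible."""
    agent_prs.add(1, attributes={"agent": agent, "outcome": outcome})

# Events from two different agents feed the same metric stream.
record_agent_pr("copilot", "merged")
record_agent_pr("continue", "opened")
```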
Open source projects such as Git-ai reflect that approach by operating at the repository and workflow level. Instead of tracking how an agent reasons internally or which tools it invokes, Git-ai focuses on what ultimately changes in a codebase: which commits were influenced by AI, how automated contributions move through pull requests, and how agent-driven changes evolve over time. By anchoring observability at the Git layer, it becomes possible to analyze AI activity across different agents and tools using a shared frame of reference.
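As a simplified illustration of that repository-level idea (not Git-ai itself), the script below counts commits whose messages carry an AI-attribution trailer. The AI-Assisted trailer and agent names are an assumed convention; Git-ai’s actual data model differs, but the principle is the same: because the signal lives in Git history, contributions from different agents can be measured with one shared frame of reference.

```python
import subprocess
from collections import Counter

# Assumed convention: commits touched by an agent carry a trailer such as
#   AI-Assisted: copilot
# in the commit message. Git-ai's real metadata differs; this only shows
# how attribution can be read back out of plain Git history.
log = subprocess.run(
    ["git", "log", "--pretty=format:%H%x1f%(trailers:key=AI-Assisted,valueonly)%x1e"],
    capture_output=True, text=True, check=True,
).stdout

total = 0
by_agent = Counter()

for record in filter(None, log.split("\x1e")):
    _sha, _, trailer = record.strip().partition("\x1f")
    total += 1
    agent = trailer.strip()
    if agent:
        by_agent[agent] += 1

print(f"{sum(by_agent.values())} of {total} commits carry AI attribution")
for agent, count in by_agent.most_common():
    print(f"  {agent}: {count}")
```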
In that sense, vendor-provided metrics and repository-level observability address different questions. Platform dashboards help organizations understand how specific agents behave inside managed environments, while higher-level tooling looks at outcomes that persist beyond any single runtime. As agent adoption broadens, both perspectives are likely to coexist, serving complementary roles rather than collapsing into a single view.