
The End of Legacy Apps?

Birgitta Böckeler
Distinguished Engineer, ThoughtWorks

Is Your Team Ready for AI-Driven Modernization?

with Birgitta Böckeler

Chapters

Introduction [00:01:05]
Building with agents [00:05:28]
Legacy migration [00:11:33]
Relevant context [00:17:28]
Reliable validation [00:24:45]
Success expectation [00:28:24]
Changes in legacy modernisation [00:32:35]
The humans in the loop [00:39:13]
Delivery management [00:46:57]
The future of devs [00:49:27]
Outro [00:55:45]

In this episode

In this episode, host Simon Maple and Birgitta Böckeler, Distinguished Engineer at ThoughtWorks, explore the transformative role of AI in the software development process. They delve into how AI can accelerate workflows, aid in legacy modernization, and shift developers from authors to orchestrators. Key takeaways include adopting techniques like retrieval-augmented generation (RAG) over specific tools, implementing guardrails, and leveraging AI to build disciplined, reliable workflows across the software lifecycle.

AI isn’t just a feature you ship—it’s fast becoming the fabric of how software itself gets built. In this episode, host Simon Maple sits down with Birgitta Böckeler, Distinguished Engineer at ThoughtWorks and long-time contributor to the ThoughtWorks Technology Radar, to unpack how developers can use generative AI and agentic tooling to accelerate day-to-day workflows—and even tackle hard legacy modernization problems where the source code or original context is missing.

From “AI in Products” to “AI for Building Software”

Birgitta draws a clear boundary that frames the entire conversation: there’s AI inside your product, and there’s AI inside your software development process. While the former affects teams building AI-powered features, the latter touches every engineer in the organization. Her domain expertise is the latter—infusing the software delivery lifecycle itself with AI capabilities.

This distinction matters for governance and investment. If you’re embedding AI in your workflow (as opposed to shipping it to end users), your priorities are developer experience, velocity, reliability, and fit with existing SDLC controls. Interestingly, the organizational plumbing for one often benefits the other: if your product org needs self-hosted models, guardrails, and evals, those same capabilities can harden internal AI tooling for your engineers.

Through the ThoughtWorks Technology Radar—published twice a year as a snapshot of what practitioners are actually using—Birgitta sees a dramatic shift: over half of recent entries were generative-AI-related. Crucially, while no single vendor tool earned an “Adopt,” one technique did: retrieval-augmented generation (RAG). It’s a telling signal—technique over tool, and a push to operationalize patterns that reduce hallucinations and improve relevance.

The Rise of Agentic Coding: Bigger Tasks, New Capabilities

Coding assistants leveled up in late 2024 and early 2025. Beyond inline completions and single-file edits, modern tools can now:

  • Edit multiple files cohesively
  • Execute terminal commands autonomously
  • Run and react to tests in tight loops
  • Use external tools and services via protocols like MCP (Model Context Protocol), from browsing to hitting a test database

That expansion transforms the size and shape of work you can safely hand off. It’s no longer “finish this function”; it’s “create an endpoint, update the schema, adjust the tests, and fix the linter,” and the agent can iteratively act on feedback from the build and test environment.
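To make that feedback loop concrete, here is a minimal sketch of the cycle such a tool runs internally. Everything in it is illustrative: `ask_model` is a stand-in for whatever model backend your assistant uses, and the patch-application step assumes the model answers with a unified diff.

```python
import subprocess

def ask_model(prompt: str) -> str:
    """Hypothetical call to whatever model/agent backend you use."""
    raise NotImplementedError

def agentic_fix_loop(task: str, max_iterations: int = 5) -> bool:
    """Run tests, feed failures back to the model, apply its patch, repeat."""
    for _ in range(max_iterations):
        result = subprocess.run(
            ["python", "-m", "pytest", "-x", "-q"],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return True  # tests pass, task considered done
        # Give the model the task plus the concrete failure output to react to
        patch = ask_model(
            f"Task: {task}\n\nTest output:\n{result.stdout[-4000:]}\n"
            "Propose a unified diff that fixes the failure."
        )
        subprocess.run(["git", "apply", "-"], input=patch, text=True)
    return False  # give up and hand the problem back to the human
```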

Two modes emerged. First, interactive “pair” workflows where developers steer the agent step-by-step inside the IDE. Second, background/autonomous runs kicked off to tackle larger tasks with minimal intervention. The “vibe coding” moment spotlighted the latter—the idea that you can delegate a high-level intent and let the agent chase it down through multiple steps while you supervise.

Developer as Orchestrator: New Practices for Control and Quality

As agents become truly agentic, the developer’s role shifts from author to orchestrator. Paradoxically, the AI can move so fast that humans fall behind—unless they establish new practices to control scope and ensure quality.

Practical strategies discussed include:

  • Intent-first planning: Start sessions with crisp acceptance criteria and explicit constraints. Define the “done” state, the test surface area, the allowable tech stack, and anything the agent must not change.
  • Task decomposition for agents: Even with autonomous runs, break the work into bounded milestones (e.g., schema migration, endpoint addition, integration test repair). Smaller scopes produce more reliable loops and clearer diffs.
  • Parallelization with review gates: For asynchronous agent runs, fire off parallel candidates and then review outcomes side-by-side. Pick the best result or merge the strongest pieces. This pattern shines for exploratory refactors or multi-file changes where different approaches might be viable.
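As a concrete illustration of the parallel-candidates pattern, here is a hedged sketch. `run_agent` is a hypothetical stand-in for kicking off one autonomous run on its own branch (in practice you would likely use separate worktrees or remote runners); the only real logic here is the fan-out, a cheap test gate, and handing the surviving candidates to a human reviewer.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

def run_agent(task: str, branch: str) -> str:
    """Hypothetical: run one autonomous agent attempt on its own branch,
    returning the branch name once the agent has committed its result."""
    raise NotImplementedError

def tests_pass(branch: str) -> bool:
    """Cheap local gate: does the candidate branch pass the test suite?"""
    subprocess.run(["git", "checkout", branch], check=True)
    return subprocess.run(["python", "-m", "pytest", "-q"]).returncode == 0

def parallel_candidates(task: str, n: int = 3) -> list[str]:
    """Fire off n independent attempts, then keep only the ones worth human review."""
    branches = [f"agent/{task[:20].replace(' ', '-')}-{i}" for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        done = list(pool.map(run_agent, [task] * n, branches))
    return [b for b in done if tests_pass(b)]  # review gate: pick or merge from these
```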

Guardrails should be practical and observable: run tests in a sandbox; enforce lint and type checks; mandate explicit diffs; and treat terminal access as a power tool with a chaperone. If your organization already uses evals and policy checks for product-facing AI, extend them to agent outputs: define evals for correctness, latency, and safety in the developer workflow, and add routine spot checks to maintain trust.
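One way to make those guardrails observable is a small gate script that every agent-produced change must pass before a human reviews it. The tool choices below (ruff, mypy, pytest) are examples only; substitute whatever your stack already enforces, and run the test step inside a sandbox or container.

```python
import subprocess
import sys

# Example guardrail gate run against an agent's proposed change before review.
# Tool choices are illustrative; swap in your own linters, type checkers and tests.
CHECKS = [
    ["git", "diff", "--stat", "origin/main...HEAD"],  # surface an explicit, reviewable diff
    ["python", "-m", "ruff", "check", "."],           # lint
    ["python", "-m", "mypy", "."],                    # type checks
    ["python", "-m", "pytest", "-q"],                 # tests (run inside a sandbox/container)
]

def main() -> int:
    for cmd in CHECKS:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"Guardrail failed: {' '.join(cmd)}")
            return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```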

Legacy Modernization with AI: When You Don’t Own the Code You’re Changing

Modernization is a perfect use case for designing purposeful agentic workflows. It's not a one-off; it's a repeatable pipeline. Birgitta describes scenarios where teams face “black box” systems—sometimes even without access to source code due to past vendor arrangements—and need to migrate frameworks, update platforms, or remediate security gaps.

Treat the effort like a structured, AI-enabled program:

  • Standardize a workflow: codify a repeatable series of steps—inventory components, infer behavior from interfaces, map dependencies, propose migration plans, and generate changes incrementally.
  • Build reusable prompts and playbooks: the same prompt templates can guide agents through 50+ components, enforcing consistency in analysis, code edits, and documentation (see the sketch after this list).
  • Equip agents with the right tools: via MCP or equivalent, give controlled access to build commands, test runners, HTTP clients, and documentation sources. Let the agent discover behavior by running safe probes or reading artifacts, then propose changes you can review.
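Here is a hedged sketch of what such a playbook can look like, assuming a hypothetical `analyze_with_agent` call and a component inventory you have already built. The point is that the prompt template, the loop, and the on-disk artifacts are identical for every component, which is what makes the results reviewable and auditable at scale.

```python
from pathlib import Path

# Reusable playbook: the same prompt template applied to each legacy component.
# Field names (name, interfaces, source_framework, target_framework) are assumptions
# about your component inventory; adapt them to what you actually have.
MIGRATION_PROMPT = """\
Component: {name}
Observed interfaces: {interfaces}
Goal: migrate from {source_framework} to {target_framework}.
Steps: (1) summarise current behaviour, (2) map dependencies,
(3) propose an incremental migration plan, (4) generate the first change set.
Do not modify anything outside this component.
"""

def analyze_with_agent(prompt: str) -> str:
    """Hypothetical agent invocation; returns the agent's plan/change set as text."""
    raise NotImplementedError

def run_playbook(components: list[dict], out_dir: str = "migration-reports") -> None:
    """Apply the same workflow to every component and keep versionable artifacts."""
    Path(out_dir).mkdir(exist_ok=True)
    for component in components:
        prompt = MIGRATION_PROMPT.format(**component)
        report = analyze_with_agent(prompt)
        # Write the plan to disk so it can be reviewed, versioned and audited
        (Path(out_dir) / f"{component['name']}.md").write_text(report)
```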

The key difference from day-to-day coding is repeatability at scale. You aren’t just “using an AI assistant”; you’re assembling an agentic system tailored to your migration pattern. That mindset helps you design for traceability, idempotency, and auditability—critical when touching legacy systems that nobody fully understands anymore.

Techniques Over Tools: What to Adopt Now (and How)

The Radar’s “Adopt” signal for RAG is a pragmatic north star. For developer workflows, RAG can ground agents in your codebase, architecture docs, ADRs, and runbooks, dramatically reducing hallucinations. Start by curating authoritative corpora, tagging content for relevance, and enforcing retrieval in prompts so the model cites and uses the right sources.
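As a minimal sketch of that grounding step, assuming your ADRs and runbooks live as Markdown files under a `docs/` directory: the naive keyword-overlap scorer below is a placeholder for what would normally be an embedding index with re-ranking, but the shape of the pattern is the same—retrieve first, then force the prompt to work from and cite the retrieved sources.

```python
from pathlib import Path

def retrieve(query: str, corpus_dir: str = "docs", k: int = 3) -> list[str]:
    """Naive keyword-overlap retrieval; a real setup would use an embedding index."""
    query_terms = set(query.lower().split())
    scored = []
    for path in Path(corpus_dir).rglob("*.md"):  # ADRs, runbooks, architecture docs
        text = path.read_text(errors="ignore")
        score = len(query_terms & set(text.lower().split()))
        scored.append((score, path.name, text[:2000]))
    return [f"[{name}]\n{snippet}" for score, name, snippet in
            sorted(scored, reverse=True)[:k] if score > 0]

def grounded_prompt(task: str) -> str:
    """Enforce retrieval in the prompt so the model cites and uses the right sources."""
    sources = "\n\n".join(retrieve(task)) or "(no relevant sources found)"
    return (
        "Use ONLY the sources below; cite the source name for every claim.\n\n"
        f"Sources:\n{sources}\n\nTask: {task}"
    )
```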

On the tooling front, avoid betting the farm on a single vendor. Look for tools that:

  • Support multi-file edits with traceable diffs
  • Execute tests and shell commands with logs you can audit
  • Integrate via MCP or similar to add capabilities safely
  • Allow configuration of model choice, temperature, and policies
  • Produce artifacts you can version (plans, change sets, explanations)

Operationally, put evaluation and feedback loops in place. Define a few golden tasks per repo to measure agent reliability. Track metrics such as the percentage of changes passing CI on the first try, review effort per agent PR, average rework, and defect escape rate. Use those signals to tune prompts, adjust scopes, or swap models. Finally, train developers in the new craft—how to frame intents, supervise runs, and keep outputs aligned with coding standards.
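A minimal sketch of how those signals might be rolled up, assuming a hypothetical per-PR record populated from your CI and review tooling:

```python
from dataclasses import dataclass

@dataclass
class AgentPR:
    """Hypothetical record for one agent-generated change; adapt fields to your tracker."""
    passed_ci_first_try: bool
    review_minutes: float
    rework_commits: int
    escaped_defects: int

def summarize(prs: list[AgentPR]) -> dict[str, float]:
    """Roll the per-PR records up into the reliability signals discussed above."""
    n = len(prs) or 1  # avoid division by zero on an empty sample
    return {
        "first_pass_ci_rate": sum(p.passed_ci_first_try for p in prs) / n,
        "avg_review_minutes": sum(p.review_minutes for p in prs) / n,
        "avg_rework_commits": sum(p.rework_commits for p in prs) / n,
        "defect_escape_rate": sum(p.escaped_defects for p in prs) / n,
    }
```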

Key Takeaways

  • Separate concerns: distinguish AI-in-product from AI-for-building-software. Invest in guardrails, evals, and model hosting decisions that serve both where it makes sense.
  • Prefer techniques over tools: RAG is ready to adopt for developer workflows—ground agents in the docs and code that matter.
  • Use agentic power safely: enable multi-file edits, terminal access, and tests—but in sandboxed, observable loops with clear acceptance criteria and diffs.
  • Orchestrate, don’t abdicate: decompose tasks, parallelize agent runs when useful, and put review gates in place. Developers are the managers of these processes.
  • Scale modernization with systems: for legacy migrations—especially when code context is thin—build reusable prompts, standardized workflows, and agent toolchains (via MCP) that you can run across many components.
  • Measure and iterate: define success metrics (first-pass CI, review time, rework) and use evals to continuously improve prompts, models, and workflows.

This episode is a field guide for engineering leaders and hands-on developers who want to use AI not as a flashy add-on, but as a disciplined, reliable co-worker across the software lifecycle—from greenfield coding to the gnarliest legacy modernization.