AI coding agents feel like magic... right up until they collide with production code. For teams maintaining legacy systems, these agents often hallucinate APIs, run off on tangents, and shatter trust faster than an unreviewed hotfix at 5pm. Ignoring the past won't save us, because new code becomes old!
We can do better, and we will. In this session we'll cover emerging strategies for improving the accuracy of coding agents on real codebases, benchmarks such as SWE-bench that measure our progress, and the limitations of those benchmarks. Expect to walk away with actionable techniques, a renewed respect for the code that came before us, and a clearer view of the challenges ahead.
Key Takeaways
Why LLMs struggle more with existing code than with new code
What we can do about it
How we measure progress
Ray Myers is a legacy code expert with 18 years of software engineering experience across four industries. His recent work explores the delicate intersection of AI and maintainability. He co-hosts the Empathy In Tech podcast and publishes guidance on the Craft vs Cruft YouTube channel, drawing on influences from DevOps to Taoism.