
CUT TECH DEBT
WITH REWRITE
AUTOMATION
What If Fixing Code Wasn’t Your Job Anymore?
Transcript
[00:00:21] Simon: Hello, and welcome to another episode of the AI Native Dev. Joining me today is Jonathan Schneider, who is the CEO and co-founder of Moderne, a company that is really focused around the analysis and refactoring of code bases. So, welcome Jonathan. How are you?
[00:00:42] Jonathan: Doing well, thank you, Simon.
[00:00:43] Jonathan: Good, glad to be here.
[00:00:45] Simon: Oh, absolutely. Pleasure to have you. And whereabouts are you? Whereabouts are you calling in from today, Jonathan?
[00:00:48] Jonathan: I am in lovely Miami, which was one of the best moves I made a few years ago, coming from the west coast to Miami.
[00:00:58] Simon: Oh wow. Over from the San Francisco Bay Area, then.
[00:01:00] Jonathan: The company was actually founded in Seattle, which itself is an interesting accidental story. You know, founded in 2020. The first person we hired was in Seattle, so I just up and moved to Seattle, and we worked in parks on a bench for the first little while. But then after a couple of years I moved over here to Miami, where,
[00:01:23] Jonathan: it's much better for time zone and weather as you can imagine.
[00:01:26] Simon: Oh, absolutely. Yeah. The connection to Europe and other places, I think, from a time zone perspective,
Jonathan: Much easier.
Simon: Much more convenient. And I relate very much to the Seattle weather, being in the UK.
Jonathan: Of course.
[00:01:37] Simon: Yeah. So I mentioned Moderne deals with the problems developers have around refactoring, around rewriting existing code. Tell us a little bit, from your point of view, why you created Moderne. What were the biggest problems you were trying to solve at the time, and how old is Moderne as well?
[00:01:59] Jonathan: The company itself is nearing five years now, but the technology behind it, OpenRewrite, actually goes back almost 10. I was working for Netflix engineering tools, and in some ways I was somewhat responsible for shepherding the organization through large-scale changes at the time, like Java 6 to 7, trying to get off of a logging library, and these sorts of things.
[00:02:20] Jonathan: And while I and my team spent a lot of effort trying to provide things like early internal developer portals (basically dashboards that show people where they are relative to where we want them to be), we found that it resulted in approximately no action towards the objective.
[00:02:42] Jonathan: And so, just interviewing those product teams and asking, what would it take for me to move you from one thing to another? They would say, well, you know, do it for me, otherwise I've got something else to do. And it's true, they had a ton of feature pressure on them, so these kinds of efforts were always
[00:03:01] Jonathan: on the back burner, right? Not top of mind if they weren't required. So the framework started then, to try to literally do it for them, to provide ways of automating this change. I had gone onto the Spring team, founded a project called Micrometer, so completely unrelated. Then I started working on a product called Spinnaker, which is also open source out of Netflix, for advanced continuous delivery.
[00:03:25] Jonathan: And our co-founder, Olga Kundzich, and I were working with a lot of the large banks and retailers and so forth, trying to build advanced continuous delivery concepts. And, uh, you know, they'd say, "Talk to me in a year when I'm done moving Spring Boot one to two," or "this to that," you know, and so it’s just kind of like we got pulled into this by seeing how repeatable and ubiquitous this problem really was.
[00:03:50] Simon: Yeah. And in terms of the value of people doing a lot of these rewrites, when you're talking to developers who say, you know, I've got a ton of things I need to get done, what would you give them as the reason why this is important, and why they should consider it high on their list of things to deal with?
[00:04:15] Jonathan: I think luckily it already is high on the list, and we hear it in various different ways. This is roughly 30 to 40% of total engineering time, just keeping the lights on. You're using an open source third-party thing, it makes a change to its API, and you have to continually restitch the application or it ceases to function.
[00:04:35] Jonathan: So it's just a reality of software development, and the more net new code we write, the harder that problem becomes, really.
[00:04:45] Simon: Yeah. And it's one of those things that's considered almost like technical debt. Some people want new features from a framework which are only available in a new version, but it's also just general good software hygiene to, maybe not be bleeding edge, but stay fairly close to the latest tech.
[00:05:08] Jonathan: One of the more evolved ways of thinking about this, because you mentioned technical debt, actually comes from Shelter Insurance, a midsize property and casualty insurer in the mid US. They actually tag technical debt on their backlog differently from maintenance, and I think that's a really subtle distinction, where we used to think of technical debt as something where I took a shortcut and I'm gonna have to pay that down later.
[00:05:35] Jonathan: But really, you know, I chose to do that. And I really identify with the developer position right now, which is that I could make all the best architectural choices, the best language choices, the best library choices. I could be absolutely perfect right now, today, on July 15th. And six months from now, that thing has shifted.
[00:05:58] Jonathan: So it's really more like maintenance that you have to do on a car or a house. If you don't do it, you can't expect the thing to continue to function.
[00:06:09] Simon: Yeah, no, absolutely. In the era of AI, I guess there are a number of tools that, with a pure LLM, can do some of this.
[00:06:24] Simon: They can understand and somewhat analyze code bases. They can make changes. They can even do a somewhat reasonable job at completely changing from one language to another. What would you say, from the pure AI point of view, are the advantages, and maybe some of the things they're not so good at, when thinking about any kind of maintenance or framework changes, keeping up to date, that kind of thing?
[00:06:59] Jonathan: Yeah. I think broadly I would consider pure AI-based solutions to be concentrated around authorship experiences. And by authorship, I mean I've got one repository open in the IDE, or I've got a CLI-based tool like Claude Code pointed at a particular repository, and I'm asking it to do a thing.
[00:07:22] Jonathan: And that thing might be feature development. It could of course be, like you say, language to language, whatever the case might be. But there's gonna be a degree of time it spends on that one repository. Maintenance-type activities, or modernization-type activities, tend to be cross-cutting.
[00:07:43] Jonathan: And at most, even with midsize customers of ours, we're dealing with hundreds, thousands, sometimes tens of thousands of repositories. So that workflow of, you know, cd into a directory, run Claude, ask it to do something, commit, then on to the next one: cd into that directory, Claude, commit. It just doesn't really scale to the problem.
[00:08:12] Simon: Yeah, yeah. 'Cause I guess if you have an enterprise-scale app and you wanna make a change to a framework or something like that, you probably have a vast number of files that need to be adapted. How good would you say AI tools in general are at increasing their context windows to be able to take in more projects, more files?
[00:08:40] Simon: Is the limit then the context window itself? Do you find the LLM gets a little more confused the more context you throw at it? Or is it just the question of, well, how do I even parse all of this if it's spread across hundreds or thousands of repos? What would you say are the limiting factors?
[00:09:01] Jonathan: Right. Yeah, first of all, increasing the context window does come with a tradeoff: there's an inverse relationship between context window and the attention mechanism that's fundamental to LLMs. So you would expect, as the context window increases, there to be more confusion,
[00:09:22] Jonathan: just less focus. You know, I hate to anthropomorphize these things, but it's the same for us. You throw a book at somebody and say, read the whole book and I'm gonna quiz you on it. You're going to have a lot less targeted feedback than if I ask you a question about a very specific part.
[00:09:38] Jonathan: So that's a fundamental tradeoff I don't think is really going away. The other thing is that, even when it's focused on a narrow part of the code, it's still ultimately a probabilistic system. So while the changes can be fantastic, I use these things like crazy.
[00:10:00] Jonathan: If you need to apply the same change across, again, thousands or tens of thousands of repositories, you really don't want to scale the human review of those changes to the number of changes being made. So in the end, what you really want is the actor to be deterministic. But building that machine, that cookie cutter that you can then use to stamp out low-variability cookies across the code base, can itself be developed with the assistance of an LLM.
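To make the "deterministic cookie cutter" idea concrete, here is a rough sketch (in Python, not OpenRewrite's actual API; the function and repository names are made up). The point is that the transform itself is an ordinary deterministic program, perhaps drafted with LLM assistance, so one human review of the pattern covers every repository it is stamped onto:

```python
# Illustrative sketch: a deterministic transform applied across many repos.
import re

def migrate_logging_call(source: str) -> str:
    """Deterministically rewrite a hypothetical legacy API call.

    Same input always yields the same output, so review effort does not
    scale with the number of repositories changed.
    """
    return re.sub(r"\blog\.warning\(", "log.warn(", source)

# "Stamping" the same change across a toy, in-memory fleet of repositories:
repos = {
    "billing-service": 'log.warning("low balance")',
    "user-service": 'log.warning("bad login")',
}
changed = {name: migrate_logging_call(src) for name, src in repos.items()}
```

An LLM might help write `migrate_logging_call` once, but running it is pure, repeatable code, which is the distinction Jonathan is drawing.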
[00:10:32] Simon: Yeah. Interesting. So where would you say the solution lives then, do you think? Is it completely non-LLM, or somewhere between an LLM and more traditional development?
[00:10:44] Jonathan: It's really a mixture. So what I started working on 10 years ago with OpenRewrite was really a refactoring engine.
[00:10:54] Jonathan: And it's a program, so it has a deterministic output every time. It has a unit testing framework and so forth to basically cover all the variations that are inherent in code. But it used to be that you would encounter problems like, say, I'm trying to move from one framework version to another,
[00:11:15] Jonathan: and I have to write out all these recipes. For something like a Spring Boot 3.4 migration, for example, there are almost 3,400 recipes in the catalog right now to deal with just that migration.
[00:11:34] Simon: And tell us what a recipe is. Maybe we'll go a little deeper into OpenRewrite.
[00:11:40] Simon: What's the function of a recipe?
[00:11:42] Jonathan: So a recipe is a program that makes a particular code change. That could be a dependency change, it could be a property change, it could be this API change from this to that. So if you imagine looking through the release notes of an upgrade, each of the steps in that migration guide would itself be a distinct recipe, operating within a larger composite recipe to accomplish that thing.
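The recipe-and-composite structure described above can be sketched as follows (a hedged Python illustration, not OpenRewrite's real recipe API; the specific replacements are just examples of the kind of step a migration guide lists):

```python
# Sketch: each recipe makes one focused change; a composite recipe chains
# them in order, like the steps of a migration guide.
from typing import Callable, List

Recipe = Callable[[str], str]  # a recipe: source text in, source text out

def change_dependency(src: str) -> str:
    # One step: bump a dependency coordinate (toy example).
    return src.replace("spring-boot:2.7", "spring-boot:3.4")

def change_property(src: str) -> str:
    # Another step: rename a configuration property.
    return src.replace("server.max-http-header-size",
                       "server.max-http-request-header-size")

def composite(recipes: List[Recipe]) -> Recipe:
    def run(src: str) -> str:
        for recipe in recipes:  # apply each step in declared order
            src = recipe(src)
        return src
    return run

spring_boot_upgrade = composite([change_dependency, change_property])
```

A real composite like the Spring Boot 3.4 migration simply scales this pattern up to thousands of such steps, each independently testable.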
[00:12:09] Simon: Gotcha. Okay. So I guess there are a couple of steps there. If you're going from version X to version X plus one, you first need to identify which recipes you need to think about, and then whether there are instances of that code you're actually changing that trigger certain recipes where you need to perform an action, I presume.
[00:12:32] Jonathan: That's right.
[00:12:33] Jonathan: That's right. And so, historically for us, we have thousands of open source recipes. If the problem statement you have happens to fit well to the catalog we have, great, you're ready to go and get value right now. If there's some custom change you want to make, that's gonna involve writing new recipes.
[00:12:59] Jonathan: And so the perceived cost of writing new recipes used to be, well, I have to learn the framework, and I just have to spend time doing it. And whether it was worth writing recipes or not had a diminishing return to it. It really depended on how many changes you had to make.
[00:13:20] Jonathan: And was it 60? Was it 60,000? Is it worth it or not? And what I'm seeing with LLMs right now is that the cost to write net new custom recipes is approaching zero. As one example, just a few days ago I was talking with a banking engineering executive who literally described this problem to me.
[00:13:42] Jonathan: They said, I'm trying to move from on-prem infrastructure to container-based infrastructure. The key problem is that right now a lot of these applications are writing logs to files. Instead, they have to be written to sysout, because we have a Splunk agent connected to these containers
[00:14:01] Jonathan: that will scoop up sysout and send it on to Splunk. I literally just transcribed what he was saying in a paragraph and said, go plan the recipes, go write them. And in the back of my mind I thought, well, I've been around software long enough to know that this is just gonna be a lot of logging configuration changes: you're gonna have to change a Logback configuration,
[00:14:23] Jonathan: you have to change a Log4j configuration. But what are all the variations of this that I'm gonna have to accommodate? And sure enough, Claude Code would just go out, make a plan of 6 or 10 or 15 steps, and just start knocking out the recipes. Because OpenRewrite has a very declarative framework.
[00:14:42] Jonathan: You assert the before and after text of the code, so it's hard for the model to cheat at the test. It writes this before-and-after test, and then it works hard to make that test pass, because the test itself is somewhat inflexible. So it's interesting that, as we think about writing frameworks like this, the more declarative we make our tests, the better the outcome is in the main source code.
[00:15:12] Simon: So would you have the LLM write the tests as well as run them?
[00:15:20] Jonathan: Yeah. So the LLM came up with 6 or 10 steps, whatever the case might be. And its first move is to say, this is what Logback configuration looks like before and after, and this is Log4j configuration before and after.
[00:15:32] Jonathan: And so it writes out all the tests, and then it starts working on the main source code of the recipes, working until those tests pass, essentially. It was about 20 minutes from me transcribing this message to having the first six functional recipes. We deployed that to that tenant and ran it on nearly 10,000 repositories.
[00:15:58] Jonathan: And here's all the applications that would have to change, and how many of them use this versus that. And then it becomes a feedback cycle of, do I like the change? Is there anything missing? And you just feed that back into the LLM: I didn't like this change, go change this. And it iterates on the recipe further.
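The declarative before/after testing style described here can be sketched in a few lines (a hypothetical Python harness, loosely modeled on the idea rather than on OpenRewrite's real rewrite-test API). Because the test pins the exact expected output text, the model can't "cheat": it has to keep refining the recipe until the text matches.

```python
# Sketch: a declarative before/after assertion over a candidate recipe.

def assert_recipe(recipe, before: str, after: str) -> None:
    actual = recipe(before)
    assert actual == after, f"expected {after!r}, got {actual!r}"

def file_logging_to_stdout(config: str) -> str:
    # Candidate recipe under test: route a toy logging config to the console.
    return config.replace('appender="file"', 'appender="console"')

# The LLM writes this assertion first, then iterates on the recipe above
# until the assertion passes.
assert_recipe(
    file_logging_to_stdout,
    before='<root appender="file"/>',
    after='<root appender="console"/>',
)
```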
[00:16:22] Simon: It's interesting, 'cause I guess there are very often gonna be some minor specifics where actually you might want to do something slightly different. Is that a manual review, or are there ways in which you've used other technology to act as a judge for those kinds of reviews?
[00:16:40] Jonathan: It depends on the type of change. So for us, and this is gonna sound strange, but we have a recipe, actually an OpenRewrite recipe, that we call compilation verification, where we will run some other set of recipes and then just verify that the compiler would pass on that file after the change.
[00:17:01] Jonathan: That actually doesn't require the whole build to run, so there's some special sauce to how that particular recipe works. But that itself is a judge. So I can basically work through the release notes, making all the changes, and see at the end whether I've missed something or done something wrong, and then that becomes a cycle the LLM can feed off of.
[00:17:25] Jonathan: There's of course gonna be a lot of changes, especially configuration-type changes, that are not verifiable with just compilation, or even running unit tests in many cases.
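The "compilation as judge" loop can be sketched roughly like this (an assumption-laden Python illustration of the spirit, not the implementation, of the recipe Jonathan describes; here Python's built-in `compile()` stands in for a real compiler check):

```python
# Sketch: after automated changes, check each file still compiles; failures
# become feedback for the next iteration of the recipe.

def compiles(source: str, filename: str = "<candidate>") -> bool:
    try:
        compile(source, filename, "exec")  # syntax check only, no execution
        return True
    except SyntaxError:
        return False

changed_files = {
    "ok.py": "x = 1\nprint(x)",
    "broken.py": "def f(:",  # a bad edit the judge should flag
}
failures = [name for name, src in changed_files.items() if not compiles(src)]
```

In the real system the judge works on the lossless semantic tree without a full build, but the loop shape is the same: change, verify, feed failures back.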
[00:17:34] Simon: Yeah, yeah, absolutely. In my mind I'm thinking, this sounds great. But once you do this at scale, across a ton of different repositories and projects, does this then start creating pull requests in every single project?
[00:17:53] Simon: And then who goes about merging all of those pull requests, more or less in unison? What's the logistics, I guess, of actually then pushing this into more of a production build?
[00:18:08] Jonathan: This is such a great question. And indeed, I think for most people that start thinking about large-scale change, the most natural thing for us to think of as developers is pull requests, mass pull requests.
[00:18:18] Simon: Mm-hmm.
[00:18:19] Jonathan: What I think we've identified over the years is that there's room for something we call pull-based change, and room for, you actually used the right word, push-based change. In general, my observation about the social dynamics of mass pull requests is that when a product team receives pull requests from some central team, they view it kind of like unwelcome advice coming from an in-law.
[00:18:45] Jonathan: They're just looking for a reason to reject it. I mean, you can imagine you're at that holiday table and your mother-in-law says, you know, you've got too many pounds on you right now, I think you could do something about that. You're gonna be like, hey, I'm busy, I'm getting older.
[00:19:01] Jonathan: You're trying to make excuses. However, if you look at yourself in the mirror and you think, you know, it's time, I'm ready to start hitting the gym, you're just more naturally likely to take that advice. And so we really focus a lot on how we put the problem in front of a developer in the kind of IDP sort of way: here's where you're at relative to an objective, but it's meant to be informative, and here's the button.
[00:19:32] Jonathan: At a moment and time of your convenience, this recipe's ready for you, and we've got you 99% of the way there, 95% of the way there, whatever the case might be. Do it. And so what you'll see is individual product teams doing pull requests for their subset of the business, and that has a much higher acceptance rate than centrally issued mass PRs.
[00:19:54] Simon: And when the pull requests are created, is there typically no work that needs to be done? Or is there a certain weight that comes with every single pull request, whereby development teams will need to, at a minimum, do some kind of review, but maybe also make changes?
[00:20:11] Simon: What's the fixed cost, I guess?
[00:20:15] Jonathan: It very much depends on the change. So something like a Log4Shell event, that's a severe vulnerability: high impact, we need to get it done now, short timetable. That's actually something that's good for push-based change. That's gonna be top-down driven, somebody mass-issues PRs to everyone.
[00:20:35] Jonathan: There's somebody with the clipboard, making sure everybody's done. But really, the review of that is pretty simple. It's gonna make the same kind of change everywhere. Once you see one of them, you just kind of mass commit them; you don't look at every individual change. So that's one of those things that gets through quickly.
[00:20:55] Jonathan: We have other kinds of changes. Like, I think about dependency vulnerability repair, bumping both direct and transitive dependencies whose vulnerability fixes are only a patch version away from the minor release I'm already on. Those are things, let's just push through, right? And almost automate the push-through of that.
[00:21:15] Jonathan: However, if the fix is a minor version away from the patch or minor I'm already on, eh, now it's a little more situational. You know, Jackson makes changes in minor releases after which the app may not be functional. So you've actually got the same recipe running in two different configurations, one of which is fully automated, and one of which runs maybe a little less frequently.
[00:21:37] Jonathan: Maybe we do it once a week and it requires a little bit of human review. And that's okay; we just do that a little bit differently.
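The two-lane policy just described, automatic for patch-level fixes, slower and human-reviewed for minor bumps, can be sketched as a tiny semver rule (the lane names and thresholds here are illustrative, not Moderne's actual configuration):

```python
# Sketch: route a dependency fix to an automation lane by semver distance.

def upgrade_lane(current: str, fixed: str) -> str:
    cur_major, cur_minor, _ = (int(p) for p in current.split("."))
    fix_major, fix_minor, _ = (int(p) for p in fixed.split("."))
    if (fix_major, fix_minor) == (cur_major, cur_minor):
        return "auto-push"     # patch-level fix: low risk, fully automated
    return "weekly-review"     # minor (or major) bump: needs a human look

assert upgrade_lane("2.15.2", "2.15.3") == "auto-push"
assert upgrade_lane("2.15.2", "2.16.0") == "weekly-review"
```

The same recipe runs in both lanes; only the review cadence around it changes.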
[00:21:45] Simon: Yeah. It was funny, actually, 'cause when you were talking about the capabilities there, my mind immediately thought of security issues.
[00:21:53] Simon: My background is in Snyk, of course, whereby developers are told there's a vulnerability at this level, and if it's transitive, maybe there's a version of the direct dependency you need to upgrade in order to consume that fix. From a developer's point of view, it's almost a distraction.
[00:22:11] Simon: And the more you can do to ease that, to say, look, to get security off your back, or to eliminate this backlog of issues, we've come 99% of the way to fixing this for you, with a level of confidence that you can feel good this isn't gonna break anything.
[00:22:32] Simon: It's been tested, and those types of things. That seems like a really super valuable use case.
[00:22:40] Jonathan: It's one of the pillars, yeah. So I think application modernization and security vulnerability repair, those are two key pillars, and ones we've known about for a long time. Maybe to transition to the third one, which is the one that's a little more surprising, I think, is this large-scale impact analysis.
[00:22:58] Jonathan: And so when I first started OpenRewrite, it was all about making a code change. I hadn't even imagined the large-scale search or impact analysis use case yet. We came to understand this, I think, one or two years into Moderne's existence, that
[00:23:16] Jonathan: a recipe could, in addition to making a code change, just mark something as found. I found this thing right here. So maybe I'm doing a search for a particular use of an API, and I'm gonna find all of those, and I'm able to use the richness that's in that lossless semantic tree. It's not just the abstract syntax, it's everything
[00:23:37] Jonathan: the compiler knows, all the transitive dependencies. So I really can accurately identify every single use of a particular API. And then there was one step beyond that, which was, well, it's great looking at all the individual search results, but I'm trying to get a holistic view of what these call sites look like across, again, thousands of repositories.
[00:24:01] Jonathan: So at that point, OpenRewrite accrued a piece of functionality called data tables, where recipes can emit rows of data with columns according to a schema of the recipe's choosing. And if you imagine recipes running on many different repositories, each of them is contributing rows to the same aggregated table.
[00:24:23] Jonathan: Then, when I'm looking for, say, an API like that, the end result is a table, a CSV or an Excel file or whatever, which is just the inventory of every single occurrence of this, and what repository it existed in, and what business unit it was in. And so more and more recipes, even if they were making changes, started producing these data tables as well.
[00:24:44] Jonathan: Just 'cause that was useful. Then turn the clock to 2024: tool and function calling comes about with large language models, and we suddenly recognize that we have in our possession, or the community's possession, thousands of these recipes producing data tables of various sorts.
[00:25:08] Jonathan: And if we improve the description text on those recipes, we can basically bolt all those recipes to a model as tools and say, Hey, when you have a question about the code at large, not just the code you're looking at right now, ask me. Just ask me if there's a recipe or series of recipes that would help you answer that question.
[00:25:28] Jonathan: And so a model's able to read the recipe description and its option descriptions and data tables and what they mean, select one, run it, get the output, and then pre-aggregate that in some way to surface some result. This has been the big sea change, I think, in our last eight or nine months at this point.
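The data-table mechanism can be made concrete with a small sketch (column names and the schema are my own illustration, not OpenRewrite's real format): a search recipe runs per repository, emits one row per occurrence, and the rows aggregate into a single fleet-wide inventory that can be exported as CSV.

```python
# Sketch: recipes emit rows per occurrence; rows aggregate across repos.
import csv, io

def find_api_uses(repo: str, source: str, api: str):
    """Emit one row per occurrence of `api` in this repo's source."""
    return [
        {"repo": repo, "line": i + 1, "api": api}
        for i, line in enumerate(source.splitlines())
        if api in line
    ]

repos = {
    "payments": "import legacy.http\nlegacy.http.get(url)",
    "search": "print('no legacy here')",
}
table = [row for name, src in repos.items()
         for row in find_api_uses(name, src, "legacy.http")]

# Export the aggregated inventory, one row per call site.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["repo", "line", "api"])
writer.writeheader()
writer.writerows(table)
```

With a good description attached, the same row-emitting function is exactly the shape of thing a model can invoke as a tool and then aggregate over.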
[00:25:55] Simon: That sounds really powerful. So you essentially have a mapping then between an LLM tool and a recipe, effectively, for the code in a repo. So if I were to ask questions of the code, of your project, the LLM will have deep insights into where certain things are being used and how they can be changed.
[00:26:25] Simon: And this is effectively exposed by an MCP server, is that right?
[00:26:32] Jonathan: Yeah. So it's both. We expose a direct chat experience where you can just ask a question about the code, again, not just one repository. At one of our largest customers, it's nearly 5 billion lines of source code under management.
[00:26:46] Jonathan: So that's the context we're dealing with, like 5 billion lines of source code. You can just ask a question, but you can then also expose that as an MCP server. And the key use case there is when I'm in Claude Code or something like that. Say, and I had an example last week, I'm in a microservice
[00:27:04] Jonathan: and I want to add a middleware cache to this microservice. I know I use middleware caches in a bunch of other services, and I want to use the same sort of pattern I use everywhere else. So I say, add a middleware cache, please make it consistent with the rest of my code base.
[00:27:22] Jonathan: Well, Claude's training set doesn't know what's consistent in the rest of my code base, nor does looking at that one repository help, since it lacks a middleware cache right now; it can't infer that from the code as it is. And so that's an example of where Claude then reaches out to that MCP server and says, go find me all the places where this type of middleware cache is used,
[00:27:46] Jonathan: and give me the examples. And so it'll pull examples, one at a time, from different repositories and use that to inform how it develops that new code over here.
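A hypothetical sketch of the lookup tool Jonathan describes (the function name, `MiddlewareCache` pattern, and repository contents are all invented for illustration; a real MCP tool would query the lossless semantic trees rather than raw text):

```python
# Sketch: a tool a coding agent can call to pull usage examples of a
# pattern from the wider codebase, so new code stays consistent.

def find_examples(codebase: dict, pattern: str, limit: int = 2):
    """Return up to `limit` (repo, snippet) examples containing `pattern`."""
    hits = []
    for repo, source in codebase.items():
        for line in source.splitlines():
            if pattern in line:
                hits.append((repo, line.strip()))
                break  # one representative example per repository
    return hits[:limit]

codebase = {
    "orders": "cache = MiddlewareCache(ttl=60)",
    "inventory": "cache = MiddlewareCache(ttl=300)",
    "auth": "no caching in this service",
}
examples = find_examples(codebase, "MiddlewareCache")
```

The agent feeds the returned snippets back into its context, which is how it learns a house pattern that neither its training set nor the current repository contains.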
[00:27:59] Simon: Yeah, absolutely. And from the agentic point of view, where you have agents asking these questions, thinking about it more,
[00:28:08] Simon: having this available to those agents is gonna provide more valid and correct answers as they create.
[00:28:21] Simon: What if we were to look at more of the open source side? So obviously there are gonna be a number of things we'll be consuming from open source, as well as our internal projects and repos.
[00:28:34] Simon: Is there support available for being able to analyze and look at other projects out in the open as well?
[00:28:43] Jonathan: Yeah. So for our customers, we'll do what we call mass ingestion. We'll ingest their whole code base in the form of these lossless semantic tree artifacts.
[00:28:53] Jonathan: And that's really the same data model that open source OpenRewrite uses, but serialized to disk. And so these MCP servers and so forth are able to operate on vast business units rather than going through and parsing each repository one at a time. That same thing we're doing for private code, we've done for close to 4 billion lines of open source code now.
[00:29:15] Jonathan: So we run that service at moderne.io as a kind of free service for anybody. It's just available via GraphQL API and so forth. So yes, in this case the model's able to look at proprietary code and open source code, to the extent we know about it, equivalently.
[00:29:43] Simon: Sounds good. And going forward, I guess, thinking further out into the distance, whereby prompts turn into specs and things like that, what's your longer-term vision of how, as AI becomes more capable and people lean more into AI, that works with things like OpenRewrite?
[00:30:11] Jonathan: I think the key value here for us has always been just data. There's data about code that's not just at the syntax tree level or the text level, but everything the compiler knows. And that's unfortunately a very difficult thing to produce, because you have to inject yourself into the compiler to get that data out.
[00:30:31] Jonathan: And how even to invoke the compiler for an arbitrary enterprise repository is a very layered problem, I would say. So I don't think the value of that data, essentially a data lake of deep code data, is gonna go away. The kinds of ways we apply it are gonna continue to change. And this example here of allowing, say, a coding agent to look up the way code is used elsewhere
[00:31:04] Jonathan: Just one new use case I wouldn't have imagined a year or two ago. Uh, but, um, as for spec driven development right now, I think, um, that day feels like it's, it's here, right? It's already, it's already a reality. It's definitely how I use coding agents. I always ask them to write out a plan, um, to make that plan, to reify it on disk so it's [00:31:30] not just in its memory.
[00:31:31] Jonathan: Um, to continue to edit that plan as it moves along. Um, and, uh, you know, more recently I've started to think, well sure, put the plan on disk, but when I'm doing something new and something cool, I have it write the plan to disk. I get submodule clone our documentation into that repository and I say, if there's room to update the documentation with this, go ahead and do that too.
[00:31:58] Jonathan: And then the last thing is to add, to have it writing a blog post as we're working on this feature together. So it's really doing three different activities, updating documentation, writing that blog post, that's gonna explain our work together and also, you know, maintaining a detailed plan about, uh, about what we're gonna do and what we've done.
[00:32:20] Simon: Yeah. Yeah. And it seems to me almost like, uh, the, the, you know, the code generation still needs to happen, and it seems like you're injected at that stage whereby when the LLM needs to make decisions about what to build, what to write, how to make changes, how to upgrade versions and things like that.
[00:32:36] Simon: That's where it's automatically injected and it's, and it's, and it's using Moderne or OpenRewrite there at the source.
[00:32:44] Simon: Yeah. Awesome. Um, any, any next steps in terms of, uh, uh, you know, what's, what's next? Uh, what are the biggest challenges right now for you and what's next?
[00:32:54] Jonathan: For me, I think that the big responsibility that's on our shoulders is to expand that lake.
[00:32:59] Jonathan: [00:33:00] Um, so it's, it's, it's always adding new languages, um, really deeply integrating with their tooling and their compilers and just extracting as much density of information as we can out of them. Um, because, and it, so one of the really interesting and surprising results of the last couple years has been as we've added new languages like JavaScript and Python and C#, we discovered that the loss of Semantic Tree model for those is highly overlapping with the Java original loss of Semantic Tree to the point where we actually have them extend the base JLST that's formed
[00:33:40] Jonathan: our, uh, Java capability to this point. What that leads to then is that some recipes we originally wrote for Java automatically work on C# and Python and JavaScript in a really surprising way. So, um, think about a, um, a search recipe called find SQL, which is just looking for string literals that have SQL in them, or binary concatenations of literals or, um, when we wrote that originally for Java, because JavaScript, Python, C# extend from J and used the same J literal construct.
[00:34:09] Jonathan: That recipe actually works there as well. So every time we add a new language, it's like, it's like flipping on a light switch on another corner of the world that we didn't look at before. And um, and to me that's the, the, a big part of our, our roadmap is just to continue to turn the lights on more and more places.
[00:34:28] Simon: Yeah. Awesome. [00:34:30] Uh, for people who wanted to, uh, wanted to try it out, uh, obviously you mentioned OpenRewrite is open source so people can have a play with that. And, uh, and Moderne?
[00:34:39] Jonathan: Yeah, absolutely. Uh, you know, our site Moderne has, uh, of course references to the OpenRewrite docs as well, along with a, a number of other materials.
[00:34:49] Jonathan: And, uh, if you want to go straight to OpenRewrite.
[00:34:53] Simon: Awesome. Wonderful. Well, Jonathan, it's been a real pleasure, actually, very insightful. Thank you for, again, thank you for going with us on some of these problems. Um, yeah, feel free to, uh, feel free to jump to some of those, uh, some of those links for those, uh, for those who are listening.
[00:35:08] Simon: And, uh, we really appreciate your time, Jonathan.
[00:35:11] Jonathan: Thank you.
[00:35:12] Simon: Awesome. Thanks for listening, everyone, and, uh, tune into the next episode.
Chapters
In this episode
Jonathan Schneider, co-founder and CEO of Moderne, joins Simon Maple to share how real engineering teams are using Moderne’s rewrite engine to reduce technical debt at the source, and drive org-wide transformation.
On the docket:
• the hard limit LLMs can’t scale past
• why OpenRewrite uses a declarative framework
• the real challenge: accessing the compiler’s truth
Introduction to Moderne
In this episode of AI Native Dev, Jonathan Schneider, CEO and co-founder of Moderne, joined Simon to discuss code maintenance, large-scale refactoring, and the evolving intersection between LLMs and deterministic code transformation. Moderne, built on the OpenRewrite refactoring engine, targets the growing burden of maintaining enterprise-scale codebases and keeping them up to date with modern frameworks and standards.
Origins and Motivation
Moderne’s roots trace back to Schneider’s time at Netflix, where traditional developer dashboards failed to drive code modernization. Developers consistently deprioritized maintenance in favor of feature development. This led to the realization that tooling needed to “do it for them.” OpenRewrite was created to provide deterministic, automatable transformations that eliminate technical debt without developer intervention.
The Reality of Software Maintenance
Jonathan emphasizes that maintenance isn't just about repaying technical debt—it’s continuous upkeep required to ensure that applications remain functional as dependencies evolve. Approximately 30–40% of engineering effort is spent on this type of "code hygiene." With thousands of repositories and constant changes to APIs and frameworks, keeping codebases modern has become a scaling problem.
AI's Role and Its Limitations
While LLMs excel in authoring experiences on single repositories, they struggle with scale, determinism, and consistency across thousands of codebases. Larger context windows reduce attention precision, and LLMs introduce probabilistic variability in critical operations. However, LLMs can assist in generating OpenRewrite recipes by understanding before/after code states and iteratively refining code transformations.
OpenRewrite and Recipes
OpenRewrite operates on deterministic “recipes”—modular programs that make atomic code changes like modifying dependencies or migrating framework APIs. For example, a Spring Boot 3.4 migration may involve over 3,000 recipes. Recipes are composable, testable, and increasingly AI-assisted. LLMs can generate these recipes by understanding high-level intent and translating it into precise transformations.
Scaling Code Transformation
Rather than mass push-based pull requests, Moderne emphasizes developer-initiated pull-based changes. Product teams are more receptive when they opt-in at their convenience. High-urgency issues like vulnerabilities (e.g., Log4Shell) justify push-based approaches, but most modernization is better accepted when surfaced contextually within developer workflows.
Automation, Validation, and Feedback Loops
Recipes can verify compilation after transformation and emit data tables that support large-scale impact analysis. These tables enable LLMs and agents to reason across repositories, answering questions like “Where is this middleware cache pattern used?” with specific examples from billions of lines of code. This transforms OpenRewrite into a deep knowledge base for agents.
LLM Integration and MCP Servers
OpenRewrite recipes are exposed to LLMs via tool calling and MCP (Model Context Protocol) servers, enabling agents to query proprietary and open-source code at scale. This unlocks agentic workflows where models suggest changes based on usage patterns from across a company’s codebase, not just the active file.
Looking Forward
Jonathan’s vision centers on expanding language support and deepening integrations with compilers to build a universal, lossless semantic model of software. Surprisingly, recipes built for Java have proven reusable across Python, JavaScript, and C# due to shared underlying abstractions. As Moderne continues to grow its “data lake” of code intelligence, the platform becomes foundational for spec-driven development, documentation generation, and AI-assisted transformation across the software lifecycle.
Conclusion
Moderne, with OpenRewrite at its core, is reshaping how enterprise teams approach modernization. By combining deterministic refactoring with AI acceleration, it allows engineering teams to scale their modernization efforts while maintaining confidence and control. As LLMs mature and agent frameworks evolve, Moderne positions itself as the infrastructure backbone for AI-native code transformation.
Related Podcasts

LLMS:
THE TRUTH
LAUER
Can LLMs replace structured systems to scale enterprises?
24 Jun 2025
with Jason Ganz

CAN AI TOOLS
BE TRUSTED?
Can AI Tools Be Trusted with Security-Critical Code? Real World AI Security Risks
15 Dec 2024
with Liran Tal