Episode Description
Patrick Debois, the mind behind “DevOps,” joins Simon Maple to unpack the system-level shifts AI is driving across software engineering, drawn from what he saw firsthand at the AI Engineer World's Fair. They also get into: • how inconsistent codebases confuse AI • why running agents locally is becoming obsolete • inside OpenAI’s concept of “model specs”
Overview
A Ground-Zero Event for AI Engineers
Patrick Debois reflects on the AI Engineer World's Fair in San Francisco, a conference unlike others that bolt AI onto existing agendas: this event is exclusively focused on AI-native development. Drawing 3,000 attendees and leaders from across the AI tool ecosystem, it's become a proving ground for emerging ideas in agentic coding, developer workflows, and infrastructure automation.
Agents Take Center Stage
One of the biggest shifts Patrick noted was the sheer dominance of coding agents. Where coding was once a side topic at AI conferences, it is now central. Most tooling vendors at the event are shifting toward agentic experiences—autonomous, task-driven systems that go beyond autocomplete. While industry adoption is still maturing, the tooling space is rapidly aligning around the agent paradigm.
A New Paradigm Every Six Months
Patrick warns that using AI tools the same way every six months is a mistake. Agent-based workflows have changed the game, shifting from chat-based prompting to spec-driven automation and headless execution. Tools like Claude Code are pushing asynchronous, CLI-based workflows, moving beyond the IDE as the core interaction surface.
Specs Are the New Code
One key insight: specifications are becoming central artifacts. Inspired by OpenAI's internal use of model specs, developers are beginning to treat specs not just as input but as the source of truth. With good specs, you can align teams, generate tests, and even regenerate implementations. BDD tools like Gherkin and Cucumber serve as bridges between intent and validation, strengthening AI alignment.
Agents Move to the Cloud
As agents take on more complex and long-running tasks, developers are pushing execution to the cloud. Local machines can't provide the CPU headroom or runtime reliability these workloads need. Tools like Cursor now offer cloud agents that interact with Git repos directly, and developers are shifting to containerized, sandboxed execution environments to manage trust, permissions, and scale.
Parallel Execution & The Burden of Review
AI coding is also becoming parallelized—splitting work across agents for speed or variation. But while generating multiple solutions is easy, evaluating and merging them is still a human burden. The next challenge is developing UI and orchestration tools that can assist with review, comparison, and synthesis of parallel outcomes.
CI/CD Is Shifting Left—Again
Agentic workflows are triggering a new wave of left-shifting CI/CD. Tests, code validation, and environment replication are happening earlier, closer to the developer’s editor. With containerized environments and real-time execution, feedback loops are shortening and becoming more aligned with dev workflows.
How Much Productivity? It Depends
Claims of 5x or 10x productivity from AI tools vary widely. Patrick explains that effectiveness depends on task complexity, codebase consistency, and language popularity. Simpler projects or newer stacks may benefit more, while older, inconsistent codebases confuse both humans and LLMs. Productivity gains are real but nuanced.
Constant Change, Constant Learning
The episode closes with a reminder: AI dev workflows are evolving at a pace unlike anything before. What works today may be obsolete tomorrow. Developers must embrace experimentation, stay flexible, and continually re-evaluate their workflows to stay ahead.
Resources
Patrick Debois - https://www.linkedin.com/in/patrickdebois/
Simon Maple - https://www.linkedin.com/in/simonmaple/
Tessl - https://www.linkedin.com/company/tesslio/
AI Native Dev - https://www.linkedin.com/showcase/ai-native-dev/
Chapters
(00:00) Trailer
(01:03) Introduction
(02:57) AI Engineer World's Fair: Conference Overview
(07:38) Seven Key Lessons from the Fair
(08:51) AI Coding Agents Everywhere
(11:14) Evolving AI Dev Tool Practices
(12:23) Specs as the New Code
(15:00) Agents Running in the Cloud
(24:00) Parallel Agent Execution
(36:58) CI/CD Shifting Left
(40:55) AI Productivity Multipliers
(47:15) Outro
Full Script
[00:00:00] Simon: Hello, and welcome to another episode of the AI Native Dev. My name's Simon Maple. I'm your host for the day, and today we're gonna be talking a lot about a small little conference, which happens every now and then in San Francisco, called the AI Engineer World's Fair.
[00:00:39] Simon: You may have heard of it. And actually, one of my colleagues, Patrick Debois, who's been on the podcast, gosh, countless times now, visited the conference a month ago or so, whenever it was, and had some really interesting takeaways, and actually wrote a blog, Seven Lessons from the AI Engineer World's Fair.
[00:01:00] Simon: So we're gonna go through some of those and talk in depth about one or two in particular. But Patrick, welcome to the podcast once again. You're a regular here. You're almost joining as often as I am. How are you?
[00:01:15] Patrick: I'm great. Uh, you can't really tell whether it's me or an AI though, but we're all good.
[00:01:20] Simon: We'll assume it's an AI unless you tell us otherwise. How's that? Yeah, let's do that. Awesome. So Patrick, what do you do these days?
[00:01:31] Patrick: Well, I'm having fun, uh, playing around with the whole AI craze, especially now in the space of, uh, coding and how it actually is improving the SDLC.
[00:01:40] Patrick: Um, I feel like I'm discovering new ways of working every day. Uh, and it's great, like, you know, on the community side, everybody's sharing. It's a lovely time to be alive.
[00:01:52] Simon: Yeah. And you make a lot of contributions to the AI Native Dev. Um, one of the, you know, major contributions obviously was the landscape.
[00:01:59] Simon: The tool that allows you to effectively keep track of a bunch of AI dev tools. That was one of your kinda big projects for the AI Native Dev. One of the blogs that you wrote, as I mentioned, was the seven lessons that you learned from the AI Engineer World's Fair.
[00:02:18] Simon: Tell us a little bit about the AI Engineer World's Fair. First of all, it's a conference. Uh, it's in San Francisco, not one of my most favorite places that I tend to go, but let's talk about that as well. What's the conference like? What's the vibe of it? Maybe I shouldn't say vibe these days, but what's the style of the conference?
[00:02:37] Simon: Who are the typical people who go, what's the type of sessions that are that are there?
[00:02:41] Patrick: Yeah, so these days, um, you know, when everybody's talking about AI. I found it really hard to find a conference that actually is on point. There's a lot of events that slap on more AI. You know, I'm at a traditional developer conference and oh, we'll have some talks on AI, or we are already doing machine learning, so we'll do more AI there.
[00:03:04] Patrick: So there's a lot of events happening out there, but I feel that it's bolted on, kind of like explaining a little bit more of what it does.
[00:03:15] Simon: Mm-hmm.
[00:03:16] Patrick: The AI Engineer World's Fair was actually one of the first events that put it front and center: this is AI native, this is what we're doing. Right. It wasn't about all the things of the past.
[00:03:29] Patrick: It was all the new things. And I think they've been running this now for two years, two and a half years, and every time the event gets bigger. It was organized by, you know, it's still the same people from the start, like swyx, and he also kinda coined the term the AI engineer, and it kind of grew from there.
[00:03:51] Patrick: So it was like a label AI engineer, and they grew the conference. Now, the cool thing about this, uh, being, um, so much on the spearhead of the new technology is that all the new tools, all the new people, all CTOs, CEOs of those new tools are still accessible because they all wanna be there. They all wanna learn.
[00:04:15] Patrick: This is where the head of the industry is kind of going. That's why I love going to these events, because it is not just a rehash of whatever we are doing; this is where emerging patterns and lessons learned show up. Obviously it's always kind of educational for people who are not that familiar with this, so they have workshops, but if you go into
[00:04:36] Patrick: their level of content, it is forward thinking. It is things that are on the edge. Uh, and I think they also have no shortage of content, because they stream most of this live for free. So they recognize everybody, you know, wants to be there because it's all about the community and the chats where you're really learning.
[00:04:58] Patrick: All the lessons from the videos are free. Um, and it's just amazing how much energy you have when you come into that room. It's pretty big. I think they had roughly 3,000 people right now. Yeah. Which is, you know, not bad for a two-year-old conference. Mm-hmm. Uh, but yeah, anyway, they got the support from the whole industry, from big players all around.
[00:05:21] Patrick: And uh, and that's why I love going there. It's not because I like San Francisco. Yeah. You know, I gladly make an exception for this event to go.
[00:05:28] Simon: It's interesting actually, 'cause a lot of events in recent years, five, 10 years or so, actually moved a little bit away from San Francisco. But these days, AI is obviously, you know, when you look at a lot of the investment and a lot of the startups, they do tend to focus very much at the core of San Francisco, for, you know, the Silicon Valley space.
[00:05:50] Simon: Um, so it kinda makes it more natural for an AI engineering conference to exist there, which is I guess why they keep it there. But it'll be interesting to see if they at any point do look to other places. I think earlier this year they had the AI Engineer Summit in New York as well.
[00:06:13] Simon: Right. Which is another interesting one.
[00:06:14] Patrick: That one was more like, um, they called it invite only. I think it was still accessible in a way; if you really wanted to, you could come. Mm-hmm. But it was not that, you know, the ticket sales were open to everybody. Yeah. And that's more of a regrouping, like a smaller, forward-thinking group.
[00:06:33] Patrick: So for example, where in the past they had many tracks, like, you know, RAG and so on, that whole conference is gonna be about agents, so they kinda focus on this really hard. Um, so one more invite-only, one open to the very broad public. And I keep hearing rumors there might be one happening in Europe as well, maybe in fall.
[00:06:56] Patrick: So definitely they're thinking about doing this closer for us, uh, closer to home. So,
[00:07:01] Simon: yeah. Yeah, absolutely. Um. Let's jump in. Uh, there are, there are seven learnings that you had. Should we, should we name them very, very briefly? So one is about AI coding agents being truly everywhere. The second one is about how we need to change the way we think about using AI dev tools today.
[00:07:19] Simon: The third is about how specs are turning into the new code. The fourth is about how agents can be running in the cloud. The fifth is about parallel execution and how you can effectively explore the results of AI in parallel. The sixth is about CI/CD and whether that is shifting left. And then finally the seventh is about the productivity gains that people can get from using AI.
[00:07:47] Simon: And how many Xs will it deliver? Will it be a 2x, a 5x, a 10x? We'll see. But let's jump in with the agents everywhere. And it does seem, in fact, there's a TLDR that I loved at the top of your blog, which is, I'll read it out: cat ideas, pipe it into specs, pipe it into parallel agents, which are running asynchronously,
[00:08:06] Simon: pipe that into your review, and then pipe that into "3. profit" as normal. Uh, which I think is a nice TLDR. But let's jump into that first one, the first learning: AI coding agents are truly everywhere. And this is really about how everyone is trying to effectively have an agent mode, trying to showcase that their tool is now supporting an agentic way of working.
[00:08:33] Simon: Um, talk us through what you, what you saw there.
[00:08:35] Patrick: So one of the things to note on the AI Engineer conference is that, um, in the last couple of events, they were talking about agents and RAG and embeddings and fine tuning and all that stuff. So the very broad AI engineering. Mm-hmm. For me, this event was the first one that had a large percentage of the talks focused on coding, where in the past coding was like,
[00:09:01] Patrick: ah, yeah, it's there. Like, you know, you use kind of a copilot and that's done. But because they've all become so powerful, it felt like, the tracks, there was a full track on coding agents, and more of the bigger players were announcing this. So that was truly everywhere in that sense. Yes, and the explanation is often that, you know, you can look for a lot of use cases to use more AI, but coding turns out to be this amazing, valuable piece to use AI for.
[00:09:31] Patrick: So that kind of realization. And it also shows in kind of what people are looking for in that track. So truly everywhere: coding agents is not just something on the side that we do, it's dead center, also at this event.
[00:09:45] Simon: Yeah. And, and is it, is it also showing that as users we are trusting AI much, much more?
[00:09:50] Simon: 'cause we are leaning into that autonomy. What, what, what's, what's the signal?
[00:09:53] Patrick: I think the jury is still out there, so all the vendors are showcasing what they can do. I think that, uh, I hadn't seen much of, um, "I am a company and I'm introducing all these tools and this is the value I got from it." So that's probably gonna be next year, the more that kinda matures.
[00:10:15] Patrick: I think at the New York one, there was one interesting talk about how Booking.com kind of introduced this. So that was like the biggest story that I heard. Besides that, it's all about the vendors kinda defending and saying how they improve this.
[00:10:34] Simon: Yeah. Okay, cool. Let's jump into learning number two. Interesting one. Um, what this post is saying here is, if you use AI tooling like we did six months ago, we're making a mistake. Um, what's the major difference? Is it agentic again?
[00:10:53] Simon: What's the, is it, is it the fact that we need to run in a more agentic way? What's the, what's the, what's the call here?
[00:10:59] Patrick: Mm-hmm. So, um, I would say that in the past, when we have a new technology, the cloud, mobile and so on, we see that initial spark and then we can say, oh, we just need a few more cycles to kind of polish and improve this.
[00:11:15] Patrick: We can see where this is going. With AI coding, it feels that every six months we'll have to reinvent the way we actually use this. So that's one of the points that was made: the Sourcegraph talk says, you have a practice, like, oh, I can do completion, alright. Oh no, all of a sudden I have an agent, so I'm doing prompts now.
[00:11:38] Patrick: So kind of that, and then after the prompts, somebody said, okay, we're doing agents, but now it becomes about specs. So we see that kind of reframing is important. And so one of the biggest mistakes is actually that you keep using the completion and say, I got it, right? So you have to kind of keep embracing the new possibilities
[00:12:00] Patrick: unlocked by the new kind of technology that's being embedded in the tools. And we're still figuring out what the ideal kind of workflow looks like. But every time, every six months, we unlock a certain new set of tools, kind of a new way of working, which is great. Uh, in that perspective, there were some interesting smaller tidbits; for example, OpenHands saying, um,
[00:12:25] Patrick: you know, in the past we would say the agent runs off, comes back to us, and we see the pull request of an agent happening, and agent A does this, agent B did that. But they found out it wasn't about that.
[00:12:41] Patrick: They needed to actually have a human that was responsible for the agent's actions. So they started changing the way the PRs work so they're actually on behalf of a user, because they found out that people respond better to a user than to an agent telling us what to do. Right. Well, it could be the other way around as well, but they kind of turned this into that.
[00:13:00] Patrick: It was always like: this agent did that on behalf of that user. So that was a smaller thing. And then the other thing was, oh, we just got used to using it more in our IDEs and other things, and all of a sudden we are switching to more terminal-based CLI tools. Uh, maybe that's the part where it's more autonomous, asynchronous, because it's happening somewhere.
[00:13:24] Patrick: It runs, we don't have to see that much anymore, and it comes back to us when it's done. Uh, one of the newer, typical features of those AI tools is that they have a notification. We all know notifications, we hate them, right, from social media or internal chat systems. And now the agents will send us like, hey, we're done.
[00:13:43] Patrick: Like, you need to do something. Because that asynchronous thing is also kind of changing how we interact with them.
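To make that asynchronous, notification-driven workflow concrete, here's a minimal Python sketch (not something shown at the event): kick off a headless run, then ping a chat webhook when it finishes instead of babysitting the terminal. The `agent` CLI and the webhook URL are hypothetical placeholders for whichever coding agent and chat system you actually use.

```python
import subprocess

import requests  # assumes the 'requests' package is installed

NOTIFY_URL = "https://hooks.example.com/agent-done"  # hypothetical webhook endpoint

def run_agent_task(prompt: str, workdir: str) -> str:
    """Run a headless coding agent on one task and capture its output.
    'agent' is a stand-in command, not a specific vendor's CLI."""
    result = subprocess.run(
        ["agent", "run", "--prompt", prompt],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return result.stdout

def notify_done(summary: str) -> None:
    """Send the 'hey, we're done' notification Patrick describes."""
    requests.post(NOTIFY_URL, json={"text": f"Agent finished: {summary}"}, timeout=10)

if __name__ == "__main__":
    output = run_agent_task("Refactor the payment module per SPEC.md", ".")
    notify_done(output[:200])
```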
[00:13:49] Simon: And it's really interesting actually, because, yeah, you mentioned obviously things like Claude Code, you've got Codex, you've got the Gemini model.
[00:14:12] Simon: So TUI, your TTY. Um, you get things like Claude Code, obviously. It really is one of the first tools that have gone kinda mainstream to really change the way in which developers code.
[00:14:33] Simon: Because it's entirely terminal based. A lot of the way that I see people use that, though, they don't just stay in Claude Code and continue to, you know, prompt and only stay in that terminal. A lot of people that I know of, they use Claude Code, make the changes, then they, you know, see the changes, not in the terminal per se, but in their typical IDE.
[00:15:01] Simon: And so they're bouncing between the two. So while we look at things like Claude Code being headless, it's not a headless experience that we always stick to. It's the tool that is headless, and we're actually still sitting in our IDE of choice if we want to look at the code more, to be able to traverse the code, to understand what the code's doing, to be able to do those reviews, and then you effectively use whatever it is, whether it's Git or something like that, to effectively do your rollbacks and maintain the state
[00:15:32] Simon: of your versioning. Would you say that's fair?
[00:15:38] Patrick: So remember that we talked about using the old paradigms from six months ago? Yeah. And six months ago we were chatting, so having a chat with an agent was a very synchronous thing. Mm-hmm. So you could definitely use a terminal to do the same thing.
[00:15:51] Patrick: You just chat and you kind of have this happening. Now, people have been saying it's the end of the IDE. Um, you could say maybe it's the end of the code editor, but you'll still have to do the review, so you somehow need UI or some way of explaining things. As you mentioned, when it's done, you kind of have to visualize and see whether it's okay.
[00:16:14] Patrick: So it's not completely headless, but it could go off, if you really embraced the asynchronous coding way: you would give it a larger chunk of tasks and it runs for a couple of hours and then it comes back to you. So, mm-hmm, that's the same thing. Uh, but that means you'll have to specify more upfront, more than just that simple task or that one thing, otherwise it's more guessing; the point is you don't have to stay in that loop.
[00:16:42] Patrick: So that's why you can use the technology in a more synchronous way, but if you embrace the asynchronous way, you start doing more of those kinds of specifications and bigger tasks that you need to do.
[00:16:57] Simon: And talking of specifications, that's your learning number three: how specs are the new code.
[00:17:01] Simon: And I think this is kind of like, we're seeing this being mentioned more and more, and I think specifically at this event it was Sean Grove, who works at OpenAI, and he was talking about a number of things called model specs and things like that. Um, can you give us a kind of brief overview? And we shared this video of course behind the scenes as well, 'cause it was very on the mark for us.
[00:17:26] Simon: Tell us a little bit about how OpenAI is thinking about model specs and specifications.
[00:17:37] Patrick: So they talked about a spec being important to guide a model for certain generations. And internally they have a concept called a model spec. And in that model spec, they say things like: you should always be friendly in what you're generating; it should be on point; if you don't know things, you should do this.
[00:17:58] Patrick: So the example he gave was more about training a model and making the model actually respond to those things. But the same thing you can actually then apply not to training the model, but to asking the model what to do. So if you give it specs to do a certain task or to code it, you kind of
[00:18:19] Patrick: give it that longer or that bigger chunk, and it does it. If you want to change something, it's not about the kind of disposable prompts that you never record. You just change the spec and you generate from that again. So that kind of reasoning for model training with specs, and then code generation with specs, kinda started to coincide in that presentation.
[00:18:44] Patrick: And what was really nice is that he said they actually use it for alignment of the models and the AI, but what he also said was: writing a specification actually aligns humans, because you have to agree on what the task at hand is. So there's a benefit on both sides, human and not. I explain this sometimes as: what you're describing for a model is almost like the code of conduct of your model.
[00:19:13] Patrick: Like, this is how I think you should behave, and you kinda bake that into the model. But it's very similar for your code: this is how I want you to write my code, this is my directory structure, this is my language, these are my coding conventions, this is my process. So you can put all those things in the specs.
[00:19:31] Patrick: And so hence he, you know, concluded that the specs are bigger than the code, because if you change the spec, you can generate the code from there. And that is very much aligned to the spec-driven development that we often talk about as a way of working. So it was a surprise coming from that angle, from OpenAI and model training, and then all of a sudden, you know, kind of crossing the chasm to code generation and saying, actually it's valid there as well.
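As a rough illustration of "change the spec, regenerate the code" (a sketch, not how OpenAI or any vendor actually implements it), here's what a tiny regeneration loop could look like with the OpenAI Python SDK. The spec file, output path, and model name are assumptions made for the example.

```python
from pathlib import Path

from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def regenerate(spec_path: str = "SPEC.md", out_path: str = "src/feature.py") -> None:
    """Treat the spec as the source of truth: edit SPEC.md, rerun, get fresh code."""
    spec = Path(spec_path).read_text()
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Generate Python code that satisfies the given specification exactly. Return only code."},
            {"role": "user", "content": spec},
        ],
    )
    Path(out_path).write_text(response.choices[0].message.content)

if __name__ == "__main__":
    regenerate()
```

The point of the sketch is the direction of the arrow: the prompt is disposable, the spec is versioned, and the implementation can always be rebuilt from it.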
[00:19:59] Simon: Yeah. And I think you mentioned in the blog as well how tests play such an important role in effectively being able to validate a specification. So obviously from a specification we're gonna need to have some implementation created, and it's the tests that are written that need to validate that. I guess some would say, okay, the tests are part of that specification.
[00:20:25] Simon: Um, and some would say the tests are actually, as a result, more important than the code, because they're proof of your intent. Um, how much was that mentioned? 'Cause I feel like that's a very crucial part.
[00:20:39] Patrick: I think you said it: if a behavior isn't specified in the specification, then
[00:20:51] Patrick: it actually shouldn't be important. So if you really need it, it needs to be written down, it needs to be in the specification. But we all know executable documentation, remember that word? Yeah, from a couple of years ago. Yeah. And it's the very same thing. So we write the cases that it should actually work with.
[00:21:09] Patrick: Those are the tests, and they're an integral part, kind of like examples. You know, obviously in AI you would do this with multi-shot examples in your prompt, but this is very similar to your tests being described.
[00:21:23] Simon: Mm-hmm.
[00:21:23] Patrick: Now, I personally believe that this helps actually with the generation because, you know, the actual LLM will look at the, the, the tests or the examples and try to figure out whether that works or not.
[00:21:36] Patrick: Now, the jury's still out on whether we can trust this completely, but that's a different story. But the fact that it generates the tests still leaves it to the human to actually verify whether the tests that were generated were correct. And actually, the specification will help again with that alignment as well, because now we can see the examples in the specs.
[00:21:59] Patrick: So the human can also kind of verify whether the behavior was correct.
[00:22:04] Simon: Yeah,
[00:22:04] Patrick: In both ways.
[00:22:05] Simon: It was interesting actually, because I met up with Baruch, who we had on the podcast a couple weeks ago. Um, and one of the things that he was talking about was around using an existing framework like, um, like,
[00:22:18] Simon: you know, Gherkin and Cucumber. So Gherkin specs and Cucumber, the tool that allows you to create that BDD-style approach. His approach to something like this is to be able to create a set of test descriptions, effectively, in the Gherkin format that could then be validated as, you know, natural language.
[00:22:42] Simon: And so you can kinda look at it and say, yeah, I agree, I agree, I agree. And it's effectively part of your specification then, right? 'Cause it shows that intent, and then you use something like Cucumber to then turn that into tests. There's no LLM, there's no, what's the phrase?
[00:23:04] Simon: Yeah, there's no LLM, there's no non-determinism, you know, so long as you get it right, your given this, when that, then that. So long as that's right, then you will get those tests that actually perform that, and you have that greater level of confidence in what those tests are gonna be.
[00:23:25] Patrick: But in BDD, the developer actually wrote that code, right?
[00:23:28] Patrick: Correct, like, you know, code that fulfills that test. So it is the business person kind of describing the business functionality. Yeah. That turns into a behavioral test, and then that gets implemented. Now you can use that same kind of language to specify things in your examples, for example, no pun intended, and then the LLM just generates from that the actual code, something that adheres to the BDD test that you provided.
[00:23:59] Patrick: It can get it wrong, so we're not there yet that it's a hundred percent. Mm-hmm. But it actually makes that mapping even more seamless. Like, oh yeah, I don't have to write that code. I can bootstrap the code and eventually have almost perfect code to run this, mm-hmm, and see them execute.
[00:24:15] Patrick: Mm-hmm. From the code, that helps. And so there's that loop that an agent goes through. It wasn't what we were looking for with completion or the initial chat, but with an agent you can go into a loop and say, oh, it didn't work, let me look at the error, let me run that test again. So that kind of loop now has more context on what it actually should be achieving there.
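Here's a minimal sketch of that Gherkin-to-test bridge using pytest-bdd, one of several BDD frameworks (Baruch's actual setup with Cucumber may differ). The feature text, discount rule, and step names are made up for the example; the business-readable scenario stays deterministic, while an LLM can separately be asked to generate code that makes it pass.

```python
# features/discount.feature (plain Gherkin, readable by non-developers):
#   Feature: Order discounts
#     Scenario: Large orders get a discount
#       Given an order worth 120 euros
#       When the discount is applied
#       Then the total is 108 euros

from pytest_bdd import scenarios, given, when, then, parsers

scenarios("features/discount.feature")  # binds every scenario in the file to tests

def apply_discount(total: float) -> float:
    """Toy implementation under test: 10% off orders of 100 or more."""
    return total * 0.9 if total >= 100 else total

@given(parsers.parse("an order worth {amount:d} euros"), target_fixture="order")
def order(amount):
    return {"total": float(amount)}

@when("the discount is applied")
def discount_applied(order):
    order["total"] = apply_discount(order["total"])

@then(parsers.parse("the total is {expected:d} euros"))
def check_total(order, expected):
    assert order["total"] == expected
```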
[00:24:37] Simon: Yeah. Cool, almost halfway now. Um, number four is agents to the cloud. So this is about how people aren't running agents on their local machines anymore; this is more about how they can be pushed to the cloud and then executed somewhere further away. Um, talk us through, I guess, is this something that you're seeing fairly commonly,
[00:25:08] Simon: You know, rolled out across different tooling vendors and different, uh, providers.
[00:25:13] Patrick: So the struggle you have with more of these async, longer-running threads is when you close your laptop, it goes to sleep.
[00:25:22] Simon: Mm-hmm.
[00:25:22] Patrick: You have some annoyances there. So that was one thing: I have something longer running and it will not keep running.
[00:25:30] Patrick: The second one is, if I run these agents and they do a lot of CPU work, it's hogging my machine, so that's also kind of annoying. So, you know, how do you solve that problem as well? And then the third one, which you start seeing appear more and more, is if you're doing this in parallel, then it becomes even more constrained.
[00:25:51] Patrick: Mm-hmm. Now, where do you start seeing this in a tool? Maybe the first one you could say was Amp, which was doing code indexing, moving your embeddings into the cloud. If you have large code bases and your machine doesn't have a GPU, it takes a lot of time to do all that local indexing, so they resorted to moving that to the cloud.
[00:26:20] Patrick: Now, that's a minor thing, but it's definitely helpful for bigger code bases there as well. Then you start seeing tools like Cursor, you know, trying background agents. And so they have a hosted version of that, like an execution thing. Uh, and now all the others are kinda following suit, so you have the option to either run the agent locally
[00:26:43] Patrick: or run it on a remote machine. Now, there's a bit of a challenge: how do we actually replicate your dev environment somewhere in the cloud? So the first attempt was, let me almost create a virtual, like a shadow environment from your coding environment, and we'll code on that.
[00:27:05] Patrick: So that was one of the first tries. And it becomes immensely complex, because what are all the things that have been set up? You just can't say clone it; dev environments are so specific, so it's kind of annoying. So that's where people moved more into: hey, but we have this technology called containers, we use that in our CI/CD.
[00:27:25] Patrick: So let's run these agents in those kinds of more sandboxed environments. And then obviously that became the logical thing to do, either in a local sandbox, in your container, or somewhere in the cloud. Now, that means they're mostly shifting towards following the pattern: I have an existing code commit and I wanna do an update on that commit.
[00:27:48] Patrick: Uh, and then I ask the agent to do a certain task on that commit, right? So what they do behind the scenes is they say, here's the commit, let me do a git pull from the repo, check out that commit, run my agents in the cloud, and then create a feature branch and present that back to you.
[00:28:07] Patrick: So that's kind of the flow that all these tools have been following, to make it a lot easier to have a more reliable workflow than trying to clone your desktop, which is kind of insane and would not work. Mm-hmm. Um, so it's definitely there more and more in the tools. Now, running agents is also not without issues.
[00:28:27] Patrick: Um, like, you have to do the permissions. And if you're still doing the babysitting version of yes, no, agree, that's great, but for a long-running thread that's gonna be annoying, right? So if you can control the network and which tools it has access to, you can control that way better in a sandbox environment than trying to run it on your laptop.
[00:28:51] Patrick: So those are two reasons why people are moving towards the cloud, and the major coding tools are doing this. Now, whether your whole IDE will move to the cloud or we still have an IDE, that's a different conversation, but you start seeing, I think, Cursor just released a mobile view of their editor so you can kind of code from mobile.
[00:29:13] Patrick: So you see kind of something is shifting again on how we deal with this and, and how it reports back to you when it's done. Um, in there as well.
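As a rough sketch of the checkout, run, branch, push flow Patrick describes above, here's what it could look like driven with plain git commands from Python. The repo URL, commit, and the `agent` command are placeholders, not any specific vendor's implementation.

```python
import subprocess
import tempfile

def run(cmd, cwd=None):
    """Run a command and fail loudly if it errors."""
    subprocess.run(cmd, cwd=cwd, check=True)

def run_agent_on_commit(repo_url: str, commit: str, task: str, branch: str) -> None:
    """Clone the repo at a known commit, let an agent work in isolation,
    then push the result as a feature branch for human review."""
    workdir = tempfile.mkdtemp(prefix="agent-")
    run(["git", "clone", repo_url, workdir])
    run(["git", "checkout", commit], cwd=workdir)
    run(["git", "checkout", "-b", branch], cwd=workdir)

    # Placeholder for the actual agent invocation; each tool has its own CLI.
    run(["agent", "run", "--task", task], cwd=workdir)

    run(["git", "add", "-A"], cwd=workdir)
    run(["git", "commit", "-m", f"agent: {task}"], cwd=workdir)
    run(["git", "push", "origin", branch], cwd=workdir)

if __name__ == "__main__":
    run_agent_on_commit(
        repo_url="git@github.com:example/app.git",  # placeholder repo
        commit="abc1234",                            # placeholder commit
        task="Add retry logic to the HTTP client",
        branch="agent/retry-logic",
    )
```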
[00:29:21] Simon: I didn't see the mobile. Was it the mobile Cursor view? Was that also at the AI Engineer Summit or was that more recently?
[00:29:29] Patrick: No, no. It's something that was released like last week or something.
[00:29:31] Simon: Oh. Oh wow. I didn't see it. I missed that. Wow. It, it, it's such a fast moving space, right? Yeah. It's, it's very hard to, it's very hard to catch it all. Um, very, very interesting. And, and one of the things that, that kind of like opens up, as you kind of mentioned there was, was parallel. Um, obviously if you're trying to run multiple agents in parallel, it's gonna be a real consumption hog for your, for, you know, your local machine.
[00:29:51] Simon: And that kinda leads us into your fifth learning, which is around parallel execution. And there are, I guess, a few reasons you might want to do something like this. And in your blog, you mentioned a few things. One is really around breaking a task up into subtasks and having multiple different agents work on those subtasks.
[00:30:13] Simon: And I think, I guess, somewhat, it's nicer to almost have different models, I guess, potentially working on those, depending on what the right model is for that task. For others you might want just variety. So let's have three perhaps different models, or maybe the same model running something three times in parallel.
[00:30:37] Simon: Then trying to work out, okay, which is the best one that I want to go with? And it really does open up this new world whereby, because with agentic coding we're waiting longer, we can afford to kinda say, yeah, run a few, let's try and now get the best answer. Um, and I don't mind if it takes an amount of time.
[00:31:00] Simon: Just come to me when you're done and we'll, we'll, we'll sync up what's, um. You know what, what's been created? It really opens a new world for us, right?
[00:31:09] Patrick: Yeah. Yeah. So you're right, there are two pieces. One is faster delivery, so split this up in parallel. We actually do this intuitively, you know, without agents, like, mm-hmm,
[00:31:19] Patrick: let's have five developers, let's split up the features, kind of go from there, and then, you know, they can work and merge in and be more independent of each other. Now the planning is actually done by the coding tools, and they kind of try to split this up. Now, it is challenging,
[00:31:34] Patrick: which actually, you know, I think overlaps with another topic, because they have the same problems as humans have: in the end you need to do the merging of all those parallel things. Uh, you could use things like stacking, or grouping a few of those changes together. Um, but yeah, that's the first reason why people use this, and they have some challenges.
[00:31:54] Patrick: Mm-hmm. Uh, the other is variations. And it could be things like: I wanna explore three different ways of solving this problem, right? So use this framework, use that framework, use that framework. Or: come back with three hypotheses for why the performance is bad, and explore three of them. So it could also be for performance or support reasons as well.
[00:32:18] Simon: So I like that you can almost say to the LLM that created this, okay, why is yours better than these others? So almost like arguing between each other as to which is the best. Or maybe there's some spot in the middle where actually it's maybe less performant, but there are other things that go for it, more reliable or whatever it is,
[00:32:39] Simon: whereby actually they can come to a decision based on their own discussion, their own arguments as to why each is best. It's bizarre how great this could actually turn out.
[00:32:53] Patrick: Yeah, but that's the challenge, right? So imagine, as humans, we have five people doing coding on the different tracks.
[00:33:02] Patrick: There's one kind of orchestrator or manager who decides: this is the best.
[00:33:07] Simon: Mm.
[00:33:08] Patrick: Um, now in most of the tools you'll find that they're all about creating all these variations, but actually, that puts the burden on us, you know, kind of managing which one is the best.
[00:33:21] Patrick: Now, we could maybe use some help from an LLM that says: I think, based on the input that the other agents gave me, this one is the best. Or it could help us also show different diffs of this kind of application, or give measurements. Now, I feel that this is a field we're moving into: now we have parallel execution,
[00:33:47] Patrick: now we need a way better way of doing that review, a parallel review. Now, some of the tools are starting to build a new UI in a way that is not showing one version of the application, but allows you to compare three versions of the application, kind of more in parallel, kinda rendering things side by side.
[00:34:05] Patrick: Uh, and we'll see more of that. Um, you know, I was once talking with one of our colleagues, Tom, and we came up with the idea: we often ask to do vibe coding, you know, to kind of build an app, but what if the agents actually can vibe-code an application to do the review? Hmm. Because then it becomes easier to kinda say which one is the best.
[00:34:27] Patrick: They come up with the best interface to actually check what's the best one we should select. So there's a lot of exploration to be done there. Yeah. But the burden of reviewing still stays on us. So anything we can do to improve that, reduce cognitive load, help us understand, that's probably the next phase in all this journey.
[00:34:52] Simon: Yeah. And on the developer usage, you kind of mentioned in the post as well that Solomon Hykes, the CEO of Dagger, one of the things that they showed was how you can effectively hook up an MCP server, or invoke an MCP server through your IDE, whereby if you say, oh, I want some different variations for this, it'll go away and do that from an MCP server.
[00:35:17] Simon: And actually, when you pull the changes back, each variation's already in a branch. And so you can kind of click through, have a look at them, and then merge the branch that you want to accept. So from a user point of view, it's actually very aligned with potentially the way that developers want to consume
[00:35:39] Simon: these changes. It's not some completely different UI or anything like that. It's nicely within their IDE, which is wonderful. Yeah.
[00:35:47] Patrick: And I think that's also a journey that we've seen. Uh, so last year I saw quite some tweets about, let's open up five different IDE windows and have them all code on the same code base.
[00:36:03] Patrick: That was kind of chaos, you know, obviously, as an experiment. Uh, then the next kind of phase was, well, what if we clone the directory and have them work separately and kind of merge that back in. So that was the whole idea of Git worktrees, to kinda have different
[00:36:21] Patrick: agents working next to each other. And then we said, sorry, the environments weren't isolated enough because you needed to control more, and then we put it in a container, and then ultimately we put that into the cloud as well. So that parallelization has been going on for a while. And what you see now is, when you run it in the cloud, how do you actually merge this back in?
[00:36:45] Patrick: Uh, Cursor has this; behind the scenes it's not very well described, but they call it kind of the remote Cursor protocol, which is merging from a remote server back to your local branch. And Dagger has a similar system where it actually exists in a container, but it's almost as simple as a git merge to get whatever was in the container back to your
[00:37:10] Patrick: local code there as well. So again, more learnings on how it's progressing, on all the bits and pieces: come from intent, kind of break it down into tasks, then do this parallel execution, and now we're up to the reviewing as kind of the issue, whether we like it or not.
[00:37:32] Patrick: So that's the next frontier there.
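A minimal sketch of the local version of that parallelization, using Git worktrees so each agent gets its own checkout and branch; the `agent` command is again a placeholder, and real tools add sandboxing and a review UI on top of this.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

VARIANTS = ["variant-a", "variant-b", "variant-c"]

def sh(cmd, cwd=None):
    subprocess.run(cmd, cwd=cwd, check=True)

def run_variant(name: str, task: str) -> str:
    """Give each agent its own worktree and branch so they can't trample each other."""
    path = f"../{name}"
    sh(["git", "worktree", "add", "-b", name, path])
    sh(["agent", "run", "--task", task], cwd=path)  # placeholder agent call
    sh(["git", "commit", "-am", f"{name}: {task}"], cwd=path)
    return name

if __name__ == "__main__":
    task = "Speed up the search endpoint"
    with ThreadPoolExecutor(max_workers=len(VARIANTS)) as pool:
        branches = list(pool.map(lambda name: run_variant(name, task), VARIANTS))
    # The review burden Patrick mentions still lands on a human (or a review agent):
    # diff each branch against main and pick one before merging.
    print("Branches ready for review:", branches)
```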
[00:37:33] Simon: Yeah. Very interesting. Looking forward to seeing where that goes. Let's move on to learning number six, which is CI/CD shifting left. This one baffled me. Uh, tell me how CI/CD can shift left, Patrick.
[00:37:48] Patrick: So, um, with all those agents kind of coding more, and having specifications, having examples, they do a lot more testing
[00:37:58] Patrick: than maybe what we would continuously do. So that means that whenever you're still on your local branch, you can spin up the same CI/CD and have the agents almost run that. So in that perspective, it is shifting even more left: not all the testing has to be done on the CI/CD system, we can do, you know, code analysis or other stuff now.
[00:38:22] Patrick: This was possible in the past as well, but now that we've moved into the parallel execution, we get that to a bigger extent, where in the past it was always like, yeah, the CI/CD has the real environment, it has the real setup. What if we can just take that same system and have the agents have their own CI environment and do the same thing?
[00:38:45] Patrick: So we get way faster feedback as a developer and don't have to wait until the whole merge is complete, until it kicks off. So that was the whole idea of CI/CD shifting left: faster feedback to the developer, even before merging this into one of the main systems or kind of the staging system, and so on.
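One way to picture "the agents get their own CI environment" is running the same gates the pipeline would run, locally in a throwaway container, before anything is pushed or merged. A small sketch below; the image name and check commands are placeholders for whatever your real pipeline uses.

```python
import os
import subprocess

# Placeholder: ideally the same image your CI pipeline uses, with dev dependencies baked in.
IMAGE = "ghcr.io/example/ci-image:latest"

# The same gates CI would apply after the merge, pulled forward to the local branch.
CHECKS = ["ruff check .", "pytest -q"]

def run_checks_in_container(workdir: str = ".") -> bool:
    """Run each check inside a throwaway container mounted over the working copy."""
    mount = f"{os.path.abspath(workdir)}:/app"
    for check in CHECKS:
        result = subprocess.run(
            ["docker", "run", "--rm", "-v", mount, "-w", "/app", IMAGE, "sh", "-c", check]
        )
        if result.returncode != 0:
            print(f"Shift-left gate failed: {check}")
            return False
    return True

if __name__ == "__main__":
    print("Ready to push" if run_checks_in_container() else "Fix issues before pushing")
```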
[00:39:06] Simon: Yeah. Interesting. And, um, you also mentioned that Josh Albrecht, CTO of Imbue, was talking a little bit about how specs with strict guidelines help catch these issues earlier in the value stream. And I think this is all about the kind of intent: being able to really clearly describe intent makes it easier to validate whether your implementations adhere to that intent or don't adhere to that intent.
[00:39:40] Simon: Is that fair? And from a spec point of view, is it easier for us to generate tests and really build out the tests that are important, because we are very clearly defining intent?
[00:39:53] Patrick: Yeah, I think, um, when we code traditionally, we take specifications, either from, you know, your product manager or whatever it's supposed to do, and we break this down into actual code.
[00:40:07] Patrick: Now imagine you would then run a security scanner or something on that, and you just run this on the code. The code is just an implementation as such. Imagine it would actually know what the intent or the intended behavior was; it could do a way better job at understanding what it needs to check.
[00:40:26] Patrick: Mm-hmm. So it's almost like we throw away that information, like in a game of telephone, and then assume the next agent, or our tool, just knows it. So if they can all refer to that same spec, they have a better chance of actually understanding what goes on. So, yeah. Um, it's funny, because it's actually almost like, you know, when you're making a movie: you remove all the cuts, you remove all the takes, and then you only have the last thing, and then you don't know anymore how the movie was actually made.
[00:41:00] Patrick: But when you need to do it again, you're like, what lighting did I use? Or what, like, okay. So it's very similar to throwing away metadata at the end of the pipeline.
[00:41:10] Simon: And going back to, I guess, your third learning there, when you talk about specs as the new code, one of the things that Sean was mentioning is how specifications align humans.
[00:41:20] Simon: And what we're really talking about here is, it's not just aligning humans, it's aligning agents throughout the workflow, to make sure they all have access to that core intent data and the core data that we're using throughout the creation process.
[00:41:36] Simon: Last one, Patrick. Um, how many Xs will AI deliver? And we're not talking about the funky, I'm still gonna call it new, the funky new brand name for Twitter; we're talking about how many times, how many, you know, Xs in terms of 5x, 10x, 20x productivity gains are we expecting to see.
[00:41:59] Simon: Are we really looking at the kind of 5x, 10x, 20x style claims that people are making? Or is this overblown?
[00:42:11] Patrick: So at the event, there was a researcher kind of explaining what they found actually influences that X. And it could be, if you're doing a simpler task, right,
[00:42:25] Patrick: of course the AI is better at this, and so that gives you a performance boost. So if you're doing that repetitively, you get an X that is bigger. If the task is more complex, right, it could be that it's helping, but it also requires you to tune and correct the agent way more, so it becomes a balance between how much effort I put in to help it and how much it actually helps me.
[00:42:54] Patrick: So that kind of influences that. The other point that he made is, uh, if you have a code base that has been long running, there over the years, there's different standards, different agreements, different terminology. It confuses the AI, much like it confuses humans, I guess, but that means the effectiveness goes down.
[00:43:14] Patrick: If I ask a question like, what are my coding standards? Well, you had like seven different ones over like five years. Uh, so that inconsistency actually also influences it. Obviously the bigger size too, but that seems to be overcome by having bigger machines to index the code base and kind of find the relevant pieces that need to be changed.
[00:43:36] Patrick: It is a factor, but it's maybe a lesser thing. And then the last one he mentioned was the language on which the LLMs are trained. Is it actually, you know, a more popular language? Is it a niche language? So the language definitely also influences the code generation now.
[00:43:56] Patrick: Mm-hmm. All of that is always a snapshot in time, and all these points stay valid. But then we saw that it was not just about the LLM doing one code generation; it was going into a loop, and then the loop got documentation, got specs. So there is a difference, definitely a rise in this, but it's not like one fixed X or something.
[00:44:20] Patrick: Um, but there is a certain influence, depending on your use case: you might not be getting what others are saying about their code base. If it's a small, new, modern language, kinda fresh, it's completely different from a more legacy enterprise kind of code base that you're working on.
[00:44:41] Patrick: So that was kind of his finding, where he says most of those "are you more efficient with AI" claims are very anecdotal, because they didn't specify these aspects, and then it becomes like, what are you actually comparing, what are we working on?
[00:44:57] Simon: Well, I think one of the big takeaways from this whole thing is really that, and I think you mentioned this in your closing thoughts as well, uh, and let me read from it.
[00:45:07] Simon: What is considered good practice now can change really fast and it's hard to keep up. And I think that's one of the core things, particularly in one of your learnings, which is around, you know, using an AI tool today in the same way as even six months ago.
[00:45:24] Simon: You'd actually be, you know, doing it pretty poorly, or not really using the most effective means of developing. So it's likely gonna change again in another six months. And it's really, really important to stay on track and to keep learning, because, you know, today we have AIs and agents and cloud agents and things like that.
[00:45:46] Simon: Who knows where we're gonna be in six months. Uh, any, any predictions if I put you on the spot there, Patrick?
[00:45:53] Patrick: My prediction is the reviewing is gonna get a bump, and then after that it's gonna be knowledge management. Because if we get more automated, we have to kind of keep track of all the learnings that the agents are doing and kind of spend more time on that.
[00:46:07] Patrick: I understand it's very frustrating to have to keep checking what the tools do and how they're actually improving. Uh, and a lot of people are dismissing: oh, I looked at tool X and it didn't work for me. But then maybe the next day they bumped and improved the whole version, or they did it differently.
[00:46:27] Patrick: And that's kind of annoying, and I often refer to this as the innovation tax. So you just have to pay that tax of having to try a new tool. But it's definitely different from previous waves of technology where you say, oh, it's maturing; now it's constantly changing how we work, which doesn't really help that kind of perspective, I guess.
[00:46:52] Simon: Yeah, and I think we as developers need to kinda always be open to that constant learning, and never think, you know, I'm using this today, I'm gonna stay on this for the next couple of years. It's just not possible right now. You have to absolutely keep looking and keep updating, and keep taking on the new features, the new ways of doing things.
[00:47:15] Simon: Patrick, once again, it's been absolutely amazing to chat with you. Uh, really, really appreciate all your learnings, not just on the blog, which we'll share in the show notes, but as well, on our podcast. So really, really appreciate you joining and thanks, uh, thanks again for, uh, for supporting the AI Native Dev.
[00:47:33] Patrick: Yeah, my pleasure. And also, kind of as a listener or viewer, if you have a cool story you wanna share or something that you know you learned, do feel free to reach out or put it in the comments. We're always kinda willing to see what others are doing.
[00:47:48] Simon: That's great. Absolutely.
[00:47:50] Simon: Absolutely. Thank you very much, and thanks everyone for tuning in. Really, as Patrick mentioned, interested to hear if you went to the AI Engineer World's Fair: what did you think? What were your takeaways? And thanks for tuning in as well. So till next time, bye for now.