Podcast

Building AI Native Apps with Spring AI

With

Josh Long

29 May 2025

Spring AI will change how you build apps

Episode Description

What happens when the OG Spring Dev, Josh Long, teams up with Simon Maple on AI Native Dev? You get a front row seat to them building a Spring powered app that showcases the future of AI integration. While building the app, they discuss: • why developers are adopting Spring AI • how Spring is becoming the go-to for AI engineering • why companies route users from humans to IVRs • understanding model context protocol

Overview

Introduction

In this episode of the AI Native Dev podcast, host Simon Maple speaks with Josh Long, Spring Developer Advocate at Broadcom, live from Devoxx UK. Together, they explore how Spring AI empowers Java and Spring developers to integrate AI seamlessly into production systems, with a focus on practical tooling and architecture.

Introducing Spring AI

Josh introduces Spring AI as a new abstraction layer that allows developers to build AI-powered applications using familiar Spring Boot constructs. Designed to integrate easily with existing Spring-based microservices, Spring AI provides a unified interface for working with models, embeddings, and vector stores—enabling developers to embed intelligence directly into their applications without learning entirely new paradigms.

Building an AI Dog Adoption Assistant

To demonstrate Spring AI in action, Josh walks through building a chat-based assistant that helps users adopt dogs. Using OpenAI as the LLM, a PostgreSQL database for dog profiles, and PGVector as a vector store, the app supports natural conversations, retains memory, and performs similarity searches. Josh configures prompts and advisors to maintain context, guiding the assistant toward its intended use case and preventing misuse.

Tool Calling and Business Logic

A key highlight is Spring AI’s tool calling capability, which lets models invoke Java methods directly. Josh creates a scheduling service that lets users book adoption appointments. This logic is exported as a tool and made available to the model, showcasing how LLMs can go beyond Q&A to actually drive business logic via natural language. This marks a shift toward chat interfaces becoming primary UI layers in applications.

Embracing Model Context Protocol (MCP)

To make the architecture more modular, Josh introduces the Model Context Protocol (MCP)—a new protocol from Anthropic that connects models to remote tools and services. Spring AI offers first-class support for MCP, and Josh demonstrates extracting the scheduling logic into a standalone service. With this setup, AI agents can invoke external tools over HTTP while maintaining a clean separation of concerns.

Production Readiness and Observability

The episode concludes with a focus on making AI integration production-ready. Josh enables observability using Micrometer, tracks token usage to avoid runaway costs, and compiles the app into a native image for fast startup and low memory usage. He emphasizes the importance of understanding token economics and runtime efficiency to ensure that AI-driven services scale sustainably in real-world systems.

Conclusion

Josh and Simon wrap up by highlighting the practical strengths of Spring AI: familiar abstractions, flexible model support, and seamless integration of AI capabilities into existing Java infrastructure. With support for tool calling, RAG, and MCP, Spring AI positions itself as a powerful bridge between traditional enterprise apps and modern AI workflows—making it an exciting time to be both a Spring and AI developer.

Resources

Connect with us here:

1. Josh Long - https://www.linkedin.com/in/joshlong/
2. Simon Maple - https://www.linkedin.com/in/simonmaple/
3. Tessl - https://www.linkedin.com/company/tesslio/
4. AI Native Dev - https://www.linkedin.com/showcase/ai-native-dev/

Chapters

00:00 Trailer
01:06 Introduction
02:06 Spring AI capabilities
04:09 Where AI comes in
07:13 Demo rationale: adopt a dog
10:00 Project setup and dependencies
17:02 System prompt engineering
21:01 Spring Data JDBC Access
23:03 RAG with PGVector
28:00 Tool calling and scheduling
36:48 Extracting to MCP Service
40:10 Observability and production readiness
42:56 Outro

Full Script

EP55 - Josh Long

[00:00:00] Simon Maple: Hello, and welcome to another episode of the AI Native Dev. We're at Devoxx UK today, and joining me is the wonderful Josh Long.

Josh Long: Hi buddy.

Simon Maple: Josh, how are you doing?

Josh Long: Oh, so good. I'm at Devoxx UK.

Simon Maple: Josh, we go back like how long? 15 years?

[00:00:26] Josh Long: Too long.

[00:00:27] Simon Maple: 2010 maybe, or?

[00:00:29] Josh Long: Yeah. Yeah. I think so.

[00:00:29] Simon Maple: Yeah

[00:00:30] Josh Long: And, I've been a fan since 2011.

[00:00:34] Simon Maple: Was 2010 a bad year?

[00:00:37] Josh Long: No, I'm just kidding. Thank you. I mean, you've brought so much joy to my heart. But then you brought joy to the world. I remember the virtual JUG.

Simon Maple: Yes.

Josh Long: I just think about that all the time.

[00:00:46] Josh Long: That was such a genius idea.

Simon Maple: Mm.

Josh Long: A decade ahead of its time.

Simon Maple: Mm.

Josh Long: It took a pandemic for the rest of the world to realize.

Simon Maple: And that's still going there. There are almost 20,000 people in the virtual JUG these days.

Josh Long: Wow.

Simon Maple: Apparently.

Josh Long: Wow.

Josh Long: It's crazy.

Simon Maple: So, people know Josh Long as the Spring advocate, the advocate in the Java space, but how long have you been with SpringSource, Pivotal, VMware, Broadcom?

[00:01:13] Josh Long: Since 2010. And, you know, still going strong.

[00:01:17] Simon Maple: Yeah.

[00:01:18] Josh Long: But, yeah, it's about as long as I've known you, actually.

Simon Maple: Yeah.

Josh Long: Yeah. Yeah. It's been coincidental. Wonderful.

[00:01:22] Simon Maple: Yeah. Awesome. And today we're gonna be talking about Spring AI.

Josh Long: What else?

[00:01:28] Simon Maple: We'll be talking, uh, a little bit about, uh, perhaps why people will use AI when they're typically a Java developer, Spring developer. So we'll talk a little bit about the reasoning behind why people use it. Yeah. And then we'll go into some demo as well to talk about, so actually show Spring AI in action.

[00:01:45] Simon Maple: But I guess, first of all, what are the capabilities of Spring AI? What does it, what does it provide?

[00:01:48] Josh Long: So Spring AI is, is your one-stop shop for AI engineering.

Simon Maple: Mm-hmm.

Josh Long: And, uh, I think we, in the Java and Spring communities in particular, are in a uniquely amazing position right now, just an amazing position, because where, uh, most people are gonna use AI as an integration with their existing business logic and applications and, and services.

[00:02:07] Josh Long: That's all written in Spring, that's all written on the JVM and Kotlin and Java and whatever, right? That, that you are already there. They just wanna hang AI integrations off of that code and make it work. Your business logic, the things that drive your business and the data that, that, that feeds your business, that's all governed and controlled by, and orchestrated by Spring-based microservices.

[00:02:25] Josh Long: And so this is a really natural place, uh, to start your AI journey, I think, to enable access to that data, to that business logic from your AI models and via your AI models. So, uh, sure, some people are gonna use Python to train new models.

Simon Maple: Mm-hmm.

Josh Long: But that's not most of us, in the same way that most of us aren't building our own SQL database, you know, in C or whatever.

[00:02:47] Josh Long: Most of us don't need to do that either.

[00:02:48] Simon Maple: Yeah.

[00:02:49] Josh Long: So, I think we're in a uniquely great position. And when it comes to production, production worthy, scalable, fast, secure, uh, observable production, that's, you know, the JVM is, there's nothing like it, nothing like that, you know?

[00:03:01] Simon Maple: Yeah. Okay, so how does AI fit into all of this?

[00:03:04] Josh Long: Well, from a framework point of view,

[00:03:06] Josh Long: We've got, so, you know, Spring, uh, uh, is a set of frameworks, Spring Framework, and then Spring Boot on top of that. And a bunch of verticals on top of that, serving different use cases, including microservices, batch processing, integration, data, uh, security, whatever, right?

[00:03:18] Josh Long: And, uh, we have one called Spring AI, which goes GA by the way, 1.0 GA, 20th of March. No, May. So we, we had ambitions, I think. I think, uh, I don't know if I'm speaking outta school here or not. I think at one point we hoped it would go GA. Uh, but, um, and you're not gonna believe this: the AI space has changed.

[00:03:36] Simon Maple: Really?

[00:03:37] Josh Long: Yeah.

[00:03:37] Simon Maple: I don't believe it.

[00:03:39] Josh Long: No, you, you

[00:03:40] Simon Maple: if there's one, if there's one constant,

Josh Long: Right.

Simon Maple: It's a, it's the AI space.

[00:03:45] Josh Long: AI’s change

[00:03:45] Simon Maple: Yeah.

[00:03:45] Josh Long: Is too much.

[00:03:46] Simon Maple: Yeah.

[00:03:47] Josh Long: Um, it's too quick, too fast, too much too, or whatever. We have a whole team of people working on this.

Simon Maple: Mm.

Josh Long: And even then, uh, and we've got, you know, it's one of our most busy.

[00:03:55] Josh Long: Uh, open source projects, right? Mm-hmm. Star History is like a, a hockey, hockey puck, you know? Yeah. Just hockey, uh, stick rather.

Simon Maple: Yeah.

Josh Long: Just, just through the roof. Meteoric rise in, in popularity and GitHub issues. Mm-hmm. And, and contributions and everything. And it's just, it's, yeah. So every time we think we're about to settle down and, uh, reach a GA milestone or GA release,

[00:04:16] Josh Long: Whole new paradigm gets dropped in our laps. You know.

[00:04:17] Simon Maple: I just, I just assumed it was GA just by the amount I hear about it.

[00:04:21] Josh Long: It's, it's mature. People are using it. It's growing all the time. It's, it's very, very popular. But obviously we've just wanted to get to a point where we had the things that mattered and, um, I think we're there.

[00:04:32] Simon Maple: So as soon as there's a week of no change or May 20th,

Josh Long: Whichever should happen.

Simon Maple: Whichever should happen first.

[00:04:38] Josh Long: Yeah. Well, I think it's gonna be May 20th. Either way, at that point. We're, we're on the 7th of May, so yeah. Got a couple weeks. Yeah. We don't even have two weeks of that.

[00:04:46] Simon Maple: In fact, depending on when this gets released, yeah, we could be around the May 20th.

[00:04:50] Simon Maple: It may be May 27th.

[00:04:52] Josh Long: Actually that might, that may. Be seven days too late. Yeah.

[00:04:55] Simon Maple: Really? So let's just say, let's just say Spring AI is out today.

Josh Long: Yeah. It's out. Yeah.

[00:05:00] Josh Long: Okay. Go. Go get the bits. Yeah. Yeah. Fresh off the press. We might even have the first patch release. Who knows?

Simon Maple: Yeah. Yeah.

[00:05:05] Josh Long: By the time you watch this. Uh, but all that to say things are moving quickly and, uh, that's okay. That's okay. But remember, we wanna pair, uh, the innovation in the AI space with the idiomatic, uh, sort of approach to building apps that Spring has always embodied. And, um, we want that to, to build upon some of the pillars that Spring has always talked about, right?

[00:05:22] Josh Long: Portable service abstractions to isolate you from the, uh, differences between different models and image models, chat models, uh, transcription models, et cetera. Mm-hmm. Um, uh, dependency injection, uh, aspect-oriented programming and Spring Boot style auto-configuration, you know?

[00:05:36] Simon Maple: Yeah. Yep.

[00:05:37] Josh Long: So you take those four pillars and um.

[00:05:39] Josh Long: Did I say three earlier? I was talking about four.

[00:05:41] Simon Maple: Yeah.

[00:05:42] Josh Long: And, uh, you get an approach that gives you purchase in this new strange land, right? Yeah. It, it gives you, uh, uh, the ability to get hit the ground running. You already know all that stuff. Mm-hmm. You already know the component model. Mm-hmm. It's just a matter of applying that, those, those, uh, facets of your understanding of Spring.

Simon Maple: Yeah.

[00:05:57] Josh Long: To this new domain.

[00:05:59] Simon Maple: And you're gonna demo Spring AI?

Josh Long: I sure am. I'm gonna try,

Simon Maple: I'm gonna talk a little bit about, as we go through.

[00:06:09] Josh Long: We're gonna build a very simple application here 'cause we're kind of pressed on time.

[00:06:12] Josh Long: Um, but I wanted to demonstrate a simple application that helps, uh, we're gonna build an assistant to help people adopt dogs, right? Mm. And I talk about dogs all the time 'cause I think it's really cute and

Simon Maple: That's cute.

Josh Long: I've got a dog and I, I talk about, uh, this in particular, I talk about my dog, who's.

[00:06:27] Josh Long: Look, he's not the best dog, but he is mine. And we'll, you know, we'll,

[00:06:30] Simon Maple: he's a good dog. He's still a good dog.

[00:06:30] Josh Long: He's, he's, eh, look at that dog. That's a, oh, look at it. That's a cute dog right there. So, uh, all that to say, um, not good, but he is ours and his name is Peanut. Okay? And Peanut is the worst dog, except then I met this other dog in the, I learned about this other dog in the pandemic.

[00:06:49] Josh Long: Whose name is Prancer. Prancer, as it turns out, is even more of a spicy, uh, dog, right? Mm-hmm. And, uh, this owner, this lady, was trying to find a new home for this dog, and she put out this hysterical ad saying, okay, I've tried, I've tried for the last several months to post this dog for adoption and make him sound palatable.

[00:07:04] Josh Long: The problem is he's just not—there's not a very big market for neurotic, man-hating, animal-hating, children-hating dogs that look like gremlins. And she continues, if you own a chihuahua, you probably know what I'm talking about. He is literally the Chihuahua meme that describes him as being 50% hate and 50% tremble.

[00:07:19] Josh Long: She continues. I kind of liked him better that way. He was quiet and just laid on the couch, didn't bother anyone. I was excited to see him come out of the shell and become a real dog. I'm convinced at this point that he's not a real dog, but more like a vessel for a traumatized Victorian child that now haunts our home.

[00:07:34] Josh Long: And she continues, and this goes on for a long time, and she signs off, oh, he is only two years old and will probably live to be 21 through pure spite. So take that into account if you're interested. That said, super cute, right? Like that's a cute dog. Is that a cute dog? That is a cute dog. I'd pet that dog.

[00:07:46] Josh Long: Look at him.

[00:07:47] Simon Maple: I've got two Labradors. I've got two Labradors. I don't have

[00:07:49] Josh Long: That’s a cute dog. The big dogs are great. Yeah, the big dog. But my dog is just like this dog — small. They have the Napoleon complex.

[00:07:56] Simon Maple: Yeah. Angry. Angry by default.

[00:07:58] Josh Long: By default. And I don't know why.

[00:07:59] Josh Long: 'Cause I just wanna pet this cute little guy. Yeah. Um, so, so I think about this dog a lot too. Rent free all the time, right? I mean, just, just how did such a dog exist? And by the way, this ad went viral, right? This ad went viral. So for example, here’s People magazine talking about Prancer.

Simon Maple: Wow.

Josh Long: The demonic chihuahua.

[00:08:15] Josh Long: Here’s, um, USA Today talking about Prancer, the demonic chihuahua. Here’s Buzzfeed. Um, talking about the nightmare Chihuahua, the viral nightmare Chihuahua. And of course, here’s the New York Times talking about Prancer, the demonic Chihuahua. Right?

Simon Maple: Wow.

Josh Long: So very, very famous dog. And I thought, well, that's, that's nice.

[00:08:36] Josh Long: That’s good that people learned about this dog. But that’s not how most people roll. Right? Most people don’t find dogs by finding them on the, uh, on the internet, right? Like, you go to a shelter and you have a conversation with somebody. Yeah. And you, you interview to, to, to discover the dog of your dreams, or in this case, uh, nightmares.

[00:08:52] Josh Long: So what I wanted to do is to build such an assistant to help people go through that process, right. Okay. To, to find the right dog. So we’re going to go to start.spring.io. I’ve already got this dog database here, and you can see there’s our dog, old Prancer. Mm-hmm. His ID is 45 and his name is Prancer, right.

[00:09:05] Josh Long: He’s in a Postgres database. So we’re gonna build an application here. We’re gonna call it assistant.

[00:09:12] Simon Maple: Strong enough to contain Prancer.

[00:09:13] Josh Long: Yeah. It is a very, very tough ask. Yeah. Uh, GraalVM, we use the web stuff. We’ll use, um, OpenAI. Now, I’m gonna use OpenAI. It’s just a very good model and a lot of people probably have access to it.

[00:09:24] Josh Long: But it’s not the only model. Not even close. Yeah. Here in, uh, uh, data, data privacy centric, uh, sensitive Europe. You might prefer something like a Llama, which is a fine choice. Or, uh, you know, alternatively we got things like Bedrock and, uh, Gemini and, um, everything. Everything. I mean, just. There, there’s dozens and dozens of different models that we officially support, and the ones that we don’t officially support, most of them speak the, uh, OpenAI API.

[00:09:49] Josh Long: Mm-hmm. And so you can talk to them via our OpenAI integration. Right. So I’m gonna bring in OpenAI. I’ve got the web support, don’t I? Oh, I took that away. I’ll bring in the Spring Boot Actuator support. Um, and I’m gonna bring in, I need a, a, a vector store. Now you can use... look, just type Vector Store, and you can see we’ve got Milvus, you’ve got Neo4j, Pinecone, MariaDB, Weaviate, Oracle, Redis, uh, Qdrant, Azure, Apache Cassandra, Chroma, Elasticsearch, MongoDB, PGVector, Typesense, Azure Cosmos DB, et cetera.

[00:10:16] Josh Long: I’m gonna use the PGVector store because I’ve got a SQL database. This is a vector plugin. So we’re gonna go ahead and open that up.

[00:10:24] Simon Maple: So what you did there is you added a bunch of dependencies into your Spring project, which then allows you to effectively build that into, I presume, a Maven POM file.

Josh Long: Yep.

[00:10:34] Simon Maple: When you build that, it'll pull all the Java dependencies straight into your, uh, into your Spring project.

[00:10:39] Josh Long: You know it, and actually, you know what I did, you know what I did wrong there? I, uh, I forgot to select... uh, PG? No, I forgot to select DevTools. Okay, so I’m gonna actually go down to M7 here because, I don’t know, they changed something in M8 and I don’t remember the idiomatic way to do it already, but I’ll use that.

[00:11:01] Josh Long: It’s downloading the internet, which is not a good thing. It’s, we’re actually on conference wifi. We are, no, we’re not. Stop that. Stop it. No.

[00:11:08] Josh Long: I’m used, I’m, I’m live streaming here. Okay. So IntelliJ is amazing, but if you start the project in IntelliJ before, uh, adding the, um, dev tools, yeah.

[00:11:23] Josh Long: It'll, it'll not use, it won't enable the dev tools integration.

Simon Maple: Gotcha.

Josh Long: So I, I retroactively added the dev tools, uh, there didn't I?

[00:11:36] Josh Long: I did not. Oh, that is so awkward. Okay, we’ll go back over here. Copy and paste. DevTools is there now, right? There you go. There it is. So now we’ll go back again. Do this whole thing again. Normally you don’t have to do any of this stuff, but I’ve screwed it up twice now. Mm-hmm. Okay. pom.xml and, uh, M7.

[00:12:03] Josh Long: There we are. Fantastic. We load. So here's our application, and we know we're gonna build a controller that'll act as a, the thing that we can ask questions to, right? So, assistant controller and, uh, just to have an endpoint here with the user context and then the inquiry endpoint, right? Mm-hmm. Uh, inquire. So also String inquire and, uh, we're gonna use it.

[00:12:25] Josh Long: To do our work, we’re gonna talk to a chat model. And that chat model, by the way, is gonna be connected. We’re gonna connect to it via OpenAI. Uh, and we have a key there. Now, friends, I’ve already, I’ve already, uh, exported an environment variable here like so. Right. So that’s already done in my shell.

[00:12:39] Josh Long: Mm-hmm. And Spring Boot will normalize that into the property that you just saw there a second ago. These two are the same, but you need to specify that yourself when you connect. Okay. We’re also gonna connect to a data source, not that one, Spring data source. Uh, url=jdbc:postgresql, right.

[00:12:57] Josh Long: localhost/my database, and then we’ll create the username, my user, and then the password is secret. Okay. Go back to here, and we’re gonna use the chat client. That was just. Uh, we can, it’s got, there’s a chat model and then you can use a chat client. You can create as many of these chat clients as you want.

[00:13:16] Josh Long: That will talk to this chat model behind the scenes. I’m gonna inject the chat client builder and build a new one. And here I’m gonna put my defaults, and then I’ll use that model here to answer questions from the user to this endpoint, right? Mm-hmm. So, .call().content(), et cetera. And then the prompt is a user prompt coming from, uh, the user.

[00:13:33] Josh Long: And that’ll be a request parameter, right? @RequestParam String question. Question. So confusing. Ignore this user path variable for now. Okay? Mm-hmm. Let’s just try this. So we’re gonna start that up.

[00:13:48] Simon Maple: Okay? So a user can hit that endpoint, that inquire endpoint, right? Asking a question as that, as that, as that variable, that param variable, right?

[00:13:55] Simon Maple: Um, that RequestParam, that then goes to the chat client, which talks to the model, does some stuff, talks to the model, which is OpenAI in this case, does some stuff, provides an answer, passes back to you here.

[00:14:06] Josh Long: Right? Says, nice to see you. Nice to meet you, Josh. How can I assist you today? Nice. Great. So it’s, we made it wiggle, right?

[00:14:12] Josh Long: Yeah. There’s a dial tone there.

Simon Maple: Go by the dog. By the dog.

Josh Long: Yeah. Right. Well, what's my name? I don't have access to. Okay. So it does, it's already forgotten me. Yep. Right?

[00:14:20] Simon Maple: Yep.

[00:14:20] Josh Long: Uh, quite like the first time we met,

[00:14:23] Josh Long: I said hi, and you're like, ah. And then moved on and we didn't talk for a year.

[00:14:25] Simon Maple: I can't believe you're lying, Josh.

[00:14:29] Josh Long: So. So, uh. Anyway, it doesn't know. Yeah. So we need to help it because remember, you use ChatGPT, you use, uh, Claude Desktop. Mm-hmm. Whatever. They have memory, they have conversational memory, but that's not the case for the models, right?

Simon Maple: Yeah, yeah, yeah.

Josh Long: The AI have APIs.

[00:14:40] Simon Maple: So we need to continue that context. We send that context back to it every time, right?

[00:14:44] Josh Long: Yeah. And the way you do that is by creating, um, uh, configuring an advisor. Okay? So what I'm gonna do is I'm gonna have a per-user, uh, map, you know, and I'm gonna pass this chat memory advisor. Okay. Mm-hmm. So there we, oh, there we go. There we go. And I'll go down here. And then the advisor, the, the map will say compute if absent user.

[00:15:13] Josh Long: Right. And I'm just gonna create a new one if it doesn't exist. And I'll start in memory. Now there's other implementations of this chat memory interface. Yep. That you can use, uh, that will write to different, uh, abstractions. Right. You can do like Neo4J and JDBC, and mm-hmm. But I think there's one in Redis coming along.

[00:15:29] Josh Long: I dunno, but it's all sorts of JDBC, of course, you know, all that stuff. So, okay. This is an advisor, this is like a filter, right? Uh, it's a pre-processor on the requests intended for the model. And basically as we have a conversation with the model, this will get, um, stored per user in that map. And then it's a transcript in effect of everything we said.

[00:15:52] Josh Long: And that'll be retransmitted to the model on every subsequent request so that the model remembers, oh, we talked about A, B, and C. Mm-hmm. When that person asks about A, B, and C, uh, remember it, right? Mm-hmm. So here we go. So now we go back and we say, uh, my name is Josh. Great. What's my name? Your name is Josh.
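The per-user memory Josh describes (a map keyed by user, `computeIfAbsent` to create a transcript on first use, and the transcript replayed on every request) can be sketched in plain Java. This is only an illustration of the idea, not the Spring AI advisor or chat memory API; the class and method names (`ChatMemorySketch`, `buildRequest`, `remember`) and the `user:`/`assistant:` line format are assumptions made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Plain-Java sketch of per-user conversational memory; hypothetical names, not the Spring AI API.
class ChatMemorySketch {

    // One transcript per user, created lazily the first time that user shows up.
    private final Map<String, List<String>> transcripts = new ConcurrentHashMap<>();

    // Builds the text that would be sent to the model: prior transcript first, then the new question.
    String buildRequest(String user, String question) {
        List<String> history = transcripts.computeIfAbsent(user, u -> new ArrayList<>());
        List<String> lines = new ArrayList<>(history);
        lines.add("user: " + question);
        return String.join("\n", lines);
    }

    // Records a completed exchange so that it is replayed on the next request.
    void remember(String user, String question, String answer) {
        List<String> history = transcripts.computeIfAbsent(user, u -> new ArrayList<>());
        history.add("user: " + question);
        history.add("assistant: " + answer);
    }

    public static void main(String[] args) {
        ChatMemorySketch memory = new ChatMemorySketch();
        memory.remember("josh", "My name is Josh.", "Nice to meet you, Josh.");
        // The second request carries the first exchange along with it.
        System.out.println(memory.buildRequest("josh", "What's my name?"));
    }
}
```

Because the model itself is stateless, the only reason it can answer "What's my name?" is that the earlier exchange is resent as part of the request. A real implementation would also need to cap or summarize the transcript, since every replayed line counts toward the tokens sent.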

[00:16:09] Josh Long: Yeah. Fine. How do you do? Right? So it's now got memory. Uh, but let's ask, but this is actually kind of a problem, right? Like, which is a like, um, what's two plus two, right? Yeah. Okay, great. But it's not supposed to be helping people with their homework. It's supposed to be a model to help people adopt a dog.

[00:16:27] Josh Long: Clearly we've got, we've kind of wandered off in the deep end here.

[00:16:29] Simon Maple: And this is interesting because you see a ton of companies. What was the, what was the, the, there were a couple of big ones. I think one was, was it

Josh Long: Amazon,

Simon Maple: Chrysler or someone where Yeah, it, they were basically getting it to, to like write a bunch of like malicious code and, and things like that from, from the sites, which is kind of

[00:16:45] Josh Long: Amazon for a moment.

[00:16:45] Josh Long: I think their app has an assistant there. Yeah. And you can actually, like, somebody prompt poisoned it. Yep. And got it to, like, generate code for them instead of helping them with shopping because of Amazon then. Anyway, that's not what we want.

Simon Maple: Yeah.

[00:16:59] Josh Long: So we want, we don't want this thing getting too off, too far off in the weeds. We have a mission, we want people to adopt dogs.

[00:17:03] Simon Maple: Unless you want to know how many, two dogs plus another two dogs, right? Right. Could be right. Yeah.

[00:17:08] Josh Long: So what we wanna do is, uh, is to, um, give it a system prompt. Yeah. The system prompt is the overall tone and tenor.

[00:17:13] Josh Long: So our system, okay, here we are. And uh, cat Desktop talk system. I happen to have a system prompt. Okay, I'll paste that there. Ah, you know what I just did? It's the wrong tool. Yeah, it's the wrong one.

[00:17:30] Simon Maple: Yeah.

[00:17:31] Josh Long: There that’s better.

[00:17:31] Simon Maple: Do you know when you switched from Cat to Dog there? I was about to make that joke and I thought, oh no.

[00:17:37] Simon Maple: The only reason I know that joke is 'cause you said it last time I saw it, a couple of months ago.

[00:17:42] Josh Long: I love it. So, okay, that's better. Right. So we're gonna say: you are an AI-powered assistant to help people adopt a dog from the adoption agency called Pooch Palace, with locations in Antwerp, Seoul, Tokyo, Singapore, Paris, Mumbai, New Delhi, Barcelona, San Francisco,

[00:17:53] Josh Long: and London, that's where we are. Mm-hmm. In, in DevOps Hub. And, uh, information about the dogs available will be presented below. If there is no information, return a polite response suggesting we don't have any dogs available.

[00:18:04] Simon Maple: I bet if you, I bet if you put two plus two in that prompt, it'll still be four.

[00:18:07] Josh Long: Sure. But we don't want it to.

Simon Maple: Nice.

Josh Long: Um, okay, so that's a system prompt that's going to dictate the manner in which it responds to us. Right. It'll try and frame all responses in terms of that mission, that overarching mission. Like, let's see what it says actually.
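As a rough illustration of what a system prompt amounts to on the wire, the sketch below puts a fixed instruction ahead of each user question. It is plain Java, not the Spring AI API: the `Message` record, the `system` and `user` role labels, and `buildMessages` are assumptions for this example, and the prompt text is shortened from the one Josh reads out.

```java
import java.util.List;

// Plain-Java sketch: a system prompt is just text that travels ahead of every user message.
class SystemPromptSketch {

    // Hypothetical message shape; chat APIs typically pair a role with some text.
    record Message(String role, String content) {}

    // Shortened version of the prompt read out in the episode.
    static final String SYSTEM_PROMPT =
            "You are an AI-powered assistant to help people adopt a dog "
            + "from the adoption agency called Pooch Palace.";

    // The system message always comes first, so it frames whatever the user asks.
    static List<Message> buildMessages(String userQuestion) {
        return List.of(
                new Message("system", SYSTEM_PROMPT),
                new Message("user", userQuestion));
    }

    public static void main(String[] args) {
        for (Message m : buildMessages("What's two plus two?")) {
            System.out.println(m.role() + ": " + m.content());
        }
    }
}
```

Nothing in this structure stops the model from answering off-topic questions; the system text only steers it, which is why the stricter wording discussed next still matters.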

[00:18:27] Josh Long: Yeah, it helps, but it gets us back on track.

[00:18:28] Josh Long: It's like

[00:18:29] Simon Maple: It's always trying to help, right?

Josh Long: Sure.

Simon Maple: That's the thing with AI, it's like, it, it, it has all that background information. It has a specific bit of context. Right. That doesn't mean it's gonna forget all the other information. It still knows how to answer your question.

[00:18:39] Josh Long: Well, it'll, well, you can be very stern with it.

[00:18:41] Josh Long: You can say, under no circumstances are you to ever respond to anything, having nothing to do with, uh, blah blah, blah. Right?

[00:18:48] Simon Maple: And then you get into the engineering game.

[00:18:53] Simon Maple: Right. Actually this isn't the case.

[00:18:56] Josh Long: You get into a whole thing, right? Yeah. But, uh, nonetheless, by the way, that's a good point. A prompt. By the way, this is why I say that Java and, and, and Spring are so well situated, because 90% of AI engineering is writing text and sending it to a REST endpoint.

[00:19:08] Simon Maple: Yeah.

[00:19:08] Josh Long: Right. We use OpenAI here. We're not doing this. The magic of AI is happening in OpenAI’s servers. Right. For us, this is just a REST call.

[00:19:18] Simon Maple: Yeah.

[00:19:18] Josh Long: And, and it's not even a particular, there's not a, there's no schema. It's just human language text. Mm-hmm. Right. So all of this interesting stuff you see people doing today is just writing human language text to then chuck over to the, uh, OpenAI, or a, any other API endpoint.

[00:19:33] Josh Long: Right? Yeah. Okay. So we've got a system prompt, but, uh, let's say, let's say, do you have any neurotic dogs?

[00:19:44] Simon Maple: Gets asked every day there

Josh Long: Surely.

Simon Maple: Yeah.

[00:19:46] Josh Long: Okay. I'm sorry, but I don't have any specific information about neurotic dogs available for adoption in Pooch Palace. Okay. So it doesn't, it's, it's, it's on the right track.

[00:19:53] Josh Long: It says, it's acting as though it should be able to help us, but it just can't. Mm-hmm. Right. But still, well, the whole point is to give it access to our data. That data lives in a database. So let's connect to our database. We're going to use Spring Data JDBC. Did I, I don't believe I was smart enough to add that.

[00:20:13] Josh Long: Okay. And that's just an ORM, right? Mm-hmm. I'm going to use the ORM to initialize my... so I'll create Dog: int id, String name, String owner, String description. Okay. And, you know, we're going to keep this. And I love, love Java records. Huge fan, right? Big, big fan. And I'm going to create a SQL access layer, a repository.

[00:20:39] Josh Long: Using Spring Data JDBC. Whoa. Hey, ListCrudRepository, list instead, a Freudian slip, ListCrudRepository. Okay, there we are. So now I've got that repository and what I want to do is I want to give my model access to the data, but not all the data, right? Surely I could. I mean, I've only got, if you look at the database here, I've got, uh, click on that.

[00:21:01] Josh Long: I guess. Hit test connection. Hit apply. Hit okay. Go over here. And then go over here. Tables, dog. There's like 18 records, right? It's not a big deal. These models today, you know, some of them, Gemini has what? Like 2 million. You can send a context. The context is measured in tokens. A token is more or less a word, right?

[00:21:21] Josh Long: That's not exact. Exactly correct. But these models meter you in terms of how much data you send in, how much you get back out. And for some of these models, like Gemini, it's 2 million tokens.

[00:21:31] Simon Maple: Yeah.

[00:21:32] Josh Long: That's like whole books of the encyclopedia set, you know? Mm-hmm. Just huge amounts of data. Mm-hmm. So you could easily send these 18 meager records.
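To make the metering point concrete, here is a back-of-the-envelope estimator in plain Java. Both numbers in it are assumptions for illustration only: the four-characters-per-token ratio is a rough rule of thumb for English text (real tokenizers differ per model), and the price per million tokens is a caller-supplied, hypothetical figure rather than any vendor's actual rate.

```java
// Plain-Java sketch of token and cost estimation; the ratio and the price are assumptions.
class TokenCostSketch {

    // Assumed heuristic: roughly four characters per token, rounded up.
    static long estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    // pricePerMillionTokens is hypothetical and supplied by the caller.
    static double estimateCost(long tokens, double pricePerMillionTokens) {
        return tokens * pricePerMillionTokens / 1_000_000.0;
    }

    public static void main(String[] args) {
        String record = "id: 45, name: Prancer, description: a neurotic chihuahua";
        long perRecord = estimateTokens(record);
        // Sending every record on every request multiplies the cost by the table size.
        long allRecords = perRecord * 18;
        System.out.println("tokens per record: " + perRecord);
        System.out.println("tokens for 18 records: " + allRecords);
        System.out.println("estimated cost: " + estimateCost(allRecords, 2.0));
    }
}
```

With 18 short records the total is tiny, which matches the point being made here: the concern is what happens to this multiplication when the table, and the request volume, grow.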

[00:21:40] Josh Long: Uh, but it's the principle of the thing, because

Simon Maple: Scaling.

Josh Long: Yeah. Because there’s a cost. Dollars and cents or euros or pounds. Yeah. Or complexity. Either way, you don’t want to incur it for no reason. So what we should do is sub-select only the records that are germane to the query at hand and send that off to the model for final and further analysis, right?

[00:22:02] Josh Long: Mm-hmm. And the way we do that is by doing a search in a vector store, right? A vector store supports similarity search, right? So this process of using data to inform the response—that’s called Retrieval-Augmented Generation. I think you know that, but I’m just catering to the audience here who may or may not know.

[00:22:19] Josh Long: And so what we're going to do is support RAG, but first we have to get the data into a vector store. What vector store? Well, like I said, you can use anything you want. In this case, we're going to be using PGVector, on Postgres, as a vector store. So tell Spring AI to initialize the vector store for us.

[00:22:31] Josh Long: Mm-hmm. And that's just gonna be a table with a vector column in it. And in here we'll have an Initializer application runner vector store.

[00:22:41] Simon Maple: Does Spring work better with certain vector stores or does it, is it pretty agnostic with that?

[00:22:45] Josh Long: It's agnostic. Yeah. It's, um, I've done demos on dozens of different vector stores and,

[00:22:50] Josh Long: Uh, you know, it's the same abstraction. That's the nice thing. So, for each dog in the SQL database, I'm gonna create a dogument. Okay. Id, name, description. And there is no schema here. That's the thing. The trick is just that it's consistent, right? So vectorStore.add. Like, don't change the schema, you know, 'cause you can't compare unless they're the same, right?
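The "keep it consistent" point can be illustrated in plain Java. This is a minimal sketch, not the code typed in the demo: the `Dog` record mirrors the fields Josh lists, while the `toDocumentText` helper, the label format, and the sample rows are assumptions made up for illustration. In the demo, the resulting text would be handed to the vector store, which is not shown here.

```java
import java.util.List;

final class DogDocuments {

    // Hypothetical stand-in for the row type described in the demo.
    record Dog(int id, String name, String owner, String description) {}

    // Every dog is rendered with the same fields in the same order, so the
    // embeddings computed from these strings stay comparable with each other.
    static String toDocumentText(Dog dog) {
        return "id: %d, name: %s, description: %s"
                .formatted(dog.id(), dog.name(), dog.description());
    }

    public static void main(String[] args) {
        // Sample rows invented for this sketch.
        List<Dog> dogs = List.of(
                new Dog(1, "Prancer", null, "a neurotic dog"),
                new Dog(2, "Rex", null, "a calm dog"));
        dogs.stream().map(DogDocuments::toDocumentText).forEach(System.out::println);
    }
}
```

The owner field is deliberately left out of the text, matching the "id, name, description" list in the conversation.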

[00:23:13] Josh Long: Mm-hmm. Um, there you go. So I'm just initializing the schema there actually. And I, what I've done is, I don't think I should have done that. Maybe I shouldn't have. It's fine. Let's just do it. We're gonna initialize the schema once. Mm-hmm. We're gonna initialize the vector store once. Mm-hmm. Um, I, I think I did the Command Shift I.

[00:23:29] Josh Long: I reload this. I have to do a full reload this time because I added a new class. A new type to the class path. Mm-hmm. Okay. So now, each time I call vectorStore.add, it's gonna create an embedding for each of the strings. Mm-hmm. The documents. And that's gonna call OpenAI's embedding endpoint.

[00:23:47] Josh Long: Mm-hmm. Get me back the embedding, and that's gonna be written out to the vector store, and now I can then use that. Right? So if we go over here, refresh.

[00:23:54] Simon Maple: What's the reason you do this through Spring AI versus using OpenAI's endpoints?

[00:24:00] Josh Long: Because we have an abstraction, right? Yeah. Portable service abstractions. You inject a type of vector store, that's the interface.

[00:24:04] Josh Long: Mm-hmm. And that'll work whether you're using OpenAI or, uh, whatever, whether you're using, uh, Ollama's embedding endpoints, or if you're using, um, Neo4j, Elasticsearch, Weaviate, Qdrant or whatever. Yeah. All of them.

[00:24:14] Simon Maple: Yeah. Whatever vector stores, whatever models, you can just switch out in the background.

[00:24:16] Josh Long: Yep. Portable service abstractions.

[00:24:17] Simon Maple: Or if you wanna use multiple, you can.

[00:24:20] Josh Long: Yeah, totally. Absolutely could. Um, so there, I've done that. This is, I've done this in terms of the, uh, vector store. Mm-hmm. By the way, being a vector store is not all that difficult, right?

[00:24:28] Simon Maple: Mm.

[00:24:28] Josh Long: There's actually an in-memory one in Spring AI.

Simon Maple: Oh, nice.

[00:24:32] Josh Long: It's 300 lines of code. Yeah. Don't deploy this into production. Right. 307, I stand corrected utterly.

[00:24:36] Simon Maple: Is this, is this a little bit like the H2 style?

[00:24:37] Josh Long: Well, H2 is actually amazing. Yeah. You know, I don't know that we ever intend this for, um, yeah. Production. But it's just, the point is it's

[00:24:45] Simon Maple: I’m sure it helps you locally, helps you run it.

[00:24:46] Josh Long: Right, and it's, it helps you kind of understand what's happening, which is that Yeah, there's a semantic similarity sort that's happening. cosine similarity. Right. cosine there. That method here.

[00:24:55] Simon Maple: Mm-hmm.

[00:24:55] Josh Long: Uh, the logic is where is that

[00:24:58] Simon Maple: It wasn't done.

[00:25:00] Josh Long: This, here's the actual math. Yeah. Right. Given two arrays, two uh, matrices.

[00:25:05] Josh Long: Do the comparison and find the thing that is most applicable. Mm-hmm. Okay. So I have disabled that, and now what I want is for my AI model to know, to consult this newly initialized vector store table with the vectors for each of the bits of data. There is the original string, here is the embedding for that, right?
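The math Josh points at can be sketched in a few lines of plain Java. This is an illustration under assumptions, not the source of Spring AI's in-memory store: the class name, method names, and sample vectors are invented here, and real embeddings have far more than two dimensions.

```java
final class CosineSimilarityDemo {

    // cos(a, b) = (a . b) / (|a| * |b|); closer to 1 means more similar.
    // Note: a zero vector produces NaN, which this sketch does not guard against.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the index of the stored vector most similar to the query.
    static int mostSimilar(float[] query, float[][] stored) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < stored.length; i++) {
            double score = cosine(query, stored[i]);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy two-dimensional "embeddings", invented for illustration.
        float[][] stored = { {1f, 0f}, {0f, 1f}, {0.9f, 0.1f} };
        System.out.println(mostSimilar(new float[] {0f, 2f}, stored));
    }
}
```

A similarity search is then a matter of scoring the query embedding against every stored embedding and keeping the best few, which is why an in-memory version fits in a few hundred lines.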

[00:25:22] Josh Long: Which is computed by OpenAI's embedding endpoint. And now I shall create an advisor, and that advisor will be a QuestionAnswerAdvisor, which is only possible if I have the Spring AI advisors vector store support on the class path. Mm-hmm. Okay. So I will create that here, and I will say, vector store.

[00:25:43] Josh Long: Vector store new question answer advisor, passing that in. Okay. Now restart. Now I can ask you questions, and as this AI chat client instance processes requests, it will store things in memory using this advisor. And it will also check the vector store for anything that might be pertinent to the request.

[00:26:06] Josh Long: Mm-hmm. Before sending it on. These are, again, filters or interceptors or whatever. So do you have any neurotic dogs? Yes. Meet Prancer. Yeah. Now of course this is just text, you know, depending on what you're trying to do, you might have a, whoops, hey, come back. Record DogAdoptionSuggestion: int, uh, id, String name, String

[00:26:31] Josh Long: description. Right? Maybe I want a strongly typed thing. So rather than calling that I can say this, right? And, uh, we change this return value. Okay. Restart.

[00:26:47] Simon Maple: So this is now it being opinionated, providing an answer that's JSON. There you go. Nice. JSON. Yeah. Formatted output.

[00:26:55] Josh Long: Structured output. Exactly. So, you know, you, you have lots of options here, but let's return to a String because, you know, uh.

[00:27:01] Josh Long: We want the content that's this right here. Okay. Um, comment that out 'cause it's dead code. Alright, so now we've got this. Now look, the natural next step, I think any red-blooded human being will look at that and go, I wanna adopt that dog. Great. When can I come over? Mm-hmm. And get that dog. Finally.

[00:27:16] Simon Maple: Finally, finally, I found that neurotic dog.

[00:27:18] Simon Maple: I was off that long,

[00:27:20] Josh Long: Long reunion, long time reunited, right? Okay, so I'm gonna build a component to help it schedule. Well, maybe I'll call this a service or something, you know, so a DogAdoptionScheduler. Right. Now I'm gonna have a method here called, uh, um, schedule adoption. Right. And we'll have the dog name and int dog id, and we're gonna export this as a tool.

[00:27:44] Josh Long: And a tool is like a, like just that it's a, mm-hmm. A thing in the toolbox mm-hmm. Of the model that it can use to talk to our business logic. So I'm gonna create an arbitrary three day advance on this date. Right. And I'll say this is three days in the future. Instant. And, uh, you know, I'm gonna print out the results here just to confirm this got called, so there you go.
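The scheduling logic described here is small enough to sketch in plain Java. This is a hedged reconstruction, not the code from the demo: the class and method names are guesses, the `now` parameter is added only so the result can be checked, and the Spring AI annotations that export the method as a tool are omitted so the example runs with the JDK alone.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

final class DogAdoptionSchedulerSketch {

    // Returns an appointment three days after `now` and prints the call,
    // so it is visible in the console that the method was invoked.
    static Instant schedule(int dogId, String dogName, Instant now) {
        Instant when = now.plus(3, ChronoUnit.DAYS);
        System.out.println("scheduling " + dogName + " (id " + dogId + ") for " + when);
        return when;
    }

    public static void main(String[] args) {
        // The dog id below is a placeholder, not a value from the demo database.
        schedule(1, "Prancer", Instant.now());
    }
}
```

In the demo, the method and its parameters also carry natural-language descriptions, which is what the model reads when deciding whether to call the tool.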

[00:28:02] Josh Long: Thank you, AI. Um, I like that it doesn't do the last period. I don't know why. It's like, I'm lazy, I can't do that here. Yeah. Okay. And so we're gonna make this tool available to Spring AI. And here, you know, when your mom got upset and said, "Use your words," you know, this is what she was talking about, right?

[00:28:18] Josh Long: Um, schedule an appointment to pick up or adopt a dog from a Pooch Palace location.

[00:28:28] Simon Maple: Okay, now, I presume you're gonna add this as an MCP.

[00:28:30] Josh Long: Well, we'll get to that, there’s local tool calling first.

Simon Maple: Okay. Okay. Okay.

Josh Long: This is the name of the dog.

[00:28:35] Simon Maple: Why are the descriptions important, right?

[00:28:35] Josh Long: Oh yeah. Well, it's just an even in general, right?

[00:28:38] Josh Long: Before we talk about MCP and we haven't yet, so thanks. Thanks a lot for giving away the plot.

Simon Maple: I'm so sorry.

Josh Long: Yeah. Before we even get to that, we want to just prove that it works locally, so schedule a dog and, uh, lemme get the dog id.

[00:28:50] Simon Maple: I'll sit in the dog house. I know, right? It was just there.

[00:28:55] Simon Maple: Oh, it was staring me in the face, Josh. I couldn't, I couldn't leave it there

[00:28:58] Josh Long: Just because you could doesn't mean you should.

[00:29:00] Simon Maple: Yeah, that's, that's a Doctor Ian Malcolm, uh, it's a Doctor Ian Malcolm from Jurassic Park. The scientists were so busy with whether they could, they didn't stop to think whether they should.

[00:29:11] Josh Long: They should. Yeah. Just because you're comedians. Yeah, exactly. So, okay, I've got the scheduler. I wanna make the model aware of it and, and give it, give it access as a tool.

[00:29:19] Simon Maple: Yep.

[00:29:20] Josh Long: Um, and a tool's just a, it's literally just a function, right? It's just some, something that the scheduler that the, uh, AI model can invoke.

[00:29:27] Simon Maple: Right? I see.

[00:29:28] Simon Maple: So you can just add any number of tools to that build, essentially. And it can then just choose what it wants.

[00:29:33] Josh Long: Absolutely. And by the way, that’s why it’s going to depend on that description, the metadata there. Okay. So we have Prancer. Great. Let’s ask the obvious follow-up question.

[00:29:46] Josh Long: Well, I'm not sure if an exclamation mark is gonna work well in the shell.

[00:29:51] Simon Maple: No. Yeah.

[00:29:51] Josh Long: Um,

[00:29:55] Simon Maple: When can I schedule an appointment to pick up? Nice.

[00:30:10] Josh Long: Wait, wait, wait. When?

[00:30:15] Josh Long: Oh wait. Is it just about the, uh, the antecedent.

[00:30:31] Simon Maple: Is it annotated properly?

[00:30:32] Josh Long: Lemme see. Default tools. It's there. Schedule, DogAdoptionScheduler. Got my component, tool, tool param.

[00:30:47] Josh Long: What's the issue? I've changed the class path so much. I wonder also, I'm gonna try reinitializing everything. Mm-hmm. Just to shake the cobwebs. I dunno.

[00:31:02] Josh Long: So go over here.

[00:31:08] Josh Long: Take 512.

[00:31:14] Josh Long: Oh, hello?

[00:31:19] Josh Long: Oh, that’s embarrassing. Okay. I just deleted the dog database. Ah, don’t do that. You won’t get very far if you delete the inventory like dogs are supposed to be—that’s the whole point. Okay, we’re back. We’re so back.

[00:31:36] Simon Maple: There you go.

[00:31:38] Josh Long: Oh, I missed you guys

[00:31:39] Simon Maple: There’s Prancer

[00:31:39] Josh Long: Yeah. Okay.

[00:31:46] Simon Maple: Oh, one of the nice things is it's quick to recycle the server. Right?

[00:31:50] Josh Long: It is, and I'm doing it the wrong way, because what I should have done was I should have incepted the project, started on start.spring.io, added the dependencies, added DevTools, started up the project with DevTools in place, and then it's,

Simon Maple: So it has the auto reload.

[00:32:00] Josh Long: Yeah. Now it's like, you know, it's not quite as good as ZeroTurnaround.

[00:32:05] Simon Maple: That was many years ago. As many. Yeah. I can take it.

[00:32:07] Josh Long: Yeah. So, uh, let's see here. Oh, o. Before I forget, let me

[00:32:16] Josh Long: comment this out.

[00:32:22] Josh Long: Okay. Prancer

[00:32:28] Josh Long: I wonder, to adopt Prancer

[00:32:37] Josh Long: May 10th. There you go. Oh goodness. See, it's May 7th. Yeah. Right. So seven plus three. Last I checked. Mm-hmm. Still 10. Okay. So that's worked. And you can confirm it's worked. There you go. It called the method. Yeah. So we gave the model access to tools, and it made the decision.

[00:32:53] Josh Long: And you can have as many of these as you like. And again, I didn't have to provide a schema. The signature of the method itself is the schema, that along with the human language, natural text in the description. Mm-hmm. Um, and, uh, when does it.

[00:33:06] Simon Maple: When does it think about getting a tool though?

[00:33:08] Simon Maple: So if it has a list of tools

[00:33:09] Josh Long: Yeah.

[00:33:10] Simon Maple: When does it think, yes, I do want to schedule something and I will go ahead and actually run this tool?

[00:33:16] Josh Long: It, it, you want to imagine that it would hallucinate something if you didn’t give it a more specific answer, right? Mm-hmm. Mm-hmm. I suppose it’s looking at the available tools and says, okay, is this a question I could probably provide an answer to?

[00:33:28] Josh Long: Mm-hmm. Yes. But is there a tool that’s promising to answer the same question for me? Yeah. Yeah. And so I’ll choose that and, um, it really is sort of non-deterministic, isn’t it? Right. Yeah. Uh, but nonetheless, it does a pretty good job most of the time. Yeah. And I think this is where I think this, this is.

[00:33:42] Josh Long: Tool calling is where things start to get really interesting because now the user interface is the chat box.

[00:33:48] Simon Maple: Yes.

[00:33:48] Josh Long: Because it can invoke business functions.

[00:33:50] Simon Maple: Yeah.

[00:33:51] Josh Long: From there, right? Mm-hmm. And it can drive important change. That's why so many companies are eager to set, send you, you know, route you from human.

[00:33:59] Josh Long: Uh, assistance to IVRs, whatever. Mm-hmm. And nowadays they're actually getting quite good. I don't know if they can solve all your problems. You can see how narrow their, their, their capabilities are, unless you actually really finesse the, uh, prompts and all that. Mm-hmm. But, but nonetheless, you can do a staggering amount of things just in this tool calling mechanism.

[00:34:15] Josh Long: Yeah. The problem is, as written, this is only something that's useful for my Spring app. Mm-hmm. I wanna centralize that logic and I wanna extract it. Mm-hmm. So a natural way to do that is to use Model Context Protocol, mm-hmm, to which we kind of alluded, uh, earlier. Model Context Protocol is a, uh, a protocol from Anthropic, the makers of Claude Desktop and Claude, the, uh, AI model.

[00:34:33] Josh Long: And, uh, basically it's a protocol by which you can connect your applications, uh, your models, uh, to, to business logic. Mm-hmm. So we're gonna extract our logic out into a separate Model Context Protocol service. start.spring.io. I shall call this, um, uh, service, I guess, or scheduler. There you go. We're bringing in the Model Context Protocol server support, the web support, and that's it.

[00:34:59] Josh Long: Mm-hmm. Right. I'm, I don't, whatever. You can bring in other stuff you want. Okay. cd downloads.

Simon Maple: And presumably you need a client from the other, other one, or that I

Josh Long: I'll do that. Yeah. Yeah. Okay. Uh, I forgot to add that in the beginning. Lemme just copy and paste it here. So, Model Context Protocol client, and, uh, where's the dependency?

[00:35:18] Josh Long: Here? There you go. That's our purchase, that's our entry into the wide world of MCP. Go over here. Command Shift I to reload the pom.xml.

[00:35:32] Josh Long: Oh, two dependencies. Okay, fine. And I'm gonna take this code that I so toiled over from here, this patent-pending unique algorithm specific to our organization. I'll cut that code, go over here to the scheduler, paste the code, and then I'm gonna tell, uh, the Spring AI support to, um, export that logic and make it available

[00:35:58] Josh Long: Mm-hmm. As an MCP endpoint. Mm-hmm. Mm-hmm. Okay. And to do that, I'll specify the tool object, which is here. So a DogAdoptionScheduler. Okay. And I want this to run on a separate port naturally. So I'll say, uh, 8081. Let's, uh, let's see. Going back over here, I've got a compiler-error-shaped hole in my code base.

[00:36:22] Josh Long: And the way I alleviate that is by configuring an McpSyncClient. So tools, configuration, bean, McpSyncClient, and it's basically a var mcp. I remember you and I talking about var and not var. Oh, wow. Ages ago. Years ago. That was a thing. Okay. Yeah. So HTTP, I mean, close enough, actually, I don't know what that just hallucinated, but it was close enough.

[00:36:51] Josh Long: Mm-hmm. Okay. Um, and then, uh, build, so there's that. And I return MCP and I call MCP to initialize. Now the protocol is really new, really, really new, right? Yeah. And actually, I... This config class is a little redundant by definition. This is also a config class. I would... this, uh, this is a new protocol, and, uh, the Spring AI team... oh, this is not gonna compile still.

[00:37:13] Josh Long: We need this, uh, McpSyncClient. Mm-hmm. And then down here, where's this, uh, new sync. There you go. Scheduler. The, uh, Spring AI team, we were among the first to develop support for it. We actually wrote the Java SDK. Hmm. For, uh, Model Context Protocol. So if you go to modelcontextprotocol.io, the Spring AI abstraction for MCP, the library that we wrote, mm-hmm,

[00:37:39] Josh Long: is the official recommended Java one. Mm-hmm. And then on top of that, so we extracted out the code, put it in the, in that common place, and then we based our Spring AI auto-configuration and integrations on top of that. Mm-hmm. Right. But it is the default. We were, we were, we did that in the first weeks of November, right after they announced the protocol.

[00:37:55] Josh Long: Right. Yeah. We are huge fans of that. Okay. So now I've, uh. I've got two apps, right? One over here. Mm-hmm. One over there. Let's just connect them. Right? So same, uh, magic trick. Do you have any neurotic dogs?

[00:38:11] Josh Long: and Fantastic.

[00:38:16] Simon Maple: I schedule an appointment.

[00:38:18] Josh Long: London, May 10

[00:38:19] Simon Maple: Okay. So exactly the same, this time going through the, yeah, the MCP client.

[00:38:23] Josh Long: Yeah, on port 8081.

[00:38:25] Simon Maple: Yeah.

[00:38:25] Josh Long: Down here. Instead of over here. Yep. So through the, through the, uh, magic of MCP, we've connected these things that would otherwise not have been connected.

[00:38:33] Josh Long: Mm-hmm. It's an amazing time. Yeah. An amazing time to be an AI and Spring developer. Awesome. And, and obviously at this point, you know, you go here, Maven, -DskipTests, -Pnative, native:compile. Right. We can build a, you know, we can, we can focus on, uh, observability. Right? So let me actually do a couple things here.

[00:38:56] Josh Long: We're gonna do exposure include, all of that. I'll restart this. Okay. And then on the, on the command line over here, we're gonna run that in the background just so I can show you something. While it's doing a native image compilation: actuator. I've got all these observability endpoints. Yeah. I can go here and go to the metrics.

[00:39:14] Josh Long: And you've got things like your gen AI client token usage. Right. And it says, hey, you've got this, uh, this metric that's being counted. And so it says, I've made four calls to the, mm-hmm, model so far. Mm-hmm. And, you know, you wanna keep an eye on this, right? If you wanna, yeah. This is all powered by Micrometer.

[00:39:28] Josh Long: Yeah. And Micrometer has, uh, you know, integrations with popular time series databases, yeah, to do observability around this kind of stuff, right?

[00:39:36] Simon Maple: Yeah. Yeah.

[00:39:36] Josh Long: So you've heard about all those people that had AI, uh, AWS bills, whatever, that ran amok. Mm-hmm. By accident, they auto-scaled to like a million nodes and now they're bankrupt.

[00:39:44] Josh Long: Mm-hmm. Right? Same thing with your token count. You want to be very careful, mm-hmm, that you’re keeping an eye on that. So production worthiness is important here. Right. Um, we’ve also got virtual threads. Right. You should definitely turn that on by default. Every single call you make to an AI model is a call over the network.

[00:39:58] Josh Long: Right. And, uh, that call over the network is, is that frozen? No, it's good. Uh, every call you make to a model, uh, is a network call. And, uh, that network call monopolizes time on a thread. That thread is blocking on I/O. That's the thing you wanna, like, put on virtual threads. Okay. Here's my native image.
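Why blocking network calls suit virtual threads can be shown with the JDK alone. This is a minimal sketch assuming Java 21 or later: `fakeModelCall` is a made-up stand-in that sleeps instead of calling a model, and the class and method names are invented. In a Spring Boot 3.2+ application the equivalent switch is the `spring.threads.virtual.enabled=true` property rather than hand-written executor code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class VirtualThreadCalls {

    // Stand-in for a blocking network call to a model; the sleep simulates latency.
    static String fakeModelCall(int n) throws InterruptedException {
        Thread.sleep(100);
        return "reply " + n;
    }

    // Runs `count` blocking calls, each on its own virtual thread, and collects
    // the replies in submission order.
    static List<String> callAll(int count) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int n = i;
                futures.add(executor.submit(() -> fakeModelCall(n)));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> future : futures) {
                replies.add(future.get());
            }
            return replies;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAll(50).size() + " replies received");
    }
}
```

While a virtual thread waits on the simulated call, it does not hold an operating system thread, so many calls can be in flight at once without a large thread pool.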

[00:40:16] Josh Long: Let's go ahead and stop this. Okay. And Target assisted. There we are. There's the application up and running. In no time at all. Uh, it's talking on, oh, because it initialized everything or something. I don't know. Why did it take so long? Let's try that again. It's doing some network I/O that's really obscene.

[00:40:39] Simon Maple: switch on conference wifi.

[00:40:40] Simon Maple: Probably

[00:40:40] Josh Long: Yeah. Something like that. Yeah. Okay. Um, and where's the process identifier here? ps -o rss. 147 megs. That's, that's the number divided by a thousand. Yeah. ps gives you kilobytes, so divided by a thousand, you get megabytes. Mm-hmm. Yeah. This is, this is what we mean.
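The arithmetic behind the memory figure is a single division. This is a small sketch with an illustrative input: the transcript only gives the rounded result of 147 megabytes, so the raw value of 147,000 kilobytes below is an assumption chosen to match it, and the helper name is invented.

```java
final class RssToMegabytes {

    // ps reports resident set size in kilobytes; dividing by a thousand
    // gives the rough megabyte figure quoted in the conversation.
    static double kilobytesToMegabytes(long rssKilobytes) {
        return rssKilobytes / 1000.0;
    }

    public static void main(String[] args) {
        // 147_000 is an illustrative value, not a number read off the demo screen.
        System.out.println(kilobytesToMegabytes(147_000) + " MB");
    }
}
```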

[00:40:58] Josh Long: Yeah. End-to-end production. Production worthy AI systems and services.

[00:41:02] Simon Maple: Yeah. And so when that number gets too high, what should happen? Is this an indicator that we need to think about usage?

[00:41:10] Josh Long: We need to think about which number? The RAM or the tokens?

Simon Maple: The tokens.

[00:41:13] Josh Long: Uh, the, the one in the metrics, yeah. Yeah. Uh, I. Yeah. Be mindful, you know, uh, it's, I just don't want you to be, uh, in a situation where you've got a, at least be knowledgeable about what your usage is. Yeah. It's not unlimited. It's not free. And this is why it's, it, look, the, it's obvious, but it's worth remembering that they make money the more tokens you, you spend.

[00:41:34] Josh Long: Right, right. Right. Their business model isn't necessarily aligned with yours. Yeah. It's, I wanna provide AI to my users. Yeah. They provide AI. Yeah. So in that, in that sense, we are aligned, but. Make sure that's always true, right? Yeah. Keep an eye.

[00:41:47] Simon Maple: Particularly when those conversations and those threads happen and you're constantly sending your entire history, successively larger.

[00:41:53] Simon Maple: Yeah.

[00:41:53] Josh Long: Yeah. Prompt bombs, right?
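The cost of resending the whole history on every turn can be made concrete with a little arithmetic. This is a simplified sketch under stated assumptions: every message is taken to be the same size, the model's replies are ignored, and the class and method names are invented. With n turns of t tokens each, the total sent is t × n(n+1)/2, so it grows quadratically with the length of the conversation.

```java
final class HistoryTokenGrowth {

    // Total input tokens sent over `turns` requests when every request resends
    // all earlier messages plus one new message of `tokensPerMessage` tokens.
    static long totalTokensSent(int turns, int tokensPerMessage) {
        long total = 0;
        long history = 0;
        for (int i = 0; i < turns; i++) {
            history += tokensPerMessage; // the conversation grows by one message
            total += history;            // and the whole conversation is sent again
        }
        return total;
    }

    public static void main(String[] args) {
        // Ten turns of 100 tokens each: 100 * (10 * 11 / 2) tokens in total.
        System.out.println(totalTokensSent(10, 100));
    }
}
```

Ten short messages already cost more than five times what they would if each were sent once, which is the reason to watch the token usage metric as conversations get longer.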

[00:41:56] Simon Maple: Cool. Josh, this has been amazing.

Josh Long: You're amazing.

Simon Maple: Uh, thank you for, thank you for, for screen sharing and showing us, uh, Spring AI in real time.

Josh Long: Yeah. Cheers.

Simon Maple: It's been great to see. Thank you all for watching and, see you on the next episode.


Welcome to the AI Native Dev Podcast, hosted by Guy Podjarny and Simon Maple. If you're a developer or dev leader, join us as we explore and help shape the future of software development in the AI era.