Episode Description
Join Simon Maple in this hands-on episode as he continues his conversation with Devin Stein, the innovative mind behind Dosu, an AI-driven tool designed to assist developers with tasks outside the traditional IDE. Devin, a seasoned software engineer and entrepreneur, shares how Dosu helps developers by answering questions, triaging issues, and maintaining documentation. The episode explores real-world examples of Dosu in action, highlighting its successes and the critical role of human oversight in ensuring accuracy. Devin Stein, with his extensive background in AI and machine learning, discusses the practical applications of Dosu within the open-source community and beyond. Learn how Dosu automates responses, organizes issues, and integrates data from various sources to provide comprehensive and accurate solutions. The conversation also touches on the limitations of AI and the essential balance between AI capabilities and human oversight.
Overview
Introduction
In the ever-evolving landscape of software development, the integration of Artificial Intelligence (AI) is reshaping how developers approach problem-solving and enhance productivity. In this blog post, we explore insights from a recent podcast featuring Simon Maple and Devin Stein, the founder of Dosu. This conversation delves into the practical applications of AI within software development, particularly in open-source environments. We'll cover how Dosu serves as an AI engineering teammate, streamlining workflows, assisting with issue triaging, and ultimately improving collaboration among developers.
Overview of Dosu
Dosu is introduced as an innovative AI tool designed to assist engineers with work outside their Integrated Development Environment (IDE). Devin Stein elaborates on how Dosu functions as a supportive teammate, handling repetitive and time-consuming tasks such as answering questions, triaging issues, and maintaining documentation. This allows engineers to concentrate on what they love most: coding.
Dosu has gained popularity within the open-source community, where it aids maintainers by:
Answering user questions effectively.
Triaging issues to identify the root cause of problems, whether they stem from user error or recent code changes.
Guiding users on how to contribute to projects, enhancing community engagement.
By automating these processes, Dosu provides significant value to developers, ultimately reducing response times and enhancing user experience. As Devin states, "Dosu is an AI engineering teammate that helps engineers with work outside of the IDE," highlighting its role in empowering developers to focus on coding rather than administrative tasks.
The Happy Path: AI in Action
In the discussion, Simon and Devin illustrate the ideal scenario where a user encounters an issue with a known resolution. This "happy path" demonstrates how Dosu can drastically reduce the time maintainers spend addressing user queries.
For example, Devin recounts a case from Apache Superset, an open-source Business Intelligence tool. A user reported a potential bug related to PDF files in charts. Dosu analyzed the issue, automatically labeled it for better organization, and determined that the behavior was expected: the report was a case of user error rather than a fault in the code. By showing the user how to achieve the desired behavior, and citing the relevant code and configuration files, Dosu not only saved the maintainer time but also gave the user the right knowledge.
The ability of AI to take in and search large amounts of information allows it to provide accurate and timely responses, making it a valuable asset in the software development process. As Devin notes, "The potential impact for unblocking users and saving maintainers time is pretty significant," emphasizing the efficiency with which AI can operate.
Dealing with Unknowns: AI’s Problem-Solving Capabilities
However, not all scenarios present clear-cut solutions. The podcast transitions into discussing cases where documentation may be lacking, and Dosu must derive answers from the code itself. Devin highlights a compelling example involving Jaeger, a tracing framework for distributed services.
In this instance, a user posed a nuanced question regarding the timing of parent and child spans. Given the specificity of the question, traditional documentation may not exist. Nevertheless, Dosu leveraged test information to provide an accurate answer. By analyzing relevant tests that assert specific behaviors, Dosu could clarify that the duration of child spans is not always contained within the parent span. This demonstrates how AI can bridge gaps in documentation by accessing the source code and relevant tests, making complex information more accessible.
As Devin aptly explains, "Code is really the source of truth for how your product works," illustrating the importance of AI's ability to navigate and analyze code effectively.
The Importance of Testing in AI Solutions
The discussion leads to another critical insight: the importance of testing as a source of truth in software development. Simon and Devin emphasize that well-written tests clarify expected behaviors, making them invaluable for AI tools like Dosu.
When AI tools reference tests, they gain a more authoritative understanding of the system's intended functionality. A passing test states explicitly what the code is supposed to do, which helps AI provide accurate responses and solutions. As Devin puts it, "Tests make it much easier to reason about what the expected behavior is." This reliance on testing enhances the reliability of AI-assisted solutions.
Proper testing not only aids in ensuring software quality but also provides a reference point for AI, allowing it to make informed decisions based on established behaviors. This highlights the synergy between rigorous testing practices and AI functionalities.
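To make the parent and child span question from the Jaeger example concrete, here is a minimal Python sketch of the kind of test that can settle such a question. The Span class, the is_contained helper, and the span names are hypothetical stand-ins for illustration; they are not Jaeger's actual code or tests. The point is only that a single assertion can document that a child span is not guaranteed to fall inside its parent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span; times are in milliseconds."""
    name: str
    start_ms: int
    duration_ms: int

    @property
    def end_ms(self) -> int:
        return self.start_ms + self.duration_ms


def is_contained(child: Span, parent: Span) -> bool:
    """True if the child's time range lies fully inside the parent's."""
    return parent.start_ms <= child.start_ms and child.end_ms <= parent.end_ms


def test_child_span_may_outlive_parent() -> None:
    # A parent that hands work to an asynchronous child and returns early.
    parent = Span("handle_request", start_ms=0, duration_ms=50)
    child = Span("send_email_async", start_ms=40, duration_ms=100)
    # The test documents that containment is NOT guaranteed.
    assert not is_contained(child, parent)


if __name__ == "__main__":
    test_child_span_may_outlive_parent()
```

A reader, human or AI, who finds a test like this does not need to infer the behavior from the implementation: the assertion states it directly.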
AI Limitations: When Things Go Wrong
Despite the advantages, the conversation also addresses potential pitfalls of AI in software development. Devin presents a scenario where Dosu encountered challenges while assisting with a deployment issue in the OpenTelemetry repository.
In this case, a user struggled to deploy a service using Helm, a Kubernetes package manager. Dosu analyzed the situation and suggested a workaround based on code analysis. However, this recommendation was slightly misleading, as the actual issue was that the user's Kubernetes version was unsupported.
This example underscores a common limitation of AI: the tendency to prioritize finding a solution over recognizing when no solution exists. Devin notes that "LLMs have a bias towards solution," which can lead a tool to offer a workaround when the honest answer is that the setup is not supported. It highlights the importance of human oversight in AI-driven development, ensuring that maintainers can intervene when necessary.
Human Oversight and AI in Development
The importance of human intervention cannot be overstated. Simon and Devin discuss the need for AI to recognize when to escalate issues to human maintainers rather than attempting to provide a solution at all costs. This approach ensures that AI complements human expertise rather than replacing it.
As AI continues to evolve, striking a balance between automation and human oversight will be crucial in maintaining the integrity of the development process. Developers must remain vigilant in monitoring AI outputs, particularly in complex scenarios where nuanced understanding is essential. As Simon puts it, "get that human in the loop and then it's okay to say no and pass things on to a human," reinforcing the idea that AI should not operate in isolation.
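The escalation idea can be sketched in a few lines of Python. This is an illustrative assumption about how such a rule might look, not a description of how Dosu works: the Finding class, its fields, and the draft_reply function are all hypothetical names. The sketch checks for the "no solution" case first, so that a plausible workaround cannot mask an unsupported setup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical result of analysing an issue."""
    supported: bool            # is the user's environment supported?
    workaround: Optional[str]  # a possible workaround, if one was found


def draft_reply(finding: Finding) -> str:
    # Check for "no solution" first so a workaround cannot mask it.
    if not finding.supported:
        return "ESCALATE: environment is unsupported; hand off to a maintainer."
    if finding.workaround:
        return f"SUGGEST: {finding.workaround}"
    return "ESCALATE: no confident answer; hand off to a maintainer."
```

In the Helm example from the episode, the unsupported Kubernetes version was identified and a workaround was also found; under this ordering the reply would be an escalation rather than the workaround.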
Future of AI in Open Source and Development
In concluding the episode, Simon and Devin reflect on the transformative potential of AI tools like Dosu in the open-source landscape. The ability to streamline workflows, improve response times, and enhance collaboration can significantly impact both maintainers and users.
Devin encourages listeners to explore Dosu for themselves, highlighting its applicability not only in open-source projects but also within enterprise environments. As organizations increasingly adopt AI solutions, understanding their capabilities and limitations will be essential for maximizing their benefits. "The easiest way to get started is go to our website, dosu.dev," Devin suggests, making it clear that there are accessible pathways for developers to engage with this technology.
Summary/Conclusion
In this blog post, we explored the key insights from the podcast featuring Simon Maple and Devin Stein about the role of AI in software development. Key takeaways include:
Dosu as an AI engineering teammate that enhances productivity and streamlines workflows.
The significance of the "happy path" and effective problem-solving capabilities in known and unknown scenarios.
The critical role of testing in providing accurate information and guiding AI applications.
The potential pitfalls of AI, emphasizing the need for human oversight in decision-making processes.
The promising future of AI tools in transforming open-source development and enterprise environments.
Resources
Chapters
[00:00:15] Introduction to the Episode and Guest
[00:00:39] Overview of Dosu
[00:02:24] The Golden Path: User Runs Into an Issue
[00:05:14] Integrating Various Data Sources
[00:06:40] Beyond Documentation: Digging Into Code
[00:09:03] AI’s Limitations: When Things Go Wrong
[00:11:17] Trust and Human Oversight in AI Solutions
[00:12:24] Value to Open Source and Internal Teams
[00:12:29] Getting Started with Dosu
Full Script
[00:00:15] Simon Maple: Hello and welcome to the hands-on part of this episode. Joining me again is Devin Stein from Dosu. If you haven't listened to or seen the previous episode, we talked a bunch about how AI can help in various cases, including in PRs, in chatting about issues, and in helping triage and things like that, as well as identifying and looking through codebases with varying levels of context.
[00:00:39] Simon Maple: And we're going to go very hands-on now. We're going to screen share, and Devin's going to show us a few things, demoing Dosu in the meantime. We're going to first of all look at the green, the happy golden path: what happens when a user runs into an issue, they raise an issue,
[00:00:54] Simon Maple: and there's actually an answer that's available. How can AI help us reduce our time as maintainers, but also help them get to an answer quicker? Then what happens when actually there is no known good answer? Can AI help us identify what that right answer is? And then we're going to go into more of a slightly bad case of what happens when AI actually does the wrong thing, and where we can actually still improve going forward.
[00:01:23] Simon Maple: Devin is the founder of Dosu. Devin, why don't you give a very brief intro into Dosu before we jump into a screen share.
[00:01:31] Devin Stein: Thanks, Simon. Yeah. So Dosu, for those who don't know, is an AI engineering teammate that helps engineers with work outside of the IDE. So it's trying to answer questions, triage issues for them, maintain documentation, so they can do what they love, which is code, and let Dosu do the rest.
[00:01:51] Devin Stein: Right now, Dosu is very popular within open source by helping maintainers with open source maintenance, so answering questions for users in GitHub issues and discussions, triaging issues, trying to identify: is it user error, was it due to a recent change, is there a logical error in the code, and then also showing users how to contribute.
[00:02:12] Devin Stein: So if they want to, yeah, add a feature, how do they do it.
[00:02:14] Simon Maple: Yeah, huge value to an open source maintainer. I can see why Dosu is exponentially gaining speed in this space. Let's jump into a screen share. Now we're talking absolute golden path here, right?
[00:02:24] Simon Maple: There's a good answer. The user doesn't know what that good answer is. It's clearly available and Dosu helps here. So let's talk about the user's workflow first.
[00:02:32] Devin Stein: So I guess stepping back, for those who have never created an issue in open source, the status quo is: you're running into something, you're blocked on something, you go to GitHub, you create an issue because you think that it's either not possible or you really need help.
[00:02:49] Devin Stein: But unfortunately, it's typically very slow to get a response. If you create an issue, typically the response time is hours, days, weeks, or months. So when you have a tool like Dosu, or AI generally, the potential impact for unblocking users and also saving maintainers time is big, because triaging issues takes a lot of time trying to understand where a user is coming from.
[00:03:14] Devin Stein: The time saving is pretty significant. Here is an example, like you were saying, Simon, of the happy path. So this is from Apache Superset, where someone believes they found a bug in PDF files in a chart. So Superset is an open source BI tool, and they have a description,
[00:03:33] Devin Stein: it's like how to reproduce it, the issue they're running into, version, et cetera. It's actually quite a good issue when you think about GitHub issues. So Dosu goes in, it automatically labels the issue, helping maintainers stay organized, and then actually responds. And it's able to find that the user in this case is actually wrong, in that it's user error and it's an expected behavior, and it shows how to correctly implement the behavior that the user is trying to achieve, and it does so by finding relevant code and configuration files on the backend,
[00:04:08] Devin Stein: and then also showing what's going on in the front end as well, and how the front end also handles sanitization and cross-site filtering.
[00:04:19] Simon Maple: Where's it pulling this kind of data from? Is this from documentation, like public-facing documentation?
[00:04:25] Simon Maple: Is it from code documentation? Is it from guides, things like that?
[00:04:28] Devin Stein: A combination. We always have the sources cited, and so you'll see here that it's citing code files in this case, but it can also use documentation, either in the repository or hosted on a website, as well as other data sources like a Slack channel, for example a community Slack.
[00:04:48] Devin Stein: And this is something that AI is super good at: being able to take in huge amounts of data, whether it's documentation, Slack, like you say, and other sources, and really be able to map what someone's doing and recognize how that differs from the context that it's being given.
[00:05:06] Simon Maple: Would you say this is one of the easier paths for Dosu, for the AI behind Dosu, to actually get right most of the time?
[00:05:14] Devin Stein: Yes, I would say, we always try to lean on what our LLMs are good at, and one of the things is, LLMs have better memories, in a sense. By being able to ingest a ton of information, index it well, and search across it, you can create something that has effectively better memory than a maintainer.
[00:05:32] Devin Stein: And so in that sense, this is something that Dosu can do well. If
[00:05:36] Devin Stein: you have forgotten about something or don't know where something exists or if it exists, LLMs can be quite exceptional at that.
[00:05:44] Simon Maple: Now in this case, of course, there were good sources that the LLM could gain access to, like the documentation, which the user could have read initially and identified, okay, this is how I should have done this, this is user error.
[00:05:55] Simon Maple: Yeah. Let's take the golden path to a slightly different level, where the AI, Dosu, does really well, but it's not obvious to the user, to the developer, what that good user path is. So let's say the documentation doesn't exist. That kind of information would typically be in the code, in the maintainer's head, among a very select few people.
[00:06:18] Simon Maple: In the first example, it's clear how much value AI can add, because it takes away the time for the developer to have to respond to these tickets, or even for the tickets to be created in the first place, almost. It takes all of that away. How about this second level, whereby these tickets are still going to be created, and if the information is not there, how can something like Dosu or AI help find these answers?
[00:06:40] Devin Stein: Right. I guess jumping to another example, here's one from Jaeger, which is a really popular tracing framework for distributed services. So very complex, and I think this is a really interesting and powerful example of both Dosu and AI generally. Something that made me personally very excited to start Dosu is that a lot of the time, the reason engineers are needed is because
[00:07:07] Devin Stein: code is really the source of truth for how your product works. It'll tell you what the edge cases are, what is expected and unexpected. And here's an example where someone is asking about timing of parent and child spans. And so this is a very edge-case, nuanced question that there's unlikely to be documentation for. There might be, but in this case there isn't, but Dosu is actually able to use, in this case, test information.
[00:07:36] Devin Stein: So it's able to find relevant tests that assert the specific behavior that the user is asking about, to give an authoritative answer on this question. So it's actually able to find a test that says that, in this case, it's not always true that the duration of the child spans is contained within a parent span.
[00:07:56] Devin Stein: So it's very specific, but it's actually defined by a test, and Dosu was able to make that knowledge a bit more accessible.
[00:08:02] Simon Maple: Yeah, absolutely. And no one's going to trawl through this. This is typically going to be maintainer-known behavior, or whoever wrote that test, adding that in.
[00:08:14] Simon Maple: It's almost like the tests here are actually more the source of truth than the code, because if you write code and it does x, y, and z, maybe actually x and y are intentional, z is just something that happens and it may be right, it may be actually a bug, whereas if a test was written it's arguably a greater chance that it's supposed to do that, unless it's stale or incorrectly written.
[00:08:37] Simon Maple: It's almost like this test becomes more the source of truth than the code.
[00:08:41] Devin Stein: And I think it also speaks to the importance of testing generally, when we talk about AI software development. Tests make it much easier to reason about what the expected behavior is. Dosu is able to reason about it from code, but it's a lot more authoritative and clear-cut if there is a test saying this should be true,
[00:09:00] Devin Stein: and if the test passed, then we believe it to be true.
[00:09:03] Simon Maple: Yeah, absolutely. Very interesting. And now, we've done the golden path. What happens when AI goes bad? What happens when AI goes wrong? This is an interesting one, right? If maintainers are putting that trust and faith in AI, I guess it's not always going to smell of roses, it's not always going to be good.
[00:09:22] Simon Maple: Take us through an example where perhaps the suggestions or the feedback to the user hasn't always been a hundred percent.
[00:09:29] Devin Stein: And in these past two examples we looked at, Dosu effectively saved both the user and maintainer a ton of time. The user didn't have to dig through the code and was unblocked; the maintainer didn't have to think about triaging the issue or understanding the user's problem.
[00:09:43] Devin Stein: Here's a case that I think is a common failure point of LLMs and something that Dosu still struggles with, where someone's running into an issue. We're looking at the OpenTelemetry repository right now, and they're trying to deploy it with Helm, which is a tool in the Kubernetes ecosystem, and they're running into an error.
[00:10:03] Devin Stein: And Dosu then goes in and tries to come up with a solution. It suggests something based off looking at the code. It's actually looking at the files, and it's like, I think you can do this. And in this case, Dosu's is a correct-ish answer, but it's slightly misleading.
[00:10:22] Devin Stein: So you'll see, what Dosu suggests is a viable workaround. So it's looking at the code saying, actually you could change it and do this to get it to work. And it's correct, but the maintainer jumps in and explains that no, actually, the answer is that this version, the Kubernetes version on the user's side, is not supported.
[00:10:41] Devin Stein: And LLMs, and Dosu to a certain extent, still have a bias towards solution. If a user has a problem, they want to be able to solve it. And in this case, the answer was just, no, there is no solution. You're on the wrong version, essentially.
[00:10:55] Simon Maple: Yeah, yeah, it goes back to what we were talking about in the session as well, whereby LLMs, they don't want to say no, right?
[00:11:00] Simon Maple: They're always trying to find that solution. And we talked about hallucinations and things like that, whereby sometimes they'll make up a solution; other times they'll actually find a solution, but it might not be exactly how a maintainer wants to present this component, or it might just plain be wrong.
[00:11:17] Devin Stein: And what's interesting about this is, I had a discussion with a maintainer about this specific issue. We looked into the logs, and if you look into the trace, in this case Dosu actually found the right answer, which was that it's not supported, but then continued and wanted to find a solution.
[00:11:34] Devin Stein: So even though it correctly identified, okay, this user is using the wrong version, it's not supported, it's like, what else could they do? So it's a bit of prompting, flow engineering, trying to make sure that if there isn't a solution, saying that is sometimes a better solution than trying to find a workaround that might mislead users.
[00:11:53] Simon Maple: Yeah, and get that human in the loop and then it's okay to say no and pass things on to a human. Really interesting. Thank you. Yeah, again, as was mentioned here, actually, Devin, this is something that can not just help maintainers who spend a huge amount of time doing this kind of triage stuff, but also users who do
[00:12:10] Simon Maple: get that faster response, that faster feedback and turnaround time, and actually get a better experience of open source as a result. So it's really great that this kind of an experience can help both sides and make open source more consumable for users on both sides.
[00:12:24] Simon Maple: If people wanted to have a play with this, what was your recommended path to learn more?
[00:12:29] Devin Stein: Yeah, so like you're saying, Dosu is valuable to open source users and maintainers, but it can also be used internally at enterprises. In any engineering org, you've got a lot of questions, a lot of issues that come up; Dosu can help.
[00:12:41] Devin Stein: The easiest way to get started is go to our website, dosu.dev. Click on, I can't remember, "Get early access"; you can sign up. We have a waiting list, but we're taking people off as quickly as we can, so mention this podcast and we'll be sure to prioritize you.
[00:12:57] Simon Maple: Amazing. Thank you so much, Devin.
[00:12:59] Simon Maple: It's been absolutely fascinating to see this, and I think there's a huge number of people who can really benefit from this type of usage of AI. Thanks very much, Devin, and everyone, see you on the next episode.
[00:13:09] Devin Stein: Thanks, Simon.