Alex Gavrilescu
Lead Backend and Web Developer, Funstage GmbH
AI-First Project Management for Developers

with Alex Gavrilescu

Chapters

Trailer
[00:00:00]
Introduction
[00:00:54]
Challenges and Solutions in AI-Driven Development
[00:02:25]
Demo of Backlog MD and Its Features
[00:09:44]
Challenges in Task Breakdown for AI Agents
[00:22:09]
Leveraging AI for Efficient Task Management
[00:23:43]
AI's Role in Agile and Software Development
[00:27:18]
Future of AI in Software Development
[00:41:35]

In this episode

In this live episode from Devoxx Belgium, AI Native Dev host Simon Maple chats with Alex Gavrilescu, creator of Backlog.md, about transforming ad-hoc "vibe coding" into a structured engineering practice with AI. They explore how spec-driven development and atomic Markdown tasks can make AI coding effective, safe, and scalable, offering a practical blueprint for AI-native teams to enhance workflow consistency and reliability.

Recorded live at Devoxx Belgium, this episode of AI Native Dev brings a candid, practical look at turning “vibe coding” with LLMs into a disciplined engineering practice. Host Simon Maple sits down with Vienna-based lead engineer Alex Gavrilescu—creator of Backlog.md—to unpack how spec-driven development makes AI coding effective, safe, and scalable. From repeated prompt fatigue and lost context to a CLI-first workflow that bakes in acceptance criteria and dependencies, Alex shares a blueprint any AI-native team can adopt.

From Vibes to Velocity: Why Ad-hoc Prompting Breaks at Scale

Alex started where many developers do, throwing prompts at a capable agent like Claude Code, with a CLAUDE.md file holding the standing instructions, and letting it code. The early results were tantalizing: the agent could often reach the feature goal. But each task required heavy back-and-forth, and the same instructions had to be re-typed in every session. As soon as a chat closed, the agent forgot crucial constraints, leading to recurring mistakes and inconsistent output across tasks.

He also hit the limits of LLM conversation mechanics. Even when he wasn’t close to max context, quality degraded as he injected larger specs. Some agents “compact” conversation history into summaries, but critical instructions often get dropped. The upshot: ad-hoc prompting scales poorly because it lacks persistent context, repeatable guardrails, and a way to enforce consistency across tasks.

Just as importantly, vibe coding ignores the operational guardrails engineering teams rely on. Security practices, CI/CD constraints, staging environments, and language/framework standards all get lost in the shuffle. Alex’s insight was to import the rigor of human processes—PRDs, Scrum discipline, acceptance criteria—into the AI collaboration model. With the right specs in the right shape, agents become much more reliable teammates.

Atomic Markdown Tasks: The Spec Format That LLMs (and Teams) Can Execute

Alex’s first attempt at structure was a giant Markdown document that captured everything: feature specs, security requirements, CI/CD expectations, language decisions (C#, TypeScript, Java), and more. It was comprehensive—but not usable. Large monolithic context proved brittle. Summarization/compaction dropped key rules, rollback was painful, and the agent’s performance was inconsistent.

The breakthrough was to split the monolith into atomic Markdown tasks, each mirroring a Jira/Linear ticket: a clear title, a short description that explains the “why,” and acceptance criteria that are testable and measurable. Dependencies between tasks ensure the agent doesn’t start work until prerequisites are met. This format provides a minimum viable context for the agent to do high-quality work while keeping specs legible for humans.
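
To make the shape concrete, here is a sketch of what such an atomic task can look like as a Markdown file. The field names, IDs, and feature details are illustrative, not Backlog.md's exact schema; the real files are generated and managed by the tool.

    ---
    id: task-123
    title: Add rate limiting to the login endpoint
    status: To Do
    labels: [security, backend]
    dependencies: [task-122]
    ---

    ## Description

    Protect the login endpoint against brute-force attempts (the "why").

    ## Acceptance Criteria

    - [ ] More than 5 failed logins from one IP within a minute returns HTTP 429
    - [ ] Limits are configurable without a code change
    - [ ] An integration test covers both the throttled and the normal path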

Two more ideas make the loop robust. First, an “Implementation Plan” drafted by the agent before coding creates an explicit, reviewable approach. This forces alignment and surfaces risks early. Second, “Implementation Notes” capture what actually happened—permanent context the team and future agents can rely on. Together, these elements create a feedback-safe, auditable trail. If a change needs rolling back, you revert a single task and its notes—no more all-or-nothing spec reversions.
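
Continuing the sketch above, the same task file accumulates these two sections over its lifetime; the wording below is invented for illustration, but the section names are the ones Alex describes.

    ## Implementation Plan

    1. Add a token-bucket limiter to the auth middleware
    2. Make the limits configurable via environment variables
    3. Extend the integration tests for throttled and normal requests

    ## Implementation Notes

    Reused the existing middleware chain; defaults to 5 attempts per minute,
    overridable via environment variables. No schema changes were needed.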

Backlog.md: CLI-First, Git-Native, and Agent-Friendly by Design

To streamline this workflow, Alex built Backlog.md, a developer-first backlog you manage entirely from your terminal. Install it globally via bun, npm, or Homebrew, and it’s instantly available in any repo. The CLI guides you with a command palette: create tasks, list them, open a Kanban board, launch a web interface, or get an overview of progress.
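
For orientation, the install-and-go flow looks roughly like this. Package and command names should be checked against the project README and backlog --help, since they may differ from this sketch.

    # install once, globally (pick one)
    npm i -g backlog.md
    # or: bun add -g backlog.md
    # or: brew install backlog-md

    # then, inside any Git repository
    backlog init                              # set up the backlog structure
    backlog task create "Add rate limiting"   # create a task from the terminal
    backlog task list                         # list tasks and their status
    backlog board                             # Kanban board in the terminal
    backlog browser                           # optional web interface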

The board is configurable (default: To Do, In Progress, Done). Hit enter to drill into a task and you’ll see the full spec: ID, title, metadata, dependencies, description, acceptance criteria, and sections for Implementation Plan and Implementation Notes. You can label tasks (e.g., “security,” “CICD,” “frontend”), filter by status or priority, and manage everything without leaving the terminal. A web UI is available for those who prefer a visual interface; terminal drag-and-drop (think Shift+Arrow to move tasks between columns) is on the roadmap.

Backlog.md is Git-native. Tasks live in your repository and sync across branches. If you pick up Task #200, assign it to yourself, and set it to In Progress on your feature branch, teammates see that update on the main branch as soon as you push. The tool reconciles task state based on last-updated timestamps, keeping the “source of truth” simple and distributed. Crucially, Backlog.md also offers a plain mode—backlog task --plain—that outputs clean, agent-friendly text. This is the view you feed to your LLM so the spec is unambiguous and free of visual noise.
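
In practice the plain mode is typically invoked with a task ID, so for the Task #200 example above the two views would look something like the following; the agent command at the end is a placeholder for whichever coding agent you use.

    # plain, agent-friendly text: the version you hand to the LLM
    backlog task 200 --plain

    # e.g. pipe it straight to an agent (agent command is hypothetical)
    backlog task 200 --plain | my-coding-agent "Implement this task and satisfy every acceptance criterion"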

The result is a tight loop: specs and code live side by side, status changes are versioned, and agents consume the same canonical task text that humans review. Alex even demoed tasks completed by an agent, with acceptance criteria automatically ticked and notes captured—proof that the model and the workflow can meet in the middle.

A Practical AI Dev Loop: Plans, Guardrails, and Easy Rollbacks

What emerges is a repeatable, low-friction workflow any AI-native team can adopt (a condensed command-line sketch follows the list):

  • Start with a small, atomic task. Define the why (description), the what (acceptance criteria), and the constraints (language choice, security posture, CI/CD requirements, staging gates). Add labels and set dependencies so the agent can’t start prematurely.
  • Ask the agent for an Implementation Plan before coding. Review and refine it. This is where you catch risky changes to auth flows, performance assumptions, or schema migrations.
  • Let the agent implement against the acceptance criteria. Because criteria are testable and measurable, they double as your validation checklist and can hook into automated tests or smoke checks in CI.
  • Commit task updates and push to a feature branch. The board reflects real-time state. Use staging to validate security and operational constraints—guardrails that vibe coding often forgets.
  • Capture Implementation Notes. This permanent context prevents the “Groundhog Day” effect where instructions are repeatedly reintroduced. If something goes wrong, roll back at the task level without losing unrelated work.
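
Condensed into commands, one pass through that loop might look like the following. The flags are indicative of the kind of options the CLI exposes rather than an exact reference, so check backlog --help before copying.

    # 1. Define an atomic task: the why, the what, and the constraints
    backlog task create "Add rate limiting to the login endpoint" \
      -d "Protect login against brute force" \
      --ac "Returns 429 after 5 failed attempts per minute" \
      -l security

    # 2. Hand the plain spec to the agent, ask for an Implementation Plan first,
    #    review it, then let the agent implement against the acceptance criteria
    backlog task 201 --plain

    # 3. Track state and capture notes, then version everything with the code
    backlog task edit 201 -s "In Progress"
    backlog task edit 201 --notes "Token-bucket limiter in auth middleware; limits via env vars"
    git add backlog/ && git commit -m "task-201: rate limiting" && git push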

Alex cautions against “kitchen sink context.” Feeding everything to the model reduces reliability; summarization and compaction can silently drop the very rules you care about. Instead, give the agent just enough well-structured context per task. Combine that with a plan-review step, explicit dependencies, and notes, and your AI code contributions become both predictable and auditable.

Looking ahead, Backlog.md will continue to refine the UX (e.g., terminal drag-and-drop) while preserving its agent-friendly primitives. The philosophy remains the same: keep specs small, precise, and close to the code—and make it trivial for both humans and LLMs to execute them.

Key Takeaways

  • Don’t rely on vibe coding. Ad-hoc prompting leads to repeated mistakes, lost constraints, and risky changes landing in production.
  • Use atomic Markdown tasks with acceptance criteria. Keep specs small, testable, and measurable; avoid monolithic context dumps.
  • Capture dependencies and guardrails. Block tasks until prerequisites are met and include security and CI/CD requirements in the spec.
  • Add an Implementation Plan step. Have the agent propose a plan, review it, and only then proceed—this prevents avoidable rework.
  • Keep permanent context with Implementation Notes. Persist what changed and why, so future agents and teammates don’t repeat past errors.
  • Make specs agent-friendly. Use Backlog.md’s --plain output when feeding tasks to LLMs to avoid formatting ambiguity.
  • Keep tasks and state in Git. Backlog.md syncs across branches, making status updates and rollbacks versioned and collaborative.
  • Install once, use anywhere. Backlog.md via bun/npm/Homebrew gives a fast, CLI-first workflow with an optional web UI for visibility.
  • Optimize context, don’t maximize it. Big summaries and compaction can drop critical rules; minimal viable context per task is more reliable.
  • Treat AI like a teammate in your process. Bring your team’s agile discipline—PRDs, staging, reviews—into your AI development loop.
