
The Missing Gap In Workflows For AI Devs
In this episode
Baruch Sadogursky, Head of Developer Relations at TuxCare, joins Simon Maple to explore why automated integrity needs to be built in before we rely on AI outputs.
On the docket:
• the difference between specs and tests
• why PMs sidelined specs
• the "intent-integrity gap" between human goals and LLM outputs
• the non-determinism of LLMs as a feature, not a flaw
• Baruch’s belief: devs are not going anywhere
Introduction: Revisiting Trust in Code at AI-Fokus
Simon and Baruch reconnect at the AI-Fokus conference to explore how LLMs, specifications, and modern development practices are reshaping software engineering. They reflect on their shared history and shift the conversation toward the integrity of AI-generated code.
The Core Issue with AI-Generated Code
Developers often don't trust AI-generated code, and they tend to avoid reviewing code they didn't write, especially when a machine produced it. The problem mirrors long-standing weaknesses in human code review and points to the need for better accountability mechanisms.
Tests as Guardrails and Specs as the Foundation
Baruch proposes that software quality should be driven by well-defined tests. If code passes trustworthy tests, it can be accepted without manual inspection. However, for this to be effective, those tests must be generated from a clear and agreed-upon specification. The specification becomes the authoritative source of truth, accessible to both technical and non-technical stakeholders.
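As a hypothetical illustration of that idea (not an example from the episode): a single spec clause such as "a withdrawal must never exceed the balance" can be compiled into deterministic tests, and those tests, rather than the generated code, become the artifact stakeholders agree to trust.

```python
import pytest

# Stand-in for LLM-generated code; in the model described here, this class
# could be regenerated freely as long as the tests below keep passing.
class InsufficientFunds(Exception):
    pass

class Account:
    def __init__(self, balance: int):
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise InsufficientFunds(amount)
        self.balance -= amount

# Tests derived from the spec clause "a withdrawal must never exceed the
# balance" -- the guardrail reviewers actually sign off on.
def test_withdrawal_cannot_exceed_balance():
    account = Account(balance=100)
    with pytest.raises(InsufficientFunds):
        account.withdraw(150)

def test_withdrawal_reduces_balance():
    account = Account(balance=100)
    account.withdraw(40)
    assert account.balance == 60
```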
The Promise and Limitations of BDD and Gherkin
Behavior-Driven Development (BDD), supported by Gherkin syntax, attempted to make specifications readable and writable by all stakeholders. While human-readable, Gherkin proved too rigid for product managers and non-technical users, limiting adoption. Additionally, the disconnect between specifications and implementation caused them to become outdated and unmaintained.
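For readers who haven't used it, this is roughly what the Gherkin workflow looks like, sketched here with a hypothetical "checkout" feature and the pytest-bdd library. The prose scenario is readable by anyone, but every step still needs hand-written glue code, which is where the rigidity and maintenance burden come from.

```python
# Contents of checkout.feature (the human-readable Gherkin spec), saved
# next to this test file:
#
#   Feature: Checkout
#     Scenario: Applying a discount code
#       Given a cart containing 2 items at $10 each
#       When the customer applies the code "SAVE10"
#       Then the total should be $18

from pytest_bdd import given, parsers, scenario, then, when

@scenario("checkout.feature", "Applying a discount code")
def test_discount():
    pass

# Each Gherkin step below must match a step definition exactly -- the
# glue code non-technical stakeholders never see, and developers must keep
# in sync with the prose.
@given(parsers.parse("a cart containing {n:d} items at ${price:d} each"),
       target_fixture="cart")
def cart(n, price):
    return {"items": [price] * n, "discount": 0.0}

@when(parsers.parse('the customer applies the code "{code}"'))
def apply_code(cart, code):
    cart["discount"] = 0.10 if code == "SAVE10" else 0.0

@then(parsers.parse("the total should be ${total:d}"))
def check_total(cart, total):
    assert sum(cart["items"]) * (1 - cart["discount"]) == total
```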
The Intent Integrity Chain
Baruch introduces the concept of an intent integrity chain, a structured process for aligning software with human intent:
1. Begin with a prompt or product definition.
2. Generate specifications using an LLM and review them with stakeholders.
3. Compile the specifications into deterministic tests (outside the LLM).
4. Use LLMs to generate code until it passes those tests.
5. Lock the tests to prevent tampering and ensure reliability.
In this model, code is treated as a disposable output, with integrity preserved through specifications and tests.
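To make the chain concrete, here is a compressed sketch of how such a pipeline could be wired together. Every name in it is hypothetical; llm_complete() stands in for any model call, and none of this is Tessl's or TuxCare's actual tooling.

```python
# Hypothetical end-to-end sketch of the intent integrity chain.
import hashlib
import pathlib
import subprocess
import tempfile

def llm_complete(prompt: str) -> str:
    raise NotImplementedError("plug in your model provider here")

def compile_spec_to_tests(spec: str) -> str:
    """Step 3: deterministic, rule-based spec->test compilation (no LLM),
    so the guardrails themselves are not subject to model drift."""
    raise NotImplementedError

def tests_pass(code: str, tests: str) -> bool:
    """Run the locked test suite against a generated candidate."""
    with tempfile.TemporaryDirectory() as workdir:
        pathlib.Path(workdir, "impl.py").write_text(code)
        pathlib.Path(workdir, "test_impl.py").write_text(tests)
        return subprocess.run(["pytest", workdir],
                              capture_output=True).returncode == 0

def build_chain(product_definition: str) -> str:
    # Steps 1-2: start from the product definition, draft a spec with the
    # LLM, and gate it on human stakeholder review.
    spec = llm_complete(f"Write a precise, testable spec for:\n{product_definition}")
    input(f"Review this spec with stakeholders, then press Enter:\n{spec}")

    tests = compile_spec_to_tests(spec)

    # Step 5: lock the tests by recording a digest, so tampering is detectable.
    lock = hashlib.sha256(tests.encode()).hexdigest()

    # Step 4: treat code as disposable and regenerate until the tests pass.
    for _ in range(10):
        code = llm_complete(f"Implement this spec in Python:\n{spec}")
        assert hashlib.sha256(tests.encode()).hexdigest() == lock, "tests were modified"
        if tests_pass(code, tests):
            return code
    raise RuntimeError("no candidate satisfied the spec-derived tests")
```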
Microservices and Iterative Regeneration
The conversation emphasizes the importance of modular architecture. With microservices, updates or new requirements can be addressed by regenerating individual components. The prompt and spec drive the process, allowing for scalable and maintainable evolution of the system.
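One way to picture that, as a hypothetical layout rather than anything described in the episode: keep one spec file per service, fingerprint each spec, and regenerate only the services whose specs have changed since the last build.

```python
# Hypothetical per-service regeneration loop. regenerate_service() is
# assumed to run an intent-integrity pipeline like the one sketched above.
import hashlib
import json
import pathlib

SPEC_DIR = pathlib.Path("specs")           # e.g. specs/payments.md, specs/cart.md
LOCKFILE = pathlib.Path("spec.lock.json")  # service name -> last built digest

def regenerate_service(name: str, spec: str) -> None:
    raise NotImplementedError

def regenerate_changed_services() -> None:
    seen = json.loads(LOCKFILE.read_text()) if LOCKFILE.exists() else {}
    for spec_file in SPEC_DIR.glob("*.md"):
        spec = spec_file.read_text()
        digest = hashlib.sha256(spec.encode()).hexdigest()
        if seen.get(spec_file.stem) != digest:  # spec changed: code is disposable
            regenerate_service(spec_file.stem, spec)
            seen[spec_file.stem] = digest
    LOCKFILE.write_text(json.dumps(seen, indent=2))
```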
Expanding the Role of Specification with Tessl
Tessl is positioned as a more capable successor to Gherkin. It allows for richer specifications that capture behavioral expectations, API interfaces, and non-functional concerns such as performance, security, and language preferences. This avoids the Gherkin-era problem of overloading specs with information they were never designed to carry.
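To make "richer specifications" concrete, here is one way to model the categories of information such a spec might carry, expressed as a plain Python dataclass. This illustrates the idea only; it is not Tessl's actual spec format.

```python
# Illustrative only -- NOT Tessl's actual spec format.
from dataclasses import dataclass, field

@dataclass
class ServiceSpec:
    name: str
    behaviors: list[str]                  # testable behavioral expectations
    api: dict[str, str]                   # endpoint -> request/response shape
    non_functional: dict[str, str] = field(default_factory=dict)

spec = ServiceSpec(
    name="checkout",
    behaviors=["an order total must never be negative"],
    api={"POST /checkout": "CartId -> Receipt"},
    non_functional={
        "performance": "p99 latency < 200ms",
        "security": "card data is never stored",
        "language": "Python 3.12",
    },
)
```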
Continuous Validation and Feedback Loops
The podcast highlights the value of feedback loops from production telemetry and quality metrics. These inputs can inform changes to specs, enabling continuous iteration while maintaining alignment with original intent.
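A minimal sketch of such a loop, with hypothetical thresholds and helpers: telemetry that violates a non-functional expectation doesn't patch the code directly; it proposes a spec change, which then re-enters the chain.

```python
# Hypothetical feedback loop: production telemetry feeds spec revisions,
# not direct code patches, so intent and implementation stay aligned.
def propose_spec_change(service: str, observation: str) -> None:
    raise NotImplementedError  # e.g. open a PR against specs/<service>.md

def check_telemetry(service: str, p99_latency_ms: float,
                    budget_ms: float = 200.0) -> None:
    if p99_latency_ms > budget_ms:
        propose_spec_change(
            service,
            f"p99 latency {p99_latency_ms}ms exceeds the {budget_ms}ms budget; "
            "tighten the implementation spec or revisit the budget",
        )

check_telemetry("checkout", p99_latency_ms=340.0)
```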
The Evolving Role of the Developer
In a spec-centric future, developers will not disappear but evolve. Some will focus on architecture and composability, while others will act as domain experts ensuring feasibility and guiding prompt formulation. Technical knowledge remains critical, particularly for non-obvious constraints and system-wide considerations.
Conclusion
The intent integrity chain, when implemented with advanced spec tooling like Tessl, provides a reliable structure for developing software with AI. It allows teams to scale LLM-based development while maintaining trust, aligning stakeholders, and ensuring that code reflects shared intent.