
Macey Baker
7 min read · 24 Oct 2025
As Tessl's Founding Community Engineer, Macey Baker is busy helping to build the future of AI-native development (and helping you build it too). Now an AI code generation obsessive, she got her start as an early employee at tech unicorn Intercom in San Francisco, before jumping the pond and roaming the London startup scene, working primarily in big data / ML houses. Some of her best friends are LLMs.
When chatting with a coding agent, you’re engaging in an important iterative process. You’re working out what your needs really are. You’re navigating trade-offs, workarounds and new information together. But you and your agent speak two different languages.
At the end of your conversation, only one of those sticks around. Implementation lives on, and intent disappears into the void. We’ve gained code, but we’ve lost context. Some agents even put your conversation history in tmp/, stating definitively that your input is disposable, and implementation code is the truth.
The trouble is that, on its face, implementation code can tell you what it does, but it often can’t tell you why it’s there in that precise form: why this retry limit, why this ordering, why this seemingly redundant check?
There are answers to these questions… somewhere. But without context, you’re guessing at intent, and guessing is a lossy process. When agents (or people) guess intent, they inevitably get some details wrong here and there, and the system drifts over time.
This isn’t a new problem, by any means. Intent has always been captured in ephemeral mediums: meetings, passing conversations, code comments and other forms of documentation (which, as we know, are always perfectly maintained). But now, we have all of these sources plus agent conversations, in which only one party (you) has a window into your code’s “why” — that is, as long as you can remember it.
What’s more, the consumers of your codebase are evolving. Other developers will probably see your code, but so may coding agents, review bots and other LLM-powered tools reading it without you in the room.
For the benefit of all of these actors, it’s time to start incorporating natural language into your codebases, effectively designing narrative arcs for the consumer. The emergence of several spec-driven toolkits and frameworks, like Tessl, SpecKit and Kiro, concedes this point.

Approach 1: Annotation
One perfectly defensible way to start is by annotating code, if you aren’t already. Take a hypothetical example: a notes app. You can imagine a storage module in which a top-level annotation, like the one below, condenses several files worth of context into a token-efficient blurb. A human, or agent, would need to traverse lots of lines of code to get the exact same information.
"""
The `NotesStorage` class is the database layer for the Notes app. It is bad
practice to access tables directly; always call into the functions defined
below.
Main functions:
- `save_note`: creates or updates a note. It checks permissions first, ignores
stale updates (idempotency), and then writes the record.
Every save also emits a `note_saved` event.
- `get_note`: fetches a note by ID.
- `delete_note`: removes a note and emits a `note_deleted` event.
The main caller is SyncService: it runs `save_note` when devices sync changes.
Note that `Permissions` is always checked before writes.
Events emitted here are picked up by SearchIndex (to keep results fresh) and
Notifications (to push updates to users).
"""

Approach 2: Using formal specifications

Another strategy is to use formal specifications. What exactly makes up a “formal spec” is down to taste, but it should contain at minimum: the implementation files it describes, the modules it depends on, and a natural-language description of expected behaviour.
Let’s look at the same storage module again. A strong spec might include some yaml frontmatter, followed by a description of the module’s behaviour:
---
module: NotesStorage
describes_files:
- storage.py
dependencies:
- permissions.py
- notifications.py
- events.py
- search_index.py
---
Persistence hub for `notes` objects: writes/reads, permission gate,
idempotency, and events for downstream services.
Responsibilities:
- Save or update a note
  - Check `Permissions.can_write`.
  - Ignore stale updates using `last_modified` (idempotency).
  - Emit `note_saved` with `{note_id, user_id, last_modified}`.
- Delete a note
  - Check permissions, remove if present, emit `note_deleted`.
- Retrieve a note
  - Fetch by `note_id`; return `None` if missing.

You could also imagine the same information conveyed in a less structured, almost user-story flavoured way:
# Notes Storage
[storage.py](../impl/storage.py)
Persistence hub for notes: writes/reads, permission gate, idempotency, and
events for downstream services.
It saves or updates notes, verifying permissions and skipping stale updates.
[test_save_note.py](../tests/test_save_note.py)
It deletes notes, verifying permissions, removing records if present and emitting `note_deleted` events.
[test_delete_note.py](../tests/test_delete_note.py)
It retrieves notes, fetching by ID and returning either the record or `None` (if not found).
[test_find_note.py](../tests/test_find_note.py)
It depends on permissions, notifications, events and search_index services.

However you structure an individual specification, the magic of the “narrative arc” comes when specs become the main entry point for codebase traversal, for both humans and agents:
/specs/
  main.spec.md          # overarching program spec
  storage.spec.md       # bulk of storage logic
  sync.spec.md          # sync which conditionally calls storage
  permissions.spec.md   # permissions which storage checks
  events.spec.md        # events which storage fires
  search_index.spec.md  # search_index updated by sync
  notifications.spec.md # notifications triggered by events
/src/
/tests/

The path to understanding the code is clear just from the file structure. Starting from main.spec.md, the consumer can navigate the codebase simply, and in natural language. This way, you’re actually starting from intent, rather than trying to infer it from implementation details.
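To make this concrete, here is a minimal, hypothetical sketch (Python, stdlib only) of how a tool or agent might use the frontmatter from Approach 2 to navigate from specs to implementation files. The flat `key:` / `- item` frontmatter shape matches the example above; a real toolchain would use a proper YAML parser, and the function names here are illustrative.

```python
from pathlib import Path


def parse_frontmatter(spec_text: str) -> dict:
    """Parse the flat `key:` / `- item` frontmatter shown above.

    Deliberately minimal: handles scalar values and simple lists only.
    """
    lines = spec_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict = {}
    key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of frontmatter
            break
        if stripped.startswith("- ") and isinstance(meta.get(key), list):
            meta[key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key = key.strip()
            # an empty value opens a list (e.g. `describes_files:`)
            meta[key] = value.strip() or []
    return meta


def spec_map(spec_dir: Path) -> dict:
    """Map each spec file to the implementation files it describes."""
    return {
        spec.name: parse_frontmatter(spec.read_text()).get("describes_files", [])
        for spec in sorted(spec_dir.glob("*.spec.md"))
    }
```

An agent asked about sync behaviour could then open sync.spec.md first, and only pull in the implementation files its frontmatter lists when it actually needs them.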
What about legacy code?
What about existing projects and legacy code? Most engineers work on codebases that have existed for years, with many contributors, and that have already suffered some degree of intent loss that can’t be reversed.
But we can prevent further loss from this point. Doing so involves aggregating the various sources of intent into meaningful annotations or specs. Senior engineers, if they’re still around, can provide golden context for these artefacts. Otherwise, consider where else intent is captured in your organisation: PR conversations, commit histories, Slack threads, Notion or Drive docs, user-facing documentation, incident histories, et cetera. This admittedly tedious task will pay off over time, speeding up development and even onboarding.
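Commit history alone can surface a surprising amount of “why”. A small, hypothetical Python sketch (assuming `git` is on the path and the file lives in a git repository) that gathers recent commit subjects touching a file, as raw material for writing its annotation or spec:

```python
import subprocess


def commit_messages(path: str, repo: str = ".", limit: int = 50) -> list:
    """Collect recent commit subjects that touched `path` -- one cheap,
    ever-present source of intent for legacy code."""
    result = subprocess.run(
        ["git", "log", f"--max-count={limit}", "--format=%s", "--", path],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

Run over something like storage.py, this might surface subjects such as “skip stale updates on device sync”, a direct hint at the idempotency requirement that a spec would make explicit.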
Prepare yourself for a painful cliché: we have entered a new era of software development, and it’s only just beginning. LLM-era developers can distinguish themselves and their codebases by incorporating natural language, preserving original intent and future-proofing their work.