Brokk: Compiler Grade, Massive Repo Aware Context
12 May 2025 • Dion Almaer
Jonathan Ellis is a world-class engineer. I saw this firsthand when he worked with Ben Galbraith in a company they ran many moons ago. I then got to see him apply himself to the creation of Cassandra, the NoSQL distributed database that truly did work at web scale.
He co-founded DataStax on the back of this success, and they were off to the races.
I had the fortune of running into Jonathan in Hampstead, and he shared an early version of this Swing desktop app… Brokk.
Given the large codebases that Jonathan was trying to use AI with, it is natural that he started to build Brokk after running into issues with LLMs getting and using the right context from said codebases.

I was somewhat fascinated that, in a world of Python- and JavaScript-based tools, and given Jonathan’s history with Python, Brokk is a Java-based tool. The main reason appears to be the desire for the code intelligence engine to go beyond ASTs with Tree-sitter and instead offer full type inference. The library that could actually deliver on this was Joern… and thus, the JVM constraint it is.
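To make the "beyond ASTs" point concrete, here is a minimal sketch of what a syntax tree alone can and cannot tell you. It uses Python's standard `ast` module purely as a stand-in for a syntax-only parser; it is not Brokk's, Joern's, or Tree-sitter's API, and the `Invoice`, `Draft`, and `load` names are made up for illustration.

```python
import ast

# Hypothetical snippet: two classes define save(), and the receiver's
# type depends on what load() returns.
SOURCE = """
class Invoice:
    def save(self): ...

class Draft:
    def save(self): ...

def load(kind):
    return Invoice() if kind == "invoice" else Draft()

item = load("invoice")
item.save()
"""


def method_calls(source):
    """Return (receiver_name, method_name) for each `name.method()` call."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
        ):
            calls.append((node.func.value.id, node.func.attr))
    return calls


def classes_defining(source, method):
    """Return the sorted names of classes whose body defines `method`."""
    return sorted(
        cls.name
        for cls in ast.walk(ast.parse(source))
        if isinstance(cls, ast.ClassDef)
        and any(
            isinstance(item, ast.FunctionDef) and item.name == method
            for item in cls.body
        )
    )


if __name__ == "__main__":
    # The tree records the receiver's name and the method's name, nothing more.
    print(method_calls(SOURCE))
    # Both classes are syntactically valid targets for `item.save()`.
    print(classes_defining(SOURCE, "save"))
```

The tree knows that something called `item` has `save` called on it, and that two classes define a `save`. Deciding which one is actually invoked requires reasoning about what `load` returns, which is the kind of question type inference answers and a bare syntax tree does not.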
Giving Brokk a Spin
I opened up a repo, parade, which is a CLI to interface with model providers to validate and search for them. Unfortunately, it isn’t massive, nor is it Java, so it doesn’t fully show off the power of Brokk, but I was able to follow my pattern of planning and then acting on a plan via the Architect and Code systems.

It’s early days for Brokk, but I am excited it’s out in the open, and thanks to it being open source, we can easily follow the project as it learns new things.