I had a lot of fun spending nights-and-weekends time participating in the HYTRADBOI Jam, a global hack week aimed at building "exciting and weird" data-centric solutions to familiar problems. The name HYTRADBOI might sound familiar: the jam is associated with the same conference where I gave my "How to Query (Almost) Everything" talk talk in April this year.

I jammed on two projects: one solo and one with a friend. The projects ultimately were very successful and mostly-successful, respectively.

I'll summarize the goals and progress on each project here. I'll save the deep technical details for a future post.

Semver queries over multiple rustdoc versions

My solo project was to add support for multiple rustdoc versions to cargo-semver-checks. That's the semver linter for Rust I built using Trustfall a few months ago.

Before the Jam, each cargo-semver-checks version could only support a single version of rustdoc. As an unstable Rust feature under active development, the rustdoc JSON format changes frequently: I counted 7 breaking changes in a ~two-month period.

Thus, users had to match their version of cargo-semver-checks to the nightly Rust compiler version they were using. Also, every time a new nightly included an updated rustdoc JSON format, I had to scramble to publish a corresponding update for cargo-semver-checks. All around, this was less than ideal.

My Jam project was to use Trustfall to decouple the semver lint queries from the changes in the rustdoc JSON format version. In broad strokes, JSON format changes don't affect the data model semver queries are interested in — Rust functions, structs, fields etc. — so the Trustfall adapter can absorb format changes and prevent them from leaking through to the queries. More details in a future post!

This project was very successful: cargo-semver-checks version 0.12.0 includes this functionality, and none of the semver queries needed any changes whatsoever. From this version onward, each new cargo-semver-checks version will support at least the most-recent stable and beta Rust versions. Version 0.12.0 simultaneously supports three rustdoc format versions:

Unlimited-depth recursion in Trustfall queries

The @recurse directive is one of Trustfall's key improvements over GraphQL. For example, it allows queries like "What comments and replies up to 3 levels deep have been posted on HackerNews' top stories?" This isn't the same as nesting the reply edge in that query three times: that would give us all replies that are exactly 3 levels deep in the comment tree, instead of all comments at levels 0-3 inclusive.

When deploying Trustfall over a dataset, engineers don't need to worry about the complexity of implementing recursion over the edges (relations) in their data model. The Trustfall interpreter handles the execution of @recurse directives transparently, and the underlying dataset adapter sees only a series of non-recursive edge-traversal requests.

This outward-facing elegance comes at the cost of significant internal complexity within Trustfall. In fact, the internals that make @recurse work are the most complex code in the entire project, and are already the product of many months of painstaking work.

But my friend Bojan and I love to push the limits of what can be achieved. We wanted to further improve @recurse in two ways:

Several prototypes and lots of collaboration later, we have mostly achieved our objective: we have a draft PR containing a directly-optimal @recurse implementation that can support unlimited-depth recursion.

Our traces show the new implementation is more than twice as fast at delivering a query's first result compared to the existing @recurse execution approach. As the recursion depth increases, the new implementation's performance advantage grows proportionally — the more-than-2x improvement is a lower bound.

The best part: this is all without making any changes to the adapter interface used to plug in data sources into Trustfall. All Trustfall users would benefit from this improvement without needing to change even one line of query or data adapter code.

This project was mostly successful. We had (rather ambitiously!) hoped to have the new implementation merged by the end of the Jam. However, Trustfall's thorough test suite revealed that our new code currently mishandles a particularly subtle edge case that can arise when recursing in particularly diabolical schemas.

We won't let one edge case deter us from our pursuit of optimal performance! But first, we need some rest after a week of thrilling jamming 😴

Thanks to Jamie Brandon for organizing the HYTRADBOI Jam! Participating was a lot of fun, and I am hoping there will be many more Jams to come.

This blog post includes joint work with Bojan Serafimov. However, any mistakes in this blog post are mine alone.