cargo-semver-checks ends 2022 with 40,000 downloads from crates.io, able to prevent 30 different kinds of semver issues, and having done so in real-world use cases. Inspired by Yoshua Wuyts' "Rust in 2023 (by Yosh)" post, here are my thoughts on cargo-semver-checks in 2022, and what I look forward to in 2023 and beyond.

Following semver in Rust is a perfect example of a workflow worth automating:

Some might say the solution is to "git gud". I deeply respect operational excellence, but this is not the way.

Civilization advances at the rate at which we develop robust abstractions. I am writing this on a computer I cannot build, under a blanket I cannot weave, having enjoyed a meal with ingredients I cannot grow. I dedicated ten years to math competitions (and to developing test-taking strategies aimed at getting a perfect score given limited time!), and I can't even calculate a logarithm by hand! Can you? If my life depended on it, I'd use the Newton-Raphson method to approximate my way to it, but there's zero chance that's actually the best way. My friends with aero-astro engineering degrees still find it hilarious that I once used binary search to calculate orbital maneuvers for Kerbal Space Program, instead of the closed-form formula that apparently existed 😅

Gatekeeping to only include people with a PhD in "Semver in Rust" won't cut it.

Yosh Wuyts quotes another Rust contributor as saying: "The job of an expert is to learn everything about a field there is to learn, and then distill it so that others don't have to." (I'll gladly put their name here if the quote is confirmed as coming from them. I wasn't present when this was said, and didn't want to risk misattributing.) I couldn't agree more!

2022: Rust + semver - tedium = 💖

cargo-semver-checks was born in mid-July 2022, when I realized that building a semver linter boils down to only two things:

At a high level, that's all cargo-semver-checks is: a checklist, and a for-loop over it.
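To make the "checklist plus for-loop" idea concrete, here is a minimal sketch in plain Rust. It is an illustration under loud assumptions, not how cargo-semver-checks is implemented: the `ApiSnapshot` struct is a hypothetical stand-in for rustdoc JSON, the two lints and their names are made up for the example, and each lint is an ordinary function rather than a query.

```rust
/// Hypothetical, heavily simplified view of a crate's public API.
/// The real tool inspects rustdoc JSON; this struct only exists for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct ApiSnapshot {
    pub public_functions: Vec<String>,
    pub public_structs: Vec<String>,
}

/// One checklist entry: a name, plus a function that lists offending items.
pub struct Lint {
    pub name: &'static str,
    pub check: fn(&ApiSnapshot, &ApiSnapshot) -> Vec<String>,
}

/// Items present in the baseline but absent from the current version.
fn removed(old: &[String], new: &[String]) -> Vec<String> {
    old.iter().filter(|item| !new.contains(item)).cloned().collect()
}

fn function_missing(baseline: &ApiSnapshot, current: &ApiSnapshot) -> Vec<String> {
    removed(&baseline.public_functions, &current.public_functions)
}

fn struct_missing(baseline: &ApiSnapshot, current: &ApiSnapshot) -> Vec<String> {
    removed(&baseline.public_structs, &current.public_structs)
}

/// The checklist. Lint names here are illustrative only.
pub fn checklist() -> Vec<Lint> {
    vec![
        Lint { name: "function_missing", check: function_missing },
        Lint { name: "struct_missing", check: struct_missing },
    ]
}

/// The for-loop: run every lint, collect (lint name, offending item) pairs.
pub fn run_checks(
    baseline: &ApiSnapshot,
    current: &ApiSnapshot,
) -> Vec<(&'static str, String)> {
    let mut findings = Vec::new();
    for lint in checklist() {
        for item in (lint.check)(baseline, current) {
            findings.push((lint.name, item));
        }
    }
    findings
}
```

Calling `run_checks` with a baseline that exports `parse` and `render` and a current version that only exports `parse` would report `render` under the `function_missing` entry. Adding a new check means adding one more row to the checklist; the loop never changes.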

As is usually the case:

The novel trick in cargo-semver-checks is that lint rules are written declaratively.

Given the need to have hundreds of different lints defined over an ever-changing data format, this is a huge win. (The rustdoc JSON format is unstable and frequently has breaking changes, sometimes even multiple times per week in nightly Rust.)

But creating a good declarative query language is a much harder problem than semver! Generally one shouldn't replace an easier problem with a harder one. This is why linters rarely build their own query language.

Fortunately, I spent the last 7+ years of my career working on high-performance query languages for heterogeneous data, so I didn't need to start from scratch. Instead, I just plugged in my existing Trustfall query engine, which is able to query any data source(s), whether they are local files, remote APIs, or a terabyte-scale SQL cluster. Ever wonder which lints popular crates like itertools allow in their code? Or maybe you're curious which GitHub or Twitter users comment on HackerNews stories about OpenAI? The answers are one browser-executed query away!

Thanks to Trustfall, each cargo-semver-checks lint is a type-checked structured query in Trustfall's GraphQL-like syntax. (More on this in future blog posts!) In practice, this means:

All this allowed us to go from zero to 30 different semver lints in just five months.

We are ending 2022 on a particularly high note: four students have begun contributing to cargo-semver-checks as part of their bachelor's theses! The pace of development has sped up dramatically thanks to their hard work, and the codebase is healthier than ever.

Looking ahead to 2023

At RustConf 2022 I had the pleasure of meeting several cargo team members, and we decided that the end goal for cargo-semver-checks is merging into cargo itself.

Another goal for cargo-semver-checks is adding even more lints to prevent more kinds of semver violations.

These goals are self-explanatory, and I won't dig into them further. Instead, I'll mention three of my personal favorite things I'd like to see in cargo-semver-checks in 2023.

Proactively discover and prevent false-positives

A false-positive error in cargo-semver-checks is when the tool incorrectly claims it found a semver violation. I consider false-positives extremely serious bugs because they give the user incorrect advice, confusing them and slowing them down while also hurting the credibility of cargo-semver-checks itself. They are much more serious than false-negatives! A false-negative means there was a semver violation but the tool didn't find it. There are dozens of ways to break semver that cargo-semver-checks can't yet detect, each of which is a false-negative.

Unfortunately, in 2022 our users reported multiple false-positive errors. I am grateful to everyone who spent their precious time helping debug problems that shouldn't have happened in the first place.

We have already begun strengthening the cargo-semver-checks test systems to discover and prevent future false-positives, so our users won't have to. In the process, we already discovered and fixed three previously-unknown false-positives.

In 2023, we plan to take a page from Rust's book: testing cargo-semver-checks on the most popular crates on crates.io as part of our release process. This would have a dual benefit: in addition to proactively discovering false-positives, it would also ensure cargo-semver-checks is ready to be adopted by those crates at their maintainers' convenience. And if we happen to discover more semver issues in the wild, that'll be a nice bonus!

Faster semver-checking via rustdoc caching

A cargo-semver-checks run consists of two steps: generating rustdoc JSON, and running lints over the generated JSON files.

The "run the lints" step is much faster than the process of generating the rustdoc, which can take a few minutes in CI environments with low core counts like GitHub Actions. This holds even though we've put in negligible effort at optimizing the lints beyond what Trustfall provides out of the box.

In 2023, we'll implement rustdoc caching to limit how often the rustdoc has to be rebuilt.

We expect to cut rustdoc generation time in half: we'll still have to generate the current version's rustdoc, but we can avoid repeatedly rebuilding rustdoc for crate versions that are already published on crates.io.
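As a rough sketch of what such caching could look like: the code below is only an illustration, not the planned design. `CacheKey`, `RustdocCache`, and `get_or_build` are hypothetical names, the cache lives in memory instead of on disk, and the "rustdoc" is just a string. Including a format version in the key is an assumption on my part, motivated by how often the rustdoc JSON format changes.

```rust
use std::collections::HashMap;

/// Hypothetical cache key. A published crate version never changes,
/// so name + version identify the source; the format version guards
/// against reusing JSON written in an older rustdoc JSON format.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CacheKey {
    pub crate_name: String,
    pub version: String,
    pub format_version: u32,
}

/// In-memory stand-in for an on-disk cache of generated rustdoc JSON.
#[derive(Default)]
pub struct RustdocCache {
    entries: HashMap<CacheKey, String>,
}

impl RustdocCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return the cached rustdoc for `key`, running the (expensive)
    /// `build` closure only when there is no entry yet.
    pub fn get_or_build<F>(&mut self, key: CacheKey, build: F) -> &str
    where
        F: FnOnce() -> String,
    {
        self.entries.entry(key).or_insert_with(build).as_str()
    }
}
```

With a cache like this, the baseline (already-published) version is built at most once per format version, while the current, unpublished version is deliberately never cached and gets rebuilt on every run.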

Semver-check PRs, not just cargo publish

Currently, cargo-semver-checks is most ergonomic when used right before cargo publish: it checks whether the publish step with the specified version would result in a semver-compliant release. (If the version in Cargo.toml is already on crates.io, it assumes a patch version bump.)
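The version comparison behind that check can be sketched roughly as follows. This is a simplified illustration and not the tool's actual code: `Version`, `Bump`, `parse_version`, `actual_bump`, and `is_compliant` are hypothetical names, pre-release and build metadata are not parsed, the current version is assumed to be no older than the baseline, and pre-1.0 versions are deliberately left out because they follow different conventions.

```rust
/// Kinds of version bump, ordered from least to most permissive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bump {
    Patch,
    Minor,
    Major,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parse a plain "MAJOR.MINOR.PATCH" string; anything else yields None.
pub fn parse_version(text: &str) -> Option<Version> {
    let mut parts = text.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Which bump does going from `baseline` to `current` represent?
/// Identical versions are treated as a patch bump, mirroring the
/// assumption described in the text. Returns None for pre-1.0
/// baselines, which this sketch does not handle.
pub fn actual_bump(baseline: Version, current: Version) -> Option<Bump> {
    if baseline.major == 0 {
        return None;
    }
    if current.major != baseline.major {
        Some(Bump::Major)
    } else if current.minor != baseline.minor {
        Some(Bump::Minor)
    } else {
        Some(Bump::Patch)
    }
}

/// A release is compliant when the actual bump is at least as large
/// as the bump required by the API changes the lints detected.
pub fn is_compliant(required: Bump, actual: Bump) -> bool {
    actual >= required
}
```

For example, if the lints find a breaking change (so a major bump is required) but Cargo.toml only goes from 1.2.3 to 1.3.0, `is_compliant(Bump::Major, Bump::Minor)` is false and the release should be flagged.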

But wouldn't it be nice to know about breaking changes in a pull request before merging it and committing to a major version bump? Multiple projects have already begun running cargo-semver-checks like this, generally via custom scripts they've adapted specifically for that purpose.

In 2023, I hope we're able to make this an officially-supported mode of operation, complete with a GitHub Action. Bonus points if the Action reports semver issues as inline PR comments using the lints' span information!

Onwards!

I'm thrilled and humbled by the response that cargo-semver-checks has received in the Rust community. I've never been more excited about building the future with Rust, and I'm excited to see what 2023 has in store for cargo-semver-checks and the Rust ecosystem as a whole.