Semver violations are common, better tooling is the answer

September 07, 2023 semver rust

This post is coauthored by Tomasz Nowak and Predrag Gruevski. It describes work the two of us did together with Bartosz Smolarczyk, Michał Staniewski, and Mieszko Grodzicki.

Anecdotally, cargo-semver-checks is a helpful tool for preventing the semver violations that every so often cause ecosystem-wide pain. This is why it earned a spot in the CI pipelines of key Rust crates like tokio, and also why the cargo team hopes to integrate it into cargo itself.

While anecdotal evidence is nice, we wanted to get concrete data across a large sample of real-world Rust code. Inspired by Crater, a tool that builds a large number of public Rust crates and runs their test suites to check for Rust compiler regressions, we used cargo-semver-checks to lint the top 1000 most-downloaded library crates on crates.io. Our test setup has no connection to the Crater project. Between us, however, this work was affectionately known as semver-crater, a succinct and clear description of what we were doing. We hope the team working on Crater doesn't mind this way of paying homage to our source of inspiration.

The outcome was a goldmine of valuable data.

TL;DR + table of contents

Long story short: semver accidents are common. They happen even in the most carefully-developed projects run by the most experienced maintainers. The maintainers are not to blame, and improved tooling is our best way forward. cargo-semver-checks is part of that improved tooling story, since it found every semver violation we report here.

We only considered non-yanked releases published in 2017 or later. We only scanned minor and patch releases, since major version releases do not have any semver obligations. We also skipped 919 releases which we were not able to build with a modern version of Rust. That left us with 14389 total scanned releases.

Across more than 14000 releases of the top 1000 most-downloaded crates, on average:

Around 1 in 31 releases had at least one semver violation – we found one or more violations in 464 releases (3.22%). If you never use the affected functionality, it's possible that a semver-violating release does not break your project. This means that most projects will break due to a semver violation in less than 1-in-31 dependency upgrades on average. But with many projects having hundreds of dependencies, the odds are still concerningly high!
More than 1 in 6 crates violated semver at least once – 172 crates (17.2%) had at least one release with a semver violation!
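To make the arithmetic concrete, here is a rough sketch of the numbers above. It assumes, unrealistically, that every dependency upgrade independently carries the same 464-in-14389 chance of containing a violation. The result is therefore the chance of pulling in at least one violating release, which is an upper bound on the chance that a project actually breaks. The function names are ours, chosen for illustration.

```rust
/// Fraction of scanned releases that contained at least one semver violation.
pub fn violation_rate(violating_releases: u32, scanned_releases: u32) -> f64 {
    f64::from(violating_releases) / f64::from(scanned_releases)
}

/// Chance that at least one of `upgrades` dependency upgrades contains a
/// semver violation.
///
/// ASSUMPTION: upgrades are independent and all share the same per-release
/// rate. Real dependency trees are messier than this.
pub fn chance_of_any_violation(rate: f64, upgrades: u32) -> f64 {
    // Probability that every single upgrade is violation-free...
    let all_clean = (1.0 - rate).powi(upgrades as i32);
    // ...and the complement of that.
    1.0 - all_clean
}
```

Under these assumptions, `chance_of_any_violation(violation_rate(464, 14389), 100)` comes out to roughly 96%: a project that performs 100 dependency upgrades is very likely to pull in at least one semver-violating release along the way.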

The most common sources of semver violations were:

A change to an exhaustive type, like adding new variants to an existing enum, or adding new fields to an existing struct that previously contained only public fields.
A struct that has been removed, or a type which had one of its methods removed.
A type that stopped implementing one or more auto traits. A previous post has an example of how such semver violations may go unnoticed. While unexpected removals of Send or Sync can be frustrating, they are just the bearers of bad news. Making a type stop being thread-safe is a major breaking change in every language, and Rust merely chooses to point that out at compile-time instead of in production at 3am.
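As an illustration of the first category, here is a minimal sketch of a struct with only public fields. The library, the `Config` type, and the `timeout_secs` field are all hypothetical, and the library is modeled as a module so that the snippet is self-contained.

```rust
// Version 1.0.0 of a hypothetical library, modeled here as a module.
pub mod library_v1 {
    pub struct Config {
        pub host: String,
        pub port: u16,
        // Imagine version 1.1.0 adds `pub timeout_secs: u64` here.
        // Because every existing field is public, downstream code is allowed
        // to build `Config` with a struct literal. That literal would then
        // fail to compile with error[E0063]: missing field `timeout_secs`.
    }
}

// Downstream code that compiles against version 1.0.0.
pub fn downstream_config() -> library_v1::Config {
    library_v1::Config {
        host: String::from("localhost"),
        port: 8080,
    }
}
```

Marking the struct `#[non_exhaustive]` or keeping at least one field private from the start would have prevented downstream struct literals, making the later field addition non-breaking.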

We'll dig deeper into our findings in a bit. First, we have to discuss a key point: none of this is maintainers' fault.

This is a failure of tooling, not humans

Thanks to semver, cargo update can easily upgrade dependency versions to bring in performance upgrades, security fixes, and new functionality. These benefits are significant and should not be understated.

Unfortunately, the benefits come at a cost to maintainers. While many semver rules seem "obvious," there's also a long tail of complex rules with tricky edge cases. For example, editing the details of private types can sometimes result in a major breaking change in a public API elsewhere in the library — in more than one way. Spooky action at a distance!
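Here is a minimal sketch of one such case, using auto traits. All names are hypothetical, and the library is again modeled as a module. The public `Handle` type is `Send` only because its private `Inner` field happens to be `Send`.

```rust
// Version 1.0.0 of a hypothetical library, modeled here as a module.
pub mod library_v1 {
    use std::sync::Arc;

    // Private implementation detail, not part of the public API.
    struct Inner {
        // If a later minor release swapped `Arc` for `std::rc::Rc`, the
        // public `Handle` type below would silently stop being `Send`.
        shared: Arc<String>,
    }

    // Public type. It is `Send` and `Sync` only because `Inner` is.
    pub struct Handle {
        inner: Inner,
    }

    impl Handle {
        pub fn new(name: &str) -> Self {
            Handle {
                inner: Inner {
                    shared: Arc::new(name.to_string()),
                },
            }
        }

        pub fn name(&self) -> &str {
            &self.inner.shared
        }
    }
}

// Downstream code: moving `handle` into a new thread requires `Handle: Send`.
// With `Rc` inside `Inner`, this function would fail to compile with E0277,
// even though no public signature in the library changed.
pub fn name_from_thread(handle: library_v1::Handle) -> String {
    std::thread::spawn(move || handle.name().to_string())
        .join()
        .expect("thread panicked")
}
```

Nothing in the diff of such a release would touch a `pub` item, which is why this kind of change is so easy to miss in code review.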

Demanding perfection from maintainers would be naïve, unreasonable, and unfair. Whenever hardworking, conscientious, well-intentioned people make a mistake, the failure is not with the people but in the system.

Blaming human error would also be out of line with Rust's existing practices. After all, Rust adopted borrow-checking to address accidental and costly mistakes originating from another system of complex rules. The parallels to semver and cargo-semver-checks are clear: in both cases, we rely on automated systems to check the rules that are not amenable to manual checking by humans.

Analyses like this one are key to learning how we can do better. Our findings help us understand the needs of the ecosystem, contextualize our impact thus far, and determine how to best help Rustaceans going forward.

Detailed results & how we validated them

Automated linters can sometimes have false-positives, so we spent substantial effort on validating our results.

We discovered a total of 3062 verified semver violations across all scanned crate releases. Each of those was first reported by cargo-semver-checks and then validated by a combination of automated and manual means.

Detailed results

Here is a table showing all the different kinds of verified semver violations we discovered. We show which cargo-semver-checks lint caught each semver violation, and how many different releases and crates had that kind of violation.

lint name	individual items	different releases	affected crates
inherent method missing	791	41	27
enum variant added	382	138	60
constructible struct adds field	343	123	34
auto trait impl removed	318	57	45
struct missing	291	66	40
function missing	267	50	33
inherent method const removed	139	5	3
derive trait impl removed	115	11	11
enum variant missing	112	27	18
struct pub field missing	79	32	16
enum missing	78	26	20
trait missing	45	24	19
method parameter count changed	22	14	12
enum marked non-exhaustive	16	4	4
struct repr(C) removed	12	3	3
constructible struct adds private field	9	7	6
inherent method unsafe added	9	3	3
function parameter count changed	8	4	4
function unsafe added	8	2	2
unit struct changed kind	5	3	2
enum tuple variant field missing	4	2	2
tuple struct to plain struct	4	2	2
enum tuple variant field added	3	3	3
enum repr int removed	1	1	1
enum struct variant field added	1	1	1

As part of our validation process, we discarded approximately 10000 other instances where cargo-semver-checks reported an issue that was determined to be either erroneous (confirmed false-positive) or inconclusive (e.g., causing rustc to crash when attempting to use the affected release in a new crate).

Here are the major components of our validation process.

Automated validation via "witnesses"

For each reported semver violation, we created a witness – a code snippet that compiles on the older library version, but fails to compile on the newer version due to the semver-violating change. This is how we prove that code external to the library, such as code in a downstream use case, can be impacted by that semver issue.

For example, imagine a library with the following code:

pub enum Example {
    First,
    Second,

    // Imagine the following variant is added
    // in a minor version. This violates semver,
    // since `Example` is an exhaustive enum.
    Third,
}

The witness for this code would look like this:

use dependency::Example;

fn witness(value: Example) {
    match value {
        Example::First => {}
        Example::Second => {}
    }
}

This snippet compiles successfully with the original version, but is affected by the breaking change in the new version:

error[E0004]: non-exhaustive patterns: `Example::Third` not covered
  --> src/lib.rs:4:11
   |
4  |     match value {
   |           ^^^^^ pattern `Example::Third` not covered
   |

Handling the `#[doc(hidden)]` attribute

In the code above, what if the Example enum was marked #[doc(hidden)]? Items marked with this attribute don't appear in the documentation of the crate's public API, but are still accessible outside the crate. This can be useful, for example, in crates that expose macros: the macros' internal implementation details are usually not themselves a stable public API, even though they must be public for the macros to work. #[doc(hidden)] items therefore have reduced semver obligations: if our Example enum above was #[doc(hidden)], adding a new variant would not have violated semver. Interestingly, #[doc(hidden)] items still have some semver obligations.
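The macro use case can be sketched as follows. The `double!` macro and the `__private` module are hypothetical names, and the snippet assumes it sits at the crate root so that `$crate::__private` resolves.

```rust
// This module must be `pub` so that code expanded from the macro inside
// other crates can reach it. It is hidden from the docs because it is an
// implementation detail, not a stable API.
#[doc(hidden)]
pub mod __private {
    pub fn checked_double(value: u32) -> Option<u32> {
        value.checked_mul(2)
    }
}

/// The intended public API: doubles a `u32`, yielding `None` on overflow.
#[macro_export]
macro_rules! double {
    ($value:expr) => {
        $crate::__private::checked_double($value)
    };
}

// Example use of the macro, as downstream code would write it.
pub fn demo() -> Option<u32> {
    double!(21)
}
```

Renaming or removing `checked_double` in a minor release would be fine as long as `double!` is updated in the same release, since users are only expected to go through the macro.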

While we've done some work on correctly handling #[doc(hidden)], today's version of cargo-semver-checks still has a false-positive here. Thanks to this survey, we saw that #[doc(hidden)] is by far the most common source of false-positives in cargo-semver-checks. We are prioritizing shipping a fix here. A witness wouldn't detect this as a false-positive, either: it would also claim a violation.

We discarded over 6000 such false-positives! We used a combination of automated and manual triage, ensuring that flagged items are neither directly hidden nor indirectly hidden via #[doc(hidden)] on a containing module.

Our automated triage process relied on rustdoc's JSON output format. It detected hidden items by finding items that are emitted only when rustdoc is passed the nightly-only --document-hidden-items flag.
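The core idea is a set difference. The sketch below is not the actual triage script: it assumes the item paths from the two rustdoc runs have already been extracted into sets of strings, whereas real rustdoc JSON is much richer than that. The function name is hypothetical.

```rust
use std::collections::BTreeSet;

/// Returns the item paths that show up only when hidden items are documented.
///
/// ASSUMPTION: items are identified by plain path strings for illustration.
pub fn hidden_items<'a>(
    default_docs: &BTreeSet<&'a str>,
    with_hidden_docs: &BTreeSet<&'a str>,
) -> BTreeSet<&'a str> {
    // Anything present in the hidden-inclusive output but absent from the
    // default output must have been hidden, directly or via a parent module.
    with_hidden_docs.difference(default_docs).copied().collect()
}
```

Any violation reported on an item in the returned set would then be a candidate for discarding as a #[doc(hidden)] false-positive.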

We followed this up by manually inspecting the source code of any items that were not eliminated as hidden via automated means. This step protected our results against possible false-negatives caused by bugs in our automated script, in rustdoc or its JSON backend, or in the nightly-only rustdoc flag we used.

Non-exhaustiveness prior to `#[non_exhaustive]`

These days, it's easy to forget that #[non_exhaustive] is a fairly recent addition: it was only stabilized in Rust 1.40, released in December 2019. Our analysis covers releases made from 2017 onward, which includes nearly three years in which #[non_exhaustive] did not exist in the Rust language. In 2023, we expect that non-exhaustive types are marked #[non_exhaustive], and additions to exhaustive types are a clear-cut major breaking change. It seems unfair to apply the same standard to code released in 2017–2019.

Semver is about communicating expectations with users. Prior to the introduction of the #[non_exhaustive] attribute, maintainers noted non-exhaustiveness in doc comments or via enum variants with names like __Nonexhaustive. As these were the community-accepted ways of indicating non-exhaustiveness at the time, that is the standard to which we held crates in our analysis. We manually triaged exhaustiveness violations with those kinds of documented non-exhaustiveness.

This is an example of how the rules of semver change over time as a function of community expectations. In 2018, Rustaceans might have expected an enum to have its non-exhaustiveness communicated via a doc comment or a __Nonexhaustive variant. In 2023, we expect that non-exhaustive enums have the #[non_exhaustive] attribute. If the attribute isn't set, we probably wouldn't look for exhaustiveness information in the enum's doc comment. Now consider the act of adding a variant to an enum only specified as non-exhaustive in a doc comment: that's a major breaking change in 2023, but not in 2018.
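For readers who never saw the older convention, here is a minimal sketch of the two styles side by side. The enum names and variants are hypothetical.

```rust
/// Pre-1.40 convention: a hidden variant that callers are told not to match
/// by name, nudging downstream `match` expressions toward a wildcard arm.
pub enum LegacyError {
    NotFound,
    PermissionDenied,
    #[doc(hidden)]
    __Nonexhaustive,
}

/// Modern equivalent: the compiler itself requires a wildcard arm in
/// `match` expressions written outside the defining crate.
#[non_exhaustive]
pub enum ModernError {
    NotFound,
    PermissionDenied,
}

pub fn describe_legacy(error: &LegacyError) -> &'static str {
    match error {
        LegacyError::NotFound => "not found",
        LegacyError::PermissionDenied => "permission denied",
        // Covers `__Nonexhaustive` and any variant added in the future.
        _ => "unknown error",
    }
}

// Inside the defining crate the wildcard arm is redundant, so we silence the
// lint. In a downstream crate the same arm would be mandatory.
#[allow(unreachable_patterns)]
pub fn describe_modern(error: &ModernError) -> &'static str {
    match error {
        ModernError::NotFound => "not found",
        ModernError::PermissionDenied => "permission denied",
        _ => "unknown error",
    }
}
```

In both styles, downstream code written with a wildcard arm keeps compiling when a new variant is added, which is what makes the addition a non-breaking change.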

Consulting maintainers

Having verified our results via both automated and manual means, we decided to add one last check: we privately reached out to several maintainers of affected crates and discussed our findings with them.

In all cases, those maintainers confirmed our findings as correct.

In most cases, the maintainers stated that the semver violations were novel: to their knowledge, they had neither been discovered nor reported anywhere before.

In a tiny number of cases, maintainers reported making a semver-breaking change on purpose. In one example, a part of a library was unintentionally made public in one release and that change was rolled back in the subsequent release, which is technically a removal of public API.

Such situations are why cargo-semver-checks aims to aid and inform maintainers, not take away their power to decide what's best for their crate. We consider semver-checking akin to the cargo publish check about uncommitted changes: inform the user about the findings, but allow them to explicitly opt into proceeding if they are confident that's the right thing to do.

This is only a fraction of all semver violations

While this work found many real-world semver violations, our current setup could only hope to detect a fraction of all such issues.

There are good reasons to believe there are many more semver issues still to be discovered:

cargo-semver-checks is currently able to detect only a subset of semver violations. More lints are added in every new release, so repeating this analysis in the future is likely to find more semver violations in the same crate releases.
We only counted semver violations for which we were able to construct a witness. In some cases, cargo-semver-checks reported violations in code that relied on complex uses of generics where our witness-generator failed to produce a working witness. We consider such cases inconclusive, and believe that re-analyzing our dataset with more sophisticated witness generation may confirm more semver violations among these inconclusive cases.
Scanning the top crates likely makes our analysis heavily biased toward Rust code written by highly experienced maintainers. We believe that analyzing a broader set of crates is likely to produce a more-than-linear increase in discovered semver violations, since many ways to accidentally break semver would likely be non-obvious to many Rustaceans.
We semver-checked only the crates' default features. Crate features produce a combinatorial increase in a crate's API surface area, so it's likely that checking more feature combinations would find evidence of more semver issues. We restricted this survey to default features to avoid compilation failures due to platform-specific code tucked behind feature flags. Outside of this survey, cargo-semver-checks by default checks all crate features except ones with names commonly used to indicate unstable or internal-only code.
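To see why the feature-related growth is combinatorial, consider this small sketch that enumerates every subset of a crate's features. The feature names in the usage note are hypothetical, and the function is purely illustrative: it is not how cargo-semver-checks selects features.

```rust
/// Lists every subset of `features`, i.e. every feature combination that
/// could in principle expose a different public API surface.
pub fn feature_combinations<'a>(features: &[&'a str]) -> Vec<Vec<&'a str>> {
    // Start with the empty combination (no features enabled).
    let mut combinations: Vec<Vec<&'a str>> = vec![Vec::new()];
    for &feature in features {
        // Each existing combination spawns a copy with `feature` enabled,
        // doubling the total on every iteration.
        let with_feature: Vec<Vec<&'a str>> = combinations
            .iter()
            .map(|combination| {
                let mut extended = combination.clone();
                extended.push(feature);
                extended
            })
            .collect();
        combinations.extend(with_feature);
    }
    combinations
}
```

A crate with just three features, say `std`, `serde`, and `rayon`, already has 8 combinations, and each additional feature doubles that number.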

Just scratching the surface of our work

This case study summarizes several engineer-years' worth of work done by five people. It shows that cargo-semver-checks can discover semver violations in real-world Rust code, and is therefore effective in helping today's maintainers avoid semver violations in their new releases.

But this is just a slice of what we built and discovered. We didn't get to talk about many other interesting topics, like:

how witness-generation works and how it might one day become part of cargo-semver-checks itself,
how #[doc(hidden)] items sometimes have semver obligations after all,
how cargo-semver-checks can also be used to discover crates whose features are not additive, or
all the cargo-semver-checks bugs we discovered and fixed as a result of proactively scanning such a large number of crates.

If you are curious to learn more, we have a few resources for you to check out!

The work described in this post was part of the bachelor's thesis project of Tomasz Nowak, Bartosz Smolarczyk, Michał Staniewski, and Mieszko Grodzicki. Their thesis is available here, and contains many more details that we couldn't fit here.

More information on cargo-semver-checks is available on its GitHub page. It's safe to assume that the vast majority of bug reports opened by Tomasz, Bartosz, Michał, Mieszko, or Predrag in the last year were discovered as a result of the semver survey described in this post.

Various nuances of semver in Rust have already been covered on this blog, and more posts on the subject are sure to follow. You can subscribe to this blog via RSS or via email.

If you maintain Rust crates, are you using cargo-semver-checks already? Why, or why not?

Discuss on r/rust or lobste.rs.

Thanks to Tim McNamara, Luca Palmieri, Steve Klabnik, oli-obk, weihanglo, and Ed Page for their feedback on drafts of this post. All mistakes belong to the post authors alone.