Some Rust breaking changes don't require a major version

January 26, 2023 semver rust

I've been saying for a while now that semantic versioning in Rust is tricky and full of unexpected edge cases.

My last post mentioned that some Rust structs can be converted into enums without requiring a major version bump. It introduced a non-exhaustive struct called Chameleon that had no public fields, and claimed it was totally safe to turn it into an enum. But surely there was some sort of mistake, since syntax like let Chameleon { .. } = value would break if the Chameleon struct became an enum?

Yes, that statement would break. Thanks to the alert readers on r/rust and Mastodon who pointed it out and even provided Rust Playground links! This discussion was particularly nuanced and interesting. And yet, this breaking change does not require a major version under Rust's semantic versioning rules!

How could a breaking change not be a semver-major change? Let's dig in and find out!

All major changes are breaking, but not all breaking changes are major

There are two authoritative sources for semantic versioning in Rust: the cargo semver reference, and the API evolution RFC.

Here's what the API evolution RFC says about breaking changes:

What we will see is that in Rust today, almost any change is technically a breaking change. For example, given the way that globs currently work, adding any public item to a library can break its clients [...] But not all breaking changes are equal.

So, this RFC proposes that all major changes are breaking, but not all breaking changes are major.

Rust API evolution RFC

This feels ... strange. Running cargo update will by default update dependencies to the latest version in the same major version series, yet breaking changes are allowed without a new major version?

Ultimately, I feel this is a case of Rust choosing pragmatism and, in my opinion, getting it right. (I only recently got involved in Rust's semver story by working on cargo-semver-checks, so I wasn't part of the discussion or decisions in the API evolution RFC. This post is on my personal blog, not the Rust blog, so you're reading my own opinion and interpretation.) I'll try to convince you of this in the rest of this post.

Let's take a look at two examples where breaking changes are explicitly not semver-major.

Adding a new public item is technically a breaking change

Let's pretend that in the example below, the modules first and second are dependency crates of our library.

pub mod first {
    pub struct Foo;
}

pub mod second {
    // what happens if we uncomment this?
    // pub struct Foo;
}

use first::*;
use second::*;

fn process(foo: &Foo) {
    // do stuff with foo
}

Our library uses globs to import all public items from first and second. This works fine!

Now imagine second adds some new functionality: uncomment its pub struct Foo line. This is a purely additive change: second can still do everything it could previously do, and has gained some new functionality via the new type Foo. Purely additive API changes are semver-minor, right?

Try compiling the code after uncommenting that line, though. 💥 Oops! 💥

error[E0659]: `Foo` is ambiguous
  --> src/lib.rs:13:18
   |
13 | fn process(foo: &Foo) {
   |                  ^^^ ambiguous name
   |
   = note: ambiguous because of multiple glob imports of a name in the same module

< ... fix suggestions omitted for brevity ... >

The code that depends on both first and second was broken by second's purely additive change. Additive or not, it was unquestionably a breaking change.

If you read the API evolution RFC's section on adding public items, you may have noticed that its example of breaking code by adding a public item is much shorter than the one here, and also that in today's Rust, that example isn't broken anymore! This is because Rust adopted another recommendation from that RFC: if a locally-defined item's name conflicts with a glob-imported name, the local item "wins" and shadows the other one instead of breaking with an ambiguous resolution error.
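To make the shadowing rule concrete, here is a minimal sketch. The module names dependency and client, the Bar type, and the origin method are all hypothetical, invented for illustration; the modules stand in for separate crates.

```rust
// Hypothetical stand-in for a dependency crate.
pub mod dependency {
    pub struct Foo;
    pub struct Bar;

    impl Foo {
        pub fn origin(&self) -> &'static str {
            "dependency"
        }
    }
}

// Hypothetical stand-in for our own crate.
pub mod client {
    // Glob import: brings both `Foo` and `Bar` into scope.
    use super::dependency::*;

    // A locally-defined item with the same name as a glob-imported one.
    pub struct Foo;

    impl Foo {
        pub fn origin(&self) -> &'static str {
            "client"
        }
    }

    pub fn which_foo(_bar: &Bar) -> &'static str {
        // `Foo` resolves to the local struct: the local definition
        // shadows the glob-imported one, with no ambiguity error.
        Foo.origin()
    }
}
```

Calling client::which_foo(&dependency::Bar) goes through the local Foo, so a dependency adding its own Foo does not break a crate that already defines one locally.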

If Rust semver demanded that all breaking changes must be semver-major, here are a few ways this could work:

Option 1: Nearly all API additions are semver-major. This obviously doesn't seem right.
Option 2: Glob imports are "last definition wins" (like in Python), or "first definition wins." I think this makes the problem worse, not better: now it's even less obvious which Foo is getting imported, and we're setting ourselves up for even worse compilation errors than otherwise.
Option 3: Glob imports are removed from the language, since they play a part in causing this problem. That also means no more prelude modules designed for glob-importing, harming the ergonomics of awesome crates like pyo3 and futures. This isn't good either.
Option 4: When we write code like &Foo, a tool (say, cargo or rustc) immediately replaces Foo with its fully-qualified name: in this case, first::Foo. Glob imports serve only to tell that tool where to look while rewriting our code. This solution has way too many moving pieces, and doesn't feel particularly ergonomic, either.

None of these options are good. Rust opted to go in another direction:

adding public items is semver-minor;
glob imports are discouraged, to minimize (but not prevent) breakage, and
maintainers of crates with prelude modules are encouraged to be mindful of what they add to the prelude, again to minimize but not prevent breakage.
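Here is a rough sketch of why avoiding globs helps, reusing the first and second modules from the earlier example. An explicit import takes priority over a glob import, so naming first::Foo explicitly keeps the code compiling even after second gains its own Foo. The client module, the Bar type, and the source method are hypothetical additions for illustration.

```rust
pub mod first {
    pub struct Foo;

    impl Foo {
        pub fn source(&self) -> &'static str {
            "first"
        }
    }
}

pub mod second {
    // The "purely additive" new item from the example above.
    pub struct Foo;
    pub struct Bar;
}

// Hypothetical stand-in for our library.
pub mod client {
    // Explicit import: takes priority over any glob-imported `Foo`.
    use super::first::Foo;
    // Glob import: still provides `Bar`; its `Foo` is shadowed.
    use super::second::*;

    pub fn process(foo: &Foo, _bar: &Bar) -> &'static str {
        // Only `first::Foo` has a `source` method, so this compiles
        // only because `Foo` resolved to `first::Foo`.
        foo.source()
    }
}
```

This is the kind of disambiguation that could have been written in advance, before second ever added its Foo.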

A similar problem exists with trait methods: implementing a public trait for any existing type is also technically breaking, and is also explicitly defined as semver-minor despite the breakage. The RFC states that the breakage occurs if the breakage-causing trait already existed prior to being implemented. But in the RFC's example code, it's actually sufficient for the trait to make its way into crate B's scope. For example, crate B glob-importing all of crate A's public items would cause the same breakage even if the conflicting trait did not previously exist.
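The trait-method case might look something like the sketch below. This is not the RFC's own example: the modules crate_a and crate_b stand in for two crates, and Widget, Describe, and LocalDescribe are hypothetical names.

```rust
// Hypothetical stand-in for crate A.
pub mod crate_a {
    pub struct Widget;

    pub trait Describe {
        fn describe(&self) -> &'static str;
    }

    // Imagine this impl was added in a minor release of crate A.
    impl Describe for Widget {
        fn describe(&self) -> &'static str {
            "crate_a::Describe"
        }
    }
}

// Hypothetical stand-in for crate B, which depends on crate A.
pub mod crate_b {
    // The glob import brings `Widget` and `Describe` into scope.
    use super::crate_a::*;

    pub trait LocalDescribe {
        fn describe(&self) -> &'static str;
    }

    impl LocalDescribe for Widget {
        fn describe(&self) -> &'static str {
            "crate_b::LocalDescribe"
        }
    }

    pub fn report(widget: &Widget) -> &'static str {
        // `widget.describe()` would no longer compile here: both traits
        // are in scope and both provide `describe` for `Widget`.
        // Naming the trait explicitly resolves the ambiguity.
        LocalDescribe::describe(widget)
    }
}
```

The fix is local and mechanical, which is part of why the RFC treats this kind of breakage as acceptable in a minor release.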

Breakage of patterns is not always semver-major

Pattern-matching on structs is always allowed in Rust, even if the struct being matched has no visible fields: playground link.

// say this is in some other crate
pub mod other {
    pub struct Foo(i64);
}

fn process(value: &other::Foo) {
    // Foo's field is not visible here!
    // This `let` does nothing useful:
    // - it can't extract any fields, and
    // - can't learn anything else about `value`.
    let other::Foo { .. } = value;
}

Some kinds of changes to Foo can cause let Foo { .. } = value; to break. The RFC is unambiguous here: statements like let Foo { .. } = value serve no purpose other than to be broken if Foo changes, and its breakage is not sufficient to make this change semver-major. Again, the RFC's example for this case doesn't quite work as written: Rust has evolved in the nearly 8 years since that RFC was written. But its point stands regardless.

There are cases where the Foo { .. } pattern is useful to aid type inference, for example: if let Some(x @ Foo { .. }) = x.downcast_ref(). Thanks to this r/rust comment for this excellent example! However, those cases are specifically addressed in the RFC as well (original emphasis retained):

For example, changes that may require occasional type annotations or use of UFCS to disambiguate are not automatically "major" changes. [...] any breakage in a minor release must be very "shallow": it must always be possible to locally fix the problem through some kind of disambiguation that could have been done in advance (by using more explicit forms) or other annotation (like disabling a lint).

Principles of the policy, Rust API evolution RFC
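As a sketch of what "disambiguation that could have been done in advance" can mean for the downcast_ref case, compare relying on the pattern for type inference with spelling the type out. The function names and the Foo::new constructor are hypothetical; the other module stands in for another crate.

```rust
use std::any::Any;

// Hypothetical stand-in for another crate.
pub mod other {
    pub struct Foo(i64);

    impl Foo {
        pub fn new(value: i64) -> Self {
            Foo(value)
        }
    }
}

// Relies on the `Foo { .. }` pattern to tell `downcast_ref` which type
// to downcast to. This form depends on `Foo` remaining a struct.
pub fn is_foo_by_pattern(x: &dyn Any) -> bool {
    if let Some(_foo @ other::Foo { .. }) = x.downcast_ref() {
        true
    } else {
        false
    }
}

// The more explicit form: the target type is named directly, so no
// struct pattern is needed for inference.
pub fn is_foo_by_turbofish(x: &dyn Any) -> bool {
    x.downcast_ref::<other::Foo>().is_some()
}
```

Both functions behave the same today; only the second is written in the more explicit form that the RFC's "shallow breakage" principle assumes is always available.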

This is why turning Chameleon from a struct into an enum in the last post did not require a new major version: the only breakage that could happen was in type inference or in a statement that did not serve any purpose. Barring some kind of exceptional situation (e.g., potential for ecosystem-wide breakage, definitely not the case here), the API evolution RFC explicitly disqualifies both of those categories from triggering a semver-major change.
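Here is a rough before-and-after sketch of that change. The previous post's actual Chameleon definition is not reproduced here: the name field, the Named variant, and the new and name methods are assumptions made up for illustration, and the v1 and v2 modules stand in for two releases of the same crate.

```rust
// Hypothetical "before" release: a non-exhaustive struct, no public fields.
pub mod v1 {
    #[non_exhaustive]
    pub struct Chameleon {
        name: String,
    }

    impl Chameleon {
        pub fn new(name: &str) -> Self {
            Chameleon { name: name.to_string() }
        }

        pub fn name(&self) -> &str {
            &self.name
        }
    }
}

// Hypothetical "after" release: same public methods, but now an enum.
pub mod v2 {
    #[non_exhaustive]
    pub enum Chameleon {
        Named(String),
    }

    impl Chameleon {
        pub fn new(name: &str) -> Self {
            Chameleon::Named(name.to_string())
        }

        pub fn name(&self) -> &str {
            match self {
                Chameleon::Named(name) => name.as_str(),
            }
        }
    }
}

pub fn describe_v1(value: &v1::Chameleon) -> String {
    // Compiles against the struct, but extracts nothing and learns nothing.
    let v1::Chameleon { .. } = value;
    value.name().to_string()
}

pub fn describe_v2(value: &v2::Chameleon) -> String {
    // let v2::Chameleon { .. } = value;
    // ^ does not compile: an enum name is not a valid struct pattern.
    value.name().to_string()
}
```

Everything useful that downstream code could do with the struct (construct it, call its methods) still works with the enum; only the purposeless pattern stops compiling.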

Conclusion

Before reading this post, did you know that not all breaking changes require a new major version under Rust's semantic versioning principles?

Semver in Rust is hard for many reasons. There are a zillion strange ways to cause major breaking changes: example, another example. There's even "spooky action at a distance" where adding a field to a type can cause traits to silently stop being implemented for that type. And as we saw here, not all breaking changes are semver-major!
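One way a new field can silently remove a trait implementation is through auto traits such as Send. The sketch below assumes that is the kind of effect meant above; SessionV1, SessionV2, and the helper functions are hypothetical names.

```rust
use std::rc::Rc;

// Hypothetical "before" type: every field is `Send`, so the type is too.
pub struct SessionV1 {
    pub id: u64,
}

// Hypothetical "after" type: the new private `Rc` field is not `Send`,
// so `SessionV2` silently stops being `Send` as well.
pub struct SessionV2 {
    pub id: u64,
    cache: Rc<String>,
}

impl SessionV2 {
    pub fn new(id: u64) -> Self {
        SessionV2 { id, cache: Rc::new(String::new()) }
    }

    pub fn cache_len(&self) -> usize {
        self.cache.len()
    }
}

// Accepts only values whose type implements `Send`.
pub fn require_send<T: Send>(_value: &T) -> bool {
    true
}

pub fn v1_is_send() -> bool {
    require_send(&SessionV1 { id: 1 })
}

// The equivalent check for `SessionV2` does not compile:
// pub fn v2_is_send() -> bool {
//     require_send(&SessionV2::new(2))
// }
```

Nothing in the type's public API mentions Send, which is what makes this kind of breakage easy to miss in review.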

As if to prove my point, cargo-semver-checks was recently broken by a dependency crate's semver-incompatible (and now yanked) release. This is why installing with cargo install --locked is a good idea! Locked installs didn't break. Breaking semver is not shameful, and is not a sign of maintainers' carelessness, poor skill, or anything of the sort. It's just another language ergonomics problem solvable by better tooling.

This is the raison d'être for cargo-semver-checks.