Breaking semver in Rust by adding a private type, or by adding an import

May 08, 2023 semver rust

A few days ago, I started polls on Mastodon and Twitter whether adding a new private type, or an import, can ever be a major breaking change. The consensus was that this should be impossible.

I agree with that. It should be impossible.

I've discovered a way to cause a previously-public type or function to disappear from a crate's public API by making innocuous-seeming changes like adding a private type or adding an import, etc. It is not a hypothetical problem, either — I've found at least one real-world Rust project that has been affected by it.

While I'm excited to have discovered this, I don't think the problem is actually that severe. It isn't some fundamental unsoundness in the language! Most type and import additions are still safe and non-breaking. The problem requires "the stars to align" before it happens: a glob re-export needs to exist adjacent to a locally-defined item with a conflicting name.

Rust has overcome many worse sharp edges than this one. We'll address this one too.

In this post, I'll show how to cause this problem, explain why this is a breaking change, discuss the implications, and consider ways to address the problem. But I want to emphasize two key points first:

None of this post is meant as criticism — quite the contrary! Being able to productively discuss issues and act on them is a testament to the Rust ecosystem and the people that power it. When I've raised similar Rust issues in the past the response has been nothing but positive, to the benefit of the language and the ecosystem as a whole. I've had far worse experiences raising much smaller issues in other ecosystems.
My goal here is to highlight a problem and help solve it, not claim credit. I'm probably not the first person to notice this issue. But after several years writing Rust and ~9 months of working on semver specifically, running into it was a surprise for both me and the many experienced Rustaceans I spoke with. I also cannot point to any existing mention of downstream implications on things like rustdoc JSON, which you'll see are significant. I'll show some of the prior art in the rest of the post.

Finally, I want to congratulate the folks that answered "yes" to my polls and sent me valid answers: James Munns from OneVariable, ibraheemdev, David Koloski, llogiq, Jack Wrenn, and Robert Balicki, as well as Erk who found an entirely different way to accomplish the goal. Well done 🏆

Onward!

Discuss on r/rust or lobste.rs. Subscribe to my blog.

Breaking public APIs by adding a new import or private type

There are many equivalent ways to trigger this problem, and they all rely on the same kind of "gadget": shadowing a glob re-exported name with a local definition, like a new type or use import statement.

Here we break the downstream crate by adding struct Foo upstream: (playground)

// Imagine this is an upstream crate.
mod dependency {
    mod inner {
        pub struct Foo(i64);

        impl Foo {
            pub fn new(value: i64) -> Self {
                Self(value)
            }
        }
    }

    pub use inner::*;

    // Adding this line prevents `dependency::inner::Foo`
    // from being exported as `dependency::Foo`.
    // *** THE LINE BELOW IS A MAJOR BREAKING CHANGE ***
    // struct Foo;
}

// Imagine this is a crate that depends on `dependency`.
mod downstream {
    fn proof() {
        let _ = crate::dependency::Foo::new(42);
    }
}

Uncommenting struct Foo triggers compilation errors:

error[E0603]: struct `Foo` is private
  --> src/lib.rs:26:36
   |
26 |         let _ = crate::dependency::Foo::new(42);
   |                                    ^^^ private struct
   |
note: the struct `Foo` is defined here
  --> src/lib.rs:20:5
   |
20 |     struct Foo;
   |     ^^^^^^^^^^^

Here we break the downstream crate by adding a use other::Foo import upstream: (playground)

// Imagine this is an upstream crate.
mod dependency {
    mod inner {
        pub struct Foo(i64);

        impl Foo {
            pub fn new(value: i64) -> Self {
                Self(value)
            }
        }
    }

    mod other {
        struct Foo;
    }

    pub use inner::*;

    // Adding this line prevents `dependency::inner::Foo`
    // from being exported as `dependency::Foo`.
    // *** THE LINE BELOW IS A MAJOR BREAKING CHANGE ***
    // use other::Foo;
}

// Imagine this is a crate that depends on `dependency`.
mod downstream {
    fn proof() {
        let _ = crate::dependency::Foo::new(42);
    }
}

Uncommenting use other::Foo triggers similar compilation errors:

error[E0603]: struct `Foo` is private
  --> src/lib.rs:22:16
   |
22 |     use other::Foo;
   |                ^^^ private struct
   |
note: the struct `Foo` is defined here
  --> src/lib.rs:14:9
   |
14 |         struct Foo;
   |         ^^^^^^^^^^^

But we never touched the code that defined the public Foo, so why is it broken? And why is the error pointing at the private struct Foo when the public Foo is right there and still being re-exported? 🚩

Why is everything broken?

Short answer: the name Foo is no longer part of the crate's public API. It is no longer re-exported nor usable in any way at all. Any code that depended on it is now broken — there is no workaround. In semver terms, this is a major change.

The pub use inner::* glob re-export is still there, and Foo is still inside inner::*. But a quirk of Rust's name resolution rules make the glob no longer import Foo when we add struct Foo or use other::Foo next to it.

What's going on? A brief intro to Rust name resolution

Glob imports like use inner::* have a complex ergonomics story. They are also complex to implement! They've been hell to get right in cargo-semver-checks, and I've heard that they weren't exactly a walk in the park in rustc or rust-analyzer either. They present a conundrum: what should happen when two items with the same name are present in the same scope?

Without glob imports, the answer is easy — it's a compile error. You can't define struct Foo next to a different struct Foo. You also can't have struct Foo next to use other::Foo. Since both of these refer to the name Foo explicitly, the problem is local to the file where these lines exist. This makes it easy to see and fix, or avoid entirely. Problem solved.

With glob imports, the situation is more nuanced.

We don't want to make every name collision an immediate compile error. In that case, adding items would become even more of a semver hazard than it already is. Adding items can already be a breaking change due to a complex interaction between glob imports, traits, and the associated items on those traits. This is one of the few breaking changes that Rust explicitly does not consider semver-major. If every name collision were an immediate compilation error, then we don't need the complex trait interaction to cause breakage — just a name collision would be enough. As a knock-on effect, adding items to prelude modules would also be even more fraught due to the risk of name collisions. This would be so frustrating that it would be more ergonomic to completely remove glob imports from the language instead. This has been proposed! At this point, it's safe to say that it won't be happening, so instead we'll need to settle for "glob imports considered harmful" (or at least, risky).

Instead of immediately causing a compile error, Rust recognizes two different cases and handles each in its own way. Each case also comes with its own semver hazard!

Both colliding names come from globs

Say a module imports both a::* and b::*, where both a and b contain a Foo.

Here, Rust ignores the collision until the name is used.

When the name is used, rustc emits an "ambiguous item" compilation error since it can't figure out how to resolve the name. If the name isn't used, then there's no need to resolve the ambiguity and Rust won't report a problem.

I noticed a semver hazard here three months ago: (playground)

// Upstream crate
mod upstream {
    mod a {
        pub struct Foo {}
    }

    mod b {
        // Uncomment the following line to break downstream.
        //
        // pub struct Foo {}
        //
        // Uncommenting the line above breaks
        // all downstream uses of `upstream::Foo`:
        // - the name is ambiguous,
        // - ambiguous names only error if used,
        // - currently there's no lint or warning for this.
        //
        // Unless `upstream` has tests that
        // *specifically* use `upstream::Foo`, this bug
        // only gets discovered after breaking downstream.
    }

    pub use a::*;
    pub use b::*;
}

// Downstream crate
mod downstream {
    fn proof() {
        let _ = crate::upstream::Foo {};
    }
}

Adding b::Foo causes an ambiguous name error:

error[E0659]: `Foo` is ambiguous
  --> src/lib.rs:32:34
   |
32 |         let _ = crate::upstream::Foo {};
   |                                  ^^^ ambiguous name
   |
   = note: ambiguous because of multiple glob imports of a name in the same module

This is quite unfortunate. Crates rarely (never?) test that all their re-exported names are unambiguous, so this has excellent odds of being discovered only after publishing a broken crate version.

When I discovered this, I made the case that it's a footgun. The Rust maintainers agreed! A lint for this will ship in Rust 1.70 😎

One of the sources is a glob, and the other is not

In this case Rust gives precedence to the non-glob name, pretending as if the glob did not include the colliding name.

This is called shadowing and has been implemented and codified in an RFC for over 7 years now. Both local definitions like struct Foo; and explicit imports like use some::Foo; are allowed to shadow glob imports.

The semver hazard here is extremely obvious in retrospect (as is often the case!) — just shadow items in the glob re-export! This might have been noticed as part of the discussion of this issue 7+ years ago. But AFAICT it never made it into any semver hazards document or blog post I read, and over the last year I've read ~everything I could find on the topic. Incidentally, Jack Wrenn's Semver Snares is an excellent series if you enjoy reading using cursed Rust to cause breaking changes!

Anything that declares a name would work as a semver hazard via shadowing. So long as both items are in the same namespace. There's no collision and no shadowing between values (including functions) and types. To shadow a type named Foo, anything like struct Foo; or enum Foo { ... }; or use other::Foo; would work. To shadow a function, const, or static named BAR, just define another function, const, or static named BAR. Whether the shadowing item is public or private doesn't matter — either way it's a major breaking change, and only changes the exact compilation error. There's a funny edge case in the breaking change here: you can use an import of a type to break its own public glob re-export from the same scope. Check it out in the playground.

Surprisingly, we can also use this trick to shadow entire modules even though Rust RFC 116 seems to explicitly prohibit that! I'm not sure if I'm just misreading RFC 116, or if this is an accidental regression or an explicit decision to reverse part or all of the RFC. UPDATE: RFC 116 has been superseded. (playground)

// Upstream crate
mod upstream {
    mod a {
        pub mod collide {
            pub struct Foo {}
        }
    }

    pub use a::*;

    // Uncomment the below to break `downstream`.
    //
    // pub mod collide {
    //     pub struct Bar {}
    // }
}

// Downstream crate
mod downstream {
    fn proof() {
        let _ = crate::upstream::collide::Foo {};
    }
}

Uncommenting the colliding pub mod collide block causes:

error[E0422]: cannot find struct, variant or union type `Foo` in module `crate::upstream::collide`
  --> src/lib.rs:23:43
   |
23 |         let _ = crate::upstream::collide::Foo {};
   |                                           ^^^ not found in `crate::upstream::collide`
   |
note: struct `crate::upstream::a::collide::Foo` exists but is inaccessible
  --> src/lib.rs:7:13
   |
7  |             pub struct Foo {}
   |             ^^^^^^^^^^^^^^ not accessible

This is not just a hypothetical issue, and should have a lint

Sadly, this is not a hypothetical issue anymore. Around a year ago, the opencl3 crate seems to have suffered a regression when a large number of its items were accidentally shadowed. Neither code review nor tests seem to have been effective at preventing the regression. The fixing PR adds a test for the public re-export of one of the affected items.

This is further evidence that "shadowing a glob's publicly re-exported item" in general is a footgun, not a feature. It's valid Rust, but can cause clear and unambiguous harm while having no real-world use case. The only thing such shadowing accomplishes is to make an item externally unnamable in a convoluted way. Users that need to make a type unnamable are better off following the standard trick used in sealed traits.

I think this case deserves a rustc or clippy lint, similarly to how ambiguous re-exports from multiple globs is becoming a lint. I've opened an issue for this here.

cargo-semver-checks will also catch and prevent problems like this. As soon as I finish fixing the underlying glob import resolution issue that made me start pulling on this thread in the first place, cargo-semver-checks will implement the best-available check supported by rustdoc today. The full fix will require some work in rustdoc itself, as you'll see in a later section of this post. Between the gigantic optimizations in the most recent version (109x faster in tokio, 2354x faster in aws-sdk-ec2) and the extra ~2x speedup from the rustdoc caching implemented the new v2 release of the cargo-semver-checks GitHub Action, it's never been faster or more convenient to prevent semver violations in your crates.

Unfortunately, the impact doesn't end here.

Fixing rustdoc JSON will require a format change

Tools like cargo-semver-checks use rustdoc JSON to examine crates' APIs. The findings presented in this post mean that rustdoc JSON consumers may not be able to correctly determine which names and types are part of a crate's public API — they may incorrectly believe that shadowed items are still part of the API!

There are two takeaways here:

With default rustdoc settings, none of the shadowing examples presented in this post are detectable via rustdoc JSON.
A partial workaround exists, but does not fully resolve the issue — the rustdoc JSON format will likely need to change to address the problem.

Let's merge our earlier examples and look at them together:

mod inner {
    pub struct Foo {}
}

mod other {
    struct Foo;
}

pub use inner::*;

// Adding either of the following two lines
// would shadow `inner::Foo` and
// hide the name `Foo` from the public API.
//
// (1)
// struct Foo;
//
// (2)
// use other::Foo;

As decribed earlier in the post, uncommenting either (1) or (2) is a major breaking change.

The rustdoc JSON format does not explicitly specify which names are exported by a given module, or by the crate as a whole. One reason for this decision is that modules may export infinitely many names, in all sorts of non-obvious and cursed ways. This isn't just a hypothetical case: it's used in crates that are among the top 1000 most-downloaded on crates.io. All these edge cases are correctly handled by cargo-semver-checks — if they weren't, the tool would either go into an infinite loop or crash after overflowing the stack. Instead, rustdoc JSON exposes pub use items, including globs, and relies on the downstream consumer to resolve which names are accessible through them.

Consider the case where we uncomment line (1) in the example above. Detecting that inner::* does not include Foo requires knowledge of the newly added struct Foo; type at (1).

However, the current rustdoc JSON implementation by default omits items that are #[doc(hidden)] or private! With default settings, the same rustdoc JSON is generated whether or not line (1) is included. Rustdoc JSON consumers cannot tell whether any glob-imported items are shadowed.

Fortunately, rustdoc includes two flags that cause hidden and private items to be included in the generated JSON file: --document-private-items and --document-hidden-items. Rustdoc JSON generated with these flags would include the definition of struct Foo; at line (1), which is sufficient to discover the shadowing.

Unfortunately, this is only a partial workaround.

Instead of line (1), let's use line (2) and shadow Foo with a local import: use other::Foo;. Even with the flags for private and hidden items, rustdoc does not include non-pub imports in the generated JSON file. The generated JSON is exactly the same both with and without line (2), again making it impossible to detect that shadowing occurred.

I have not been able to find a workaround for this issue with the current version of rustdoc. I've already been in touch with the folks working on rustdoc, and I look forward to working with them on a solution!

I suspect the fix will probably require a change in the rustdoc JSON format, such as including the shadowed names in glob import items. The rustdoc JSON format is unstable, so its users are hopefully prepared for such a change. The change will be painless for cargo-semver-checks since it uses the Trustfall query engine to isolate itself from changes in the underlying data representation. I hope all other rustdoc JSON consumers are similarly fortunate.

Conclusion

All this started because I had to debug a false-positive report in cargo-semver-checks. "Hmm that link mentions a 'semver-crater', what's that?" Check out this blog post!

By the time I got to the bottom of the rabbit hole, I had filed three GitHub issues on various Rust components:

I also wrote quite a bit of extremely cursed Rust code that will be used to test trustfall-rustdoc-adapter, which is the Trustfall plugin that lets cargo-semver-checks run declarative query-like lints over rustdoc JSON. You can see some of the most-cursed examples in these Mastodon and Twitter threads.

The problem described in this blog post has happened in the real world. It's somewhat easy to cause accidentally, and we have every reason to believe it won't be caught by code review or CI. It can cause quite a bit of pain for maintainers and users alike: broken builds, GitHub issues, yanked releases, emergency fixes, etc.

Let's make our tools catch and prevent it!

Soon, I'll roll out an updated cargo-semver-checks that will partially protect against this problem, subject to the rustdoc JSON limitations I described. Whenever rustdoc allows detection of import-based shadowing, cargo-semver-checks will fully protect against this kind of breakage. I look forward to working with the rustdoc team on this!

Go team!

Thanks to Jynn Nelson, Luca Palmieri, arriven, and Jeremy Kun for their feedback on drafts of this post. All mistakes are mine alone.

Discuss on r/rust or lobste.rs. Subscribe to my blog.