cargo-semver-checks v0.40 ships a massive upgrade to its system for detecting sealed traits. The new system is an all-around win-win: it improves the accuracy of a dozen existing lints, enables a new series of helpful lints, handles cyclic trait relationships, and is also faster than the old system. All that took a lot of work! Here's a look at how we made it happen.
We've previously described that Rust traits can be made impossible to implement ("sealed") via a range of techniques. Accurate detection of breaking changes requires that cargo-semver-checks go to great lengths to determine whether a trait is sealed or not. We shipped an initial version of that functionality in September of last year. Since then, we've been hard at work on the next series of improvements, which are now available in cargo-semver-checks v0.40:
- detection of a novel way to seal traits
- detection of traits that cannot be implemented within their public API
- handling of cyclic sealing relationships with 100% fewer crashes (At least, I hope it's 100% fewer! The longer I work on trait-sealing, the more astonishing edge cases I find, so we'll have to wait and see to be sure 👀)
The best part? This new functionality comes at negative performance cost: cargo-semver-checks is both smarter and faster than before! Of course, huge thanks go out to everyone making this work possible by funding my work!
- Brief refresher on trait sealing
- A novel way to seal traits
- Traits that cannot be implemented within public API
- Cyclic sealing relationships
- Making the performance cost negative
- Wrap-up: cargo-semver-checks is smarter than ever
Brief refresher on trait sealing
If you are familiar with trait sealing, feel free to skip ahead to the next section.
"Sealing" a trait means making it impossible to implement anywhere except in its own crate.
Downstream crates may continue to use sealed traits normally in all other ways: using them in generic bounds, calling their methods, etc.
They merely cannot write impl upstream::SealedTrait for MyType, since doing so will always result in a compile error.
Trait sealing isn't (yet!) a built-in concept in the Rust language. There's an accepted but not-yet-implemented RFC to add a language-level construct for trait sealing. Sealing today relies on a set of community-designed techniques like the following:
// The `pub` items in this module cannot be imported
// since the module is private. They are usually called
// "pub-in-priv" items as a result.
mod private {
    pub struct Token;
    pub trait Sealed {}
}
// Technique #1:
//
// Implementing this trait requires implementing `Sealed` first.
// But other crates cannot *import* `Sealed`, so they cannot
// implement it and cannot implement this trait either.
pub trait SupertraitSealed: private::Sealed {}
// Technique #2:
//
// Implementing this trait requires implementing `method`.
// But `method`'s signature involves a type that other crates
// cannot name, which is impossible. That seals the trait too.
pub trait MethodSealed {
    fn method(&self, token: private::Token);
}
These techniques accomplish the goal of sealing as an emergent property, without being explicit about it.
In other words, rustc doesn't say "ah, I notice this trait is sealed so I shouldn't allow implementations."
Instead, attempted implementations in downstream crates "just happen to" always run into one compile error or another.
Since rustc doesn't check for trait-sealing, cargo-semver-checks must "become the compiler" and perform this analysis on its own. The rest of this post describes newly-shipped improvements to that analysis.
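To make the refresher concrete, here is a minimal single-file sketch. It assumes a hypothetical `upstream` module standing in for the library crate, with the code below it playing the downstream user; the names `Shape`, `Square`, and `total_area` are invented for illustration.

```rust
// Hypothetical stand-in for a library crate.
mod upstream {
    // Private module: nothing outside `upstream` can name its contents.
    mod private {
        pub trait Sealed {}
    }

    // Sealed via the pub-in-priv supertrait.
    pub trait Shape: private::Sealed {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);

    // Only code inside `upstream` can write these two impls.
    impl private::Sealed for Square {}
    impl Shape for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }
}

// "Downstream" code can still use the sealed trait in bounds
// and call its methods.
fn total_area<S: upstream::Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

// What "downstream" cannot do: uncommenting the lines below fails to
// compile, because `upstream::private` is not accessible from here.
//
// struct Circle(f64);
// impl upstream::private::Sealed for Circle {}
// impl upstream::Shape for Circle {
//     fn area(&self) -> f64 { 3.14 * self.0 * self.0 }
// }
```

Module privacy inside one file behaves like crate privacy for this purpose, which is why a single file is enough to show the effect.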
A novel way to seal traits
The "gadget" behind essentially all trait-sealing is requiring trait implementations to refer to the name of a "pub-in-priv" item: a pub item placed inside a private module, rendering it non-importable & impossible to name.
Supertrait sealing uses a pub-in-priv supertrait; method sealing uses a pub-in-priv method argument type or return type.
It turns out there's another place where a trait impl must name a type: in associated constants!
This produces a novel, never-before-described mechanism for trait sealing (to the best of my knowledge and belief; if you know of an earlier public reference to this technique, drop me a note and I'll link it here!):
mod private {
    pub struct Token;
}
pub trait ConstSealed {
    const T: private::Token;
}
Downstream crates are required to set the associated constant. The constant's type is pub-in-priv, so naming it is impossible. But the type must be specified in trait implementations:
struct MyType;
impl upstream::ConstSealed for MyType {
    // Attempting to omit the type name is an error:
    //   error[E0121]: the placeholder `_` is not allowed within
    //   types on item signatures for associated constants
    const T: _ = todo!();
    //       ^
    //       |
    //       not allowed in type signatures
}
Hence the trait is sealed.
This technique is an elegant new way to seal traits without changing method signatures or repeating supertrait bounds.
Its downside is that it prevents the trait from being dyn-compatible, since associated constants prevent dyn Trait use.
This limitation of associated constants doesn't strike me as fundamental to Rust: I believe they could be made compatible with dyn Trait in the future.
As of cargo-semver-checks v0.40, traits sealed in this way are accurately classified as sealed.
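As a rough sketch of how this looks from both sides, the example below again uses a hypothetical `upstream` module in place of a separate crate. The `label` method, the `Builtin` type, and the `describe` function are assumptions added purely to have something observable.

```rust
// Hypothetical stand-in for the library crate.
mod upstream {
    mod private {
        pub struct Token;
    }

    pub trait ConstSealed {
        // Implementors must spell out `private::Token` here,
        // which only code inside `upstream` is able to do.
        const T: private::Token;

        fn label(&self) -> &'static str;
    }

    pub struct Builtin;

    impl ConstSealed for Builtin {
        const T: private::Token = private::Token;

        fn label(&self) -> &'static str {
            "builtin"
        }
    }
}

// "Downstream" code can use the trait as a generic bound, and can even
// read the constant, as long as it never has to write the type's name.
fn describe<C: upstream::ConstSealed>(value: &C) -> String {
    let _token = C::T; // type is inferred, never named
    format!("sealed impl: {}", value.label())
}
```

Note that the generic bound is the only way to use this trait abstractly: `&dyn upstream::ConstSealed` is rejected because of the associated constant.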
Traits that cannot be implemented within public API
The #[doc(hidden)] attribute allows Rust items to remain pub without being public API.
This technique is common in macro-heavy crates: for example, a derive macro would generate code that lives in the downstream crate, but may need to access upstream internals that are not meant for public use.
Could we exclude the ability to impl a trait from the public API, while still allowing it for macro-generated code? Yes, of course!
For clarity, we'll call such traits "public API sealed," as opposed to "unconditionally sealed" for "regular" trait sealing.
The general rule is this: if the impl requires using any #[doc(hidden)] items, that impl relies on non-public API and is exempted from any SemVer stability guarantees. We can choose between many ways to accomplish this.
One technique is to directly mark any of the required trait items (ones without a default value) as #[doc(hidden)], forcing impl blocks to directly reference non-public API:
pub trait HiddenType {
    // The `impl` has to set the associated type,
    // thereby touching non-public API.
    #[doc(hidden)]
    type T;
}
pub trait HiddenConst {
    // The `impl` has to set the associated const,
    // thereby touching non-public API.
    #[doc(hidden)]
    const N: i64;
}
pub trait HiddenMethod {
    // The `impl` has to implement this method,
    // thereby touching non-public API.
    #[doc(hidden)]
    fn method(&self);
}
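To illustrate the macro-generated case mentioned earlier, here is a small sketch in which a macro writes the hidden item on the user's behalf. Everything in it is hypothetical: the `Record` trait, the `__field_count` method, and the `impl_record!` macro are invented for this example, and a declarative macro stands in for a real derive macro.

```rust
pub trait Record {
    fn name(&self) -> &'static str;

    // Required but hidden: a hand-written impl would have to
    // reference this non-public-API item.
    #[doc(hidden)]
    fn __field_count(&self) -> usize;
}

// Stand-in for a derive macro shipped by the upstream crate.
// A macro exported from a real crate would refer to `$crate::Record`.
macro_rules! impl_record {
    ($ty:ident, $name:literal, $fields:expr) => {
        impl Record for $ty {
            fn name(&self) -> &'static str {
                $name
            }

            fn __field_count(&self) -> usize {
                $fields
            }
        }
    };
}

// The user only ever writes this, never the hidden method itself.
pub struct User;
impl_record!(User, "User", 2);
```

The user's source never mentions the hidden method, so the upstream crate stays free to rename or change it as long as the macro is updated in the same release.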
We can also reuse the same tricks as for unconditionally sealing traits:
// This time, the helper module is `pub`!
pub mod internals {
    #[doc(hidden)]
    pub trait Super {}
    #[doc(hidden)]
    pub struct Token;
}
// `impl`-ing this trait requires `impl`-ing
// a `#[doc(hidden)]` supertrait first.
// Not public API!
pub trait SuperHidden: internals::Super {}
pub trait HiddenArg {
    // `internals::Token` is `#[doc(hidden)]`
    // so this method signature uses non-public-API.
    fn method(&self, token: internals::Token);
}
pub trait HiddenConst {
    // Naming the const's type forces
    // the use of non-public-API.
    const N: internals::Token;
}
Public API sealing suffers from the same edge cases as unconditional sealing!
For example, the following trait might appear public API sealed, but it's actually unsealed:
#[doc(hidden)]
pub trait Super {}
// Note the non-public-API supertrait.
// Naively, it looks public API sealed.
pub trait AppearsSealed: Super {}
// This blanket impl unseals the `AppearsSealed` trait!
// Now, downstream crates don't need to `impl`
// `Super` by themselves: they get it for free.
//
// Downstream code can then directly write
// `impl upstream::AppearsSealed for MyType`
// which doesn't reference non-public-API items.
impl<T> Super for T {}
Getting these edge cases right is critical! Failing to do so results in a poor user experience for cargo-semver-checks users: false-positive lints, failures to detect breaking changes, possibly even outright crashes. For example, the zerocopy and diesel crates both reported false positives around public API sealed traits.
All those cases are now properly handled in our new release!
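The `AppearsSealed` edge case can be checked with a short single-file sketch. It assumes a hypothetical `upstream` module standing in for the library crate; the `describe` default method and the `describe_it` helper are additions made only so there is something to call.

```rust
// Hypothetical stand-in for the library crate.
mod upstream {
    #[doc(hidden)]
    pub trait Super {}

    pub trait AppearsSealed: Super {
        fn describe(&self) -> &'static str {
            "implemented downstream"
        }
    }

    // The blanket impl hands every type a `Super` impl for free.
    impl<T> Super for T {}
}

// "Downstream" code: this impl never mentions `Super`,
// so it stays entirely within public API.
struct MyType;
impl upstream::AppearsSealed for MyType {}

fn describe_it<T: upstream::AppearsSealed>(value: &T) -> &'static str {
    value.describe()
}
```

Removing the blanket impl makes the downstream impl fail to compile unless it also implements the hidden `Super` trait, which is exactly the public API sealed behavior the trait appeared to have.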
Cyclic sealing relationships
An exotic edge case related to trait sealing could cause prior cargo-semver-checks versions to go into infinite recursion and crash. This is now fixed in v0.40!
I'd like to walk you through this edge case, as a practical example of why getting trait-sealing analysis right is so difficult. When you wonder where your GitHub Sponsors funding is going, please think of this example and remember that this is just the tip of the complexity iceberg 😄
A naive attempt at a cycle
Determining whether a trait is sealed requires checking its supertraits, so one's first thought at causing a cycle might be to declare two traits that name each other as a supertrait:
pub trait X: Y {}
pub trait Y: X {}
Fortunately, Rust doesn't allow this! Interestingly, there exist some reasonable use cases for cyclic traits. However, the Rust project made an explicit decision to not support them at this time.
error[E0391]: cycle detected when computing the super predicates of `X`
 --> src/lib.rs:1:14
  |
1 | pub trait X: Y {}
  |              ^
  |
note: ...which requires computing the super predicates of `Y`...
 --> src/lib.rs:3:14
  |
3 | pub trait Y: X {}
  |              ^
  = note: ...which again requires computing the super predicates of `X`,
    completing the cycle
If we can't cause a cycle through the trait itself, let's try to (ab)use the rest of the sealing rules!
Recap: Sealing rules
30s summary of the rules for sealing a trait with another trait. Regular readers may remember our post describing the rules in detail and may want to skip ahead.
Consider a trait definition like the following:
pub trait MaybeSealed: Super {}
Whether MaybeSealed is sealed or not depends on Super. Briefly, it's sealed if both:
A. Super is sealed.
B. Downstream types cannot get a Super implementation via an existing blanket impl.
The latter condition is not obvious, but it's necessary!
For example, an item like the following would unseal MaybeSealed:
impl<T> Super for T {}
Here, downstream types would gain a "free" Super implementation by matching the T in that impl<T> Super for T.
Even if Super were otherwise sealed, that will not prevent downstream code from including impl MaybeSealed for MyType, since all preconditions for that impl would be satisfied. (Super actually is sealed here, because the blanket impl would conflict with any downstream implementation. But even so, MaybeSealed is still unsealed by this.)
What if the T in the blanket impl is bounded by other traits, though?
Using trait bounds to create a cycle
Say the blanket implementation is actually:
impl<T: SomeTrait> Super for T {}
Now, we don't yet know if property B. in the sealing rules holds!
We have to evaluate SomeTrait itself on properties A. and B. first.
This can get a bit involved, and not all those details are relevant!
For the full evaluation process, check out our earlier post on the topic.
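As a concrete instance of why the bound matters, the sketch below shows the case where SomeTrait is freely implementable. It assumes a hypothetical `upstream` module in place of a separate crate, and the `tag` default method is invented so the result can be observed.

```rust
// Hypothetical stand-in for the library crate.
mod upstream {
    mod private {
        pub trait Super {}
    }

    // An ordinary, unsealed trait.
    pub trait SomeTrait {}

    pub trait MaybeSealed: private::Super {
        fn tag(&self) -> &'static str {
            "downstream impl"
        }
    }

    // `Super` is handed out to anything implementing `SomeTrait`.
    impl<T: SomeTrait> private::Super for T {}
}

// "Downstream" code: implementing `SomeTrait` unlocks the blanket impl,
// which in turn satisfies the supertrait bound on `MaybeSealed`.
struct MyType;
impl upstream::SomeTrait for MyType {}
impl upstream::MaybeSealed for MyType {}
```

Here `MaybeSealed` ends up unsealed because `SomeTrait` is. If `SomeTrait` were itself sealed, the downstream type could not reach the blanket impl this way.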
What if instead of a bound on a trait like SomeTrait, we use a bound based on MaybeSealed itself?! Our code would say:
mod private {
    pub trait Super {}
}
pub trait MaybeSealed: private::Super {}
impl<T: MaybeSealed> private::Super for T {}
Recall that for MaybeSealed to be sealed, both of the following need to apply:
A. Super needs to be sealed, and it is, since it's inside a private module.
B. The blanket impl of Super must not be able to apply to a downstream type. So ... can it?
The blanket impl provides Super implementations for any types that implement MaybeSealed. Then:
- If a type already implements MaybeSealed, then it gets a Super implementation and is thus able to implement MaybeSealed.
- If it doesn't already implement MaybeSealed, then it can't get a Super implementation, and cannot implement MaybeSealed.
😵 If your head is spinning, you're in good company!
It took me a week to untangle all this in my own mind, especially with all the additional edge cases around #[doc(hidden)].
The real-world edge case that caused the crash was even more involved, even as a minimal repro.
Bottom line: this is a chicken-and-egg problem. Its answer is that all traits in the cycle are sealed. Causing unsealing for a downstream type via a blanket impl requires that the type (or references to the type, etc.) "enter" the cycle by implementing one of the traits first, which is prevented by the rest of the cycle.
To implement this, we need to do cycle-detection in the graph of trait relationships. But cycle-detection in arbitrary graphs is expensive! Goodbye, performance 🙋♂️ ... or so it seemed at first!
Making the performance cost negative
Zooming out: we just added a lot of complexity to our trait analysis code!
- We now detect another kind of unconditional trait sealing.
- We detect public API sealed traits, with all their edge cases.
- We track cycles — twice: once for unconditional sealing, and once for public API sealing.
How do we make all this fit within the previous performance budget? As usual: memoize to avoid repeated work, and take advantage of special cases in the data.
First, we added a bit flags index recording key properties of each Rust item (a struct, trait, etc.):
- whether the item is publicly importable: pub, but not pub-in-priv
- whether the item is public API: deprecated, or not #[doc(hidden)]
- if the item is a trait: whether it is unconditionally sealed or public API sealed, and whether it has blanket impls.
Having pre-computed access to pub and public API information for each item makes sealed trait analysis faster: checking whether e.g. method parameters or const types can be imported is now a hashtable lookup and simple bitwise operation.
Memoizing trait sealing status also makes subsequent uses of this data faster. Previously, this data was computed on-demand and not cached, meaning that each trait-related lint computed the sealing status of each trait in the entire crate. Now, we only compute trait sealing state once when we load the crate, then reuse the info each time a lint needs it.
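The idea of a flags index can be sketched roughly as follows. This is not the actual cargo-semver-checks data structure: the flag layout, the `u32` item ids, and all type and function names here are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical per-item flags packed into one byte.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct ItemFlags(u8);

impl ItemFlags {
    pub const IMPORTABLE: ItemFlags = ItemFlags(1 << 0);
    pub const PUBLIC_API: ItemFlags = ItemFlags(1 << 1);
    pub const UNCONDITIONALLY_SEALED: ItemFlags = ItemFlags(1 << 2);
    pub const PUBLIC_API_SEALED: ItemFlags = ItemFlags(1 << 3);
    pub const HAS_BLANKET_IMPLS: ItemFlags = ItemFlags(1 << 4);

    pub const fn union(self, other: ItemFlags) -> ItemFlags {
        ItemFlags(self.0 | other.0)
    }

    pub const fn contains(self, other: ItemFlags) -> bool {
        (self.0 & other.0) == other.0
    }
}

/// Computes the two visibility flags once, when the crate is loaded.
pub fn item_flags(importable: bool, doc_hidden: bool, deprecated: bool) -> ItemFlags {
    let mut flags = ItemFlags::default();
    if importable {
        flags = flags.union(ItemFlags::IMPORTABLE);
    }
    // Public API: deprecated, or not `#[doc(hidden)]`.
    if deprecated || !doc_hidden {
        flags = flags.union(ItemFlags::PUBLIC_API);
    }
    flags
}

/// Index from a hypothetical item id to its precomputed flags.
#[derive(Default)]
pub struct FlagIndex {
    flags: HashMap<u32, ItemFlags>,
}

impl FlagIndex {
    pub fn insert(&mut self, item_id: u32, flags: ItemFlags) {
        self.flags.insert(item_id, flags);
    }

    /// One hashtable lookup plus one bitwise test.
    pub fn usable_from_public_api(&self, item_id: u32) -> bool {
        let needed = ItemFlags::IMPORTABLE.union(ItemFlags::PUBLIC_API);
        self.flags
            .get(&item_id)
            .map_or(false, |flags| flags.contains(needed))
    }
}
```

Each lint that needs to know whether a type can be named from public API then does a cheap lookup instead of re-deriving the answer from module paths and attributes.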
Second, we observed a pattern in our inputs: most crates don't have cyclic structure, so cycle-detection isn't necessary most of the time. A common saying in performance engineering is "make the common case fast, and the uncommon case merely correct." Following the motto, we engineered the code to avoid the cost of cycle-detection whenever possible:
- A cycle requires a trait that depends on a supertrait with a blanket impl — a rare case! Record all such traits and consider them undecided.
- Attempt to resolve the sealing status of all traits through other means first: for example, if a trait is unconditionally sealed via another mechanism and separately may also be part of a cycle, the cycle is irrelevant. This may remove traits from the undecided set — that set ends up empty here for the vast majority of crates.
- If there are still undecided traits after exhausting all other means, run cycle-detection. This is expensive, but very few crates need it!
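The three steps above can be sketched as a toy model. This is a heavily simplified illustration, not the real implementation: it assumes each trait has at most one "gate" (the trait bounding the blanket impl of its pub-in-priv supertrait), that a gated trait is sealed exactly when its gate is, and that an unknown gate counts as unsealed. `TraitInfo`, `gated_by`, and `resolve_sealing` are all hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

/// Toy description of one trait; every name here is hypothetical.
pub struct TraitInfo {
    /// Sealed by a local mechanism, e.g. a pub-in-priv method argument.
    pub sealed_directly: bool,
    /// The trait bounding the blanket impl of this trait's
    /// pub-in-priv supertrait, if there is such a blanket impl.
    pub gated_by: Option<&'static str>,
}

pub fn resolve_sealing(
    traits: &HashMap<&'static str, TraitInfo>,
) -> HashMap<&'static str, bool> {
    let mut sealed: HashMap<&'static str, bool> = HashMap::new();
    let mut undecided: HashSet<&'static str> = HashSet::new();

    // Step 1: decide everything that does not hinge on a blanket impl,
    // and record the rest as undecided.
    for (&name, info) in traits {
        match info.gated_by {
            _ if info.sealed_directly => {
                sealed.insert(name, true);
            }
            Some(gate) if traits.contains_key(gate) => {
                undecided.insert(name);
            }
            // No gate, or an unknown gate (assumed unsealed in this sketch).
            _ => {
                sealed.insert(name, false);
            }
        }
    }

    // Step 2: propagate known answers along the gates until nothing changes.
    loop {
        let newly_decided: Vec<(&'static str, bool)> = undecided
            .iter()
            .filter_map(|&name| {
                let gate = traits[name].gated_by?;
                sealed.get(gate).map(|&gate_sealed| (name, gate_sealed))
            })
            .collect();
        if newly_decided.is_empty() {
            break;
        }
        for (name, is_sealed) in newly_decided {
            undecided.remove(name);
            sealed.insert(name, is_sealed);
        }
    }

    // Step 3: only traits that are still undecided pay for the cycle walk.
    for &start in &undecided {
        let mut seen = HashSet::new();
        let mut current = start;
        // Follow gates until a trait repeats, which confirms a cycle.
        while seen.insert(current) {
            current = traits[current]
                .gated_by
                .expect("undecided traits always have a gate");
        }
        // Traits on a cycle, or gated behind one, are treated as sealed.
        sealed.insert(start, true);
    }

    sealed
}
```

For a crate without blanket impls on supertraits, the undecided set is empty after step 1, so steps 2 and 3 do no work at all. That is the "common case fast" part of the design.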
This high-level explanation is meant to be illustrative, not exhaustive — the real implementation has even more nuances. Interested readers can find all the details (with plenty of code comments!) in the pull request.
Wrap-up: cargo-semver-checks is smarter than ever
The mission of cargo-semver-checks is fearless package management.
Publishing a new version of your library should be cause for celebration! You shouldn't have to worry about whether you accidentally shipped a breaking change, and you certainly shouldn't hear about breaking changes via a GitHub issue from a frustrated downstream user.
As a user, upgrading non-major versions of dependencies should be a non-event.
cargo update should succeed without errors 100% of the time. Not 99%, not 99.9%, but every single time for everyone in the Rust ecosystem.
We aren't there yet, and there's quite some way left to go.
But we're definitely on the right path, and cargo-semver-checks v0.40 is one big step forward.
If you liked this essay, consider subscribing to my blog or following me on Mastodon, Bluesky, or Twitter/X. You can also fund my writing and work on cargo-semver-checks via GitHub Sponsors, for which I'd be most grateful ❤
Discuss on r/rust.