[cfe-dev] Proposal: Managing ABI changes in libc++

Fri Dec 19 18:28:01 PST 2014

First, sorry for the delay. I do owe you feedback here, though, and then I'll
go look at the patch. =]

On Mon, Dec 8, 2014 at 7:42 AM, Marshall Clow <mclow.lists at gmail.com> wrote:

> In general, we try to avoid making changes to the ABI for libc++.
> ABI changes can lead to subtle, hard-to-find bugs when part of a piece of
> software (a dylib or static library, say) is built to the old ABI, and the
> rest to the new ABI. People have been burned in the past by inadvertent
> changes to the libc++ ABI (not to be confused with the libc++abi project).
>
> Eric Fiselier has been working on a tool to detect ABI changes, so that
> (hopefully) all future changes will be intentional.
>
> ABI-breaking changes can include things like:
>         * Changes to structures (sizes, layout)
>         * Addition/removal of virtual functions (vtable layouts)
>         * Changes to template parameters (addition, removal)
>
> Also, there are times that a change to the standard will mandate an ABI
> change. I tend to argue against those in the committee meetings, but I
> don’t always get my way.
>
> In the LLVM community, there are two differing opinions about changes in
> libc++ that are ABI-breaking. Broadly speaking:
>
> a) There are the people who ship libc++ in production systems, who say:
> Whoa! Don’t do that! Ever! (or at least “let us decide when”).
>
> b) There are the people who use libc++ internally, who say: Is it faster?
> Does it work better? Do it!
>

(FWIW, there are also people that support users in both camps (a) and (b).
I'm one of those.)

>
> === Proposal ===
>
> Goals:
> 1) Make the default be “ABI is stable” (modulo changes in the C++ standard)
> 2) Make it possible for people to propose (and use) ABI-breaking changes
> for libc++, and have them live in tree.
> Note: This would make it possible, not trivial. We still want to avoid
> gratuitously changing the ABI.
>
> Concrete steps:
> 1) Give each ABI-breaking change its own “enabling macro” that starts
> with “_LIBCPP_ABI_”
>
> We have an example of this today. There are two different
> std::basic_string layouts defined in <string>, and the
> second (ABI changing) one is controlled by the macro
> _LIBCPP_ALTERNATE_STRING_LAYOUT
>
> Under my proposal, I would change this to
> _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT, and keep the old name as a synonym.
>
> 2) Create a global macro “_LIBCPP_ABI_UNSTABLE” which, when defined, turns
> on ALL of the _LIBCPP_ABI_* changes.
>
> Adding a new, ABI-incompatible change to the library would consist of:
> * Choosing an enabling macro name of the form _LIBCPP_ABI_XXXXXXX
> * Wrapping the code in #ifdef _LIBCPP_ABI_XXXXXXX
> * Enabling the macro if _LIBCPP_ABI_UNSTABLE is defined.
>
> I think that this convention will make it possible for both camps ((a) and
> (b) above) to coexist in the same code base.
>
> Comments?
>
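
The convention quoted above could be sketched roughly as follows. This is only
an illustration of the three steps, not the layout of the actual libc++
headers: _LIBCPP_ABI_EXAMPLE_CHANGE, example_node, and example_change_enabled
are hypothetical names invented here, and only the string-layout macros and
_LIBCPP_ABI_UNSTABLE come from the proposal itself.

```cpp
// Sketch only: the real libc++ configuration header may be organized differently.
// _LIBCPP_ABI_EXAMPLE_CHANGE and example_node are hypothetical names.

// Step 2 of the proposal: one global switch turns on every per-change macro.
#if defined(_LIBCPP_ABI_UNSTABLE)
#  if !defined(_LIBCPP_ABI_ALTERNATE_STRING_LAYOUT)
#    define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
#  endif
#  if !defined(_LIBCPP_ABI_EXAMPLE_CHANGE)
#    define _LIBCPP_ABI_EXAMPLE_CHANGE
#  endif
#endif

// Keep the pre-existing macro name working as a synonym for the new one.
#if defined(_LIBCPP_ALTERNATE_STRING_LAYOUT) && \
    !defined(_LIBCPP_ABI_ALTERNATE_STRING_LAYOUT)
#  define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
#endif

// Step 1: the ABI-breaking code is wrapped in its own enabling macro.
struct example_node {
#ifdef _LIBCPP_ABI_EXAMPLE_CHANGE
    void* next_;   // new layout: pointer first
    int   value_;
#else
    int   value_;  // old layout, kept as the stable default
    void* next_;
#endif
};

// Exposes which branch was selected, so the choice can be checked at compile time.
#ifdef _LIBCPP_ABI_EXAMPLE_CHANGE
constexpr bool example_change_enabled = true;
#else
constexpr bool example_change_enabled = false;
#endif
```

With no macros defined, the old layout is selected; compiling with
-D_LIBCPP_ABI_UNSTABLE would select the new one along with every other
guarded change.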

As far as this goes, I'm 100% in favor.

I think there are two more ABI concerns that we should really figure out a
plan for now in order to ensure they fit cohesively with the whole.

1) I think we need a way to more quickly roll standard-mandated or bug-fix
ABI breaks into something much more stable than "unstable".
2) I think we need to figure out how to maintain at least two stable ABIs
at the same time.

I'll expand a bit below.

A terminology point: when I say a "minor" or "major" ABI break, I am not
classifying the *nature* of the break, but the *scope*. Changing the layout
of std::string has a radically different scope in its impact than fixing
the return type of an infrequently used function, for example.

For (1), let's consider two cases.
1.1) We introduce an ABI-significant bug and need to fix it. What do we do?
This is exacerbated when the bug has shipped to customers. Some users of
libc++ update *very* rapidly, and so even with a very narrow window of
fallout in-tree, it would be advantageous in my opinion to have a
non-silent way to fix these issues. Note that this only really applies to
minor ABI breaks. A massive breaking change would be sufficiently
disruptive to warrant more extreme measures and I certainly hope i
1.2) The standard changes in some *minor* way that necessitates an ABI
break. Note that I'm not talking about "C++1z requires a whole new ABI"
kind of break, I'm talking about the standard equivalent to 1.1 -- we ship
a bug, we fix it, but it requires some small ABI break.

In both of these cases, I think the right thing for the default
configuration of libc++ is to make the change and take the ABI break. I
think we should be standard conforming and correct above all else out of
the box. But I think it is important to provide some mechanism to opt *out*
of such changes to the ABI, at least in order to control when they arrive
in systems that are very susceptible to ABI fallout.

For (2), my motivation is to chart out a path forward, likely measured in
years if not decades. This would include the ability to follow any massive
upheaval in the standard's ABI, as well as the ability to pick up
improvements which go in under the unstable bucket after a reasonable
interval and in a way that platform vendors are comfortable with. I picked
"two" specifically for a reason. I think we should be able to create a new
stable major ABI without changing the default while customers test and
evaluate it, etc. Then we should be able to switch the default at some
point without ever touching the old ABI. Finally, after some lengthy period
(likely also measured in years if not decades) we should be able to remove
the old ABI and start the process again. When Howard first discussed the
time scale at which this kind of breaking change could possibly be
acceptable to customers, he used decades. I'm echoing that, as it matches
my experience with customers that have hard ABI requirements.

So, here is my initial proposal for how to handle the above two issues.

First, extend the ABI definitions to include what they already (somewhat)
do: versions. Specifically, both minor and major versions to handle the
above two use cases respectively. We already sort-of have this for the
stable ABI, I just think we should formalize it, document it, and
incorporate it into the macro naming convention.

The resulting pattern would be that when a bug-fix or minor
standards-motivated change is introduced which breaks the ABI, it too is
guarded behind an ABI macro, but that macro is by-default enabled. A new
high-level macro (I'm trying to avoid picking names here, I suspect
Marshall will pick better ones than I will) would be introduced to restrict
libc++ to the prior minor ABI version, and that macro would disable the
bug-fix. If at some point a contributor wants to build a new major version
stable ABI out of a subset of the unstable changes in the tree (and
everyone agrees that is reasonable to do), then the expected new high-level
macros for those versions would be introduced, and the per-feature
ABI-breaking macros would be flipped based on them.
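
A minimal sketch of that pattern, under stated assumptions: every name below
(EXAMPLE_ABI_MAJOR, EXAMPLE_ABI_MINOR, the two per-feature macros, and the
constexpr flags) is made up for illustration, since no names are being
proposed here, and the version numbers are arbitrary.

```cpp
// All identifiers in this block are hypothetical; none are libc++ names.

// High-level knobs a user would interact with: a major and a minor ABI version.
#ifndef EXAMPLE_ABI_MAJOR
#  define EXAMPLE_ABI_MAJOR 1   // current stable major ABI stays the default
#endif
#ifndef EXAMPLE_ABI_MINOR
#  define EXAMPLE_ABI_MINOR 1   // newest minor revision is the default
#endif

// Minor break (bug fix): enabled by default, disabled only when the user
// pins the build to the prior minor version with -DEXAMPLE_ABI_MINOR=0.
#if EXAMPLE_ABI_MAJOR > 1 || EXAMPLE_ABI_MINOR >= 1
#  define EXAMPLE_ABI_FIX_RETURN_TYPE
#endif

// Major break: a change promoted out of the unstable bucket is switched on
// only when the new major version is requested.
#if EXAMPLE_ABI_MAJOR >= 2
#  define EXAMPLE_ABI_NEW_STRING_LAYOUT
#endif

// The per-feature macros still guard the code itself.
#ifdef EXAMPLE_ABI_FIX_RETURN_TYPE
using count_result = unsigned long;  // corrected return type
constexpr bool fix_enabled = true;
#else
using count_result = int;            // original (buggy) return type
constexpr bool fix_enabled = false;
#endif

#ifdef EXAMPLE_ABI_NEW_STRING_LAYOUT
constexpr bool new_layout_enabled = true;
#else
constexpr bool new_layout_enabled = false;
#endif

// A function whose signature depends on the selected minor version.
inline count_result count_items() { return 0; }
```

In this sketch the user only ever sets the two version numbers; the
per-feature macros are derived from them and never need to be set directly.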

Does this make sense to folks?

One goal I have throughout this is that we *have* per-feature ABI-break
macros, but that users essentially never need to use them. I would like to
have something closer to version numbers that they interact with in order
to request specific sets of features that can be documented together.

Anyways, my slightly-more-than-2-cents on the overarching proposal. Thanks
for working on this, Marshall. I know I've been clamoring for it without
working on it for quite some time. =]