[cfe-dev] making -ftrivial-auto-var-init=zero a first-class option
Kees Cook via cfe-dev
cfe-dev at lists.llvm.org
Tue Apr 21 14:20:44 PDT 2020
tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero an expressly
supported option. To do this, I think we need to either entirely remove
or rename it to something more directly reflecting the issue, like
This is currently open as https://bugs.llvm.org/show_bug.cgi?id=45497
Here is the situation: -ftrivial-auto-var-init=pattern is great for
debugging, but -ftrivial-auto-var-init=zero is needed for production
systems for mainly two reasons, each of which I will try to give context for:
1) performance and size
As measured by various Google folks across a few projects and in
various places, there's a fairly significant performance impact of
using pattern-init over zero-init. I can let other folks chime in
with their exact numbers, but I can at least share some measurements
Alexander Potapenko made with the Linux kernel (see "Performance costs"):
tl;dr: zero-init tended to be half the cost of pattern-init, though it
varied based on workload, and binary size impact fell over 95% going
from pattern-init to zero-init.
2) security
Another driving factor (see below from various vendors/projects) is the
security stance. Putting non-zero values into most variable types ends
up making them arguably more dangerous than if they were zero-filled.
Most notably, sizes and indexes are less likely to be used out of bounds
if they are zero-initialized. The same holds for bool values, which when
non-zero tend to indicate success instead of failing safe with a false value. While
pointers in the non-canonical range are nice, zero tends to be just
as good. There are certainly exceptions here, but the bulk of the
historical record on how "uninitialized" variables have been used in
real-world exploitation involves their being non-zero, and analysis of
those bugs supports that conclusion.
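
To make the size/index and bool argument concrete, here is a rough sketch
in C. It does not rely on any compiler option; it only simulates what a
never-assigned local might hold by filling it with a single repeated byte.
The 0xAA fill is an assumption standing in for pattern-init (the actual
pattern is compiler- and type-dependent), and the struct, TABLE_LEN, and
all function names are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_LEN 16 /* hypothetical fixed-size table the index refers to */

/* Hypothetical state that a function forgot to assign before use. */
struct request {
    size_t index;      /* used to index a table of TABLE_LEN entries */
    int    authorized; /* non-zero means "allowed"; stands in for a bool */
};

/* Simulate an auto-initialized local by filling every byte of the object
 * with fill_byte: 0x00 models zero-init, 0xAA is an assumed stand-in for
 * pattern-init. Only these two fills are modeled here. */
static struct request simulate_auto_init(unsigned char fill_byte)
{
    struct request r;

    assert(fill_byte == 0x00 || fill_byte == 0xAA);
    memset(&r, fill_byte, sizeof r);
    return r;
}

/* Returns non-zero when the index can safely be used on the table. */
static int index_in_bounds(const struct request *r)
{
    return r->index < TABLE_LEN;
}

/* Returns non-zero when the flag reads as "success". */
static int is_authorized(const struct request *r)
{
    return r->authorized != 0;
}
```

With simulate_auto_init(0x00) the index is 0, which is in bounds, and the
flag reads as false, so the forgotten assignment fails safe. With
simulate_auto_init(0xAA) the index is a huge value far past TABLE_LEN and
the flag reads as true, which is the failure mode described above.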
Various positions from vendors and projects:
Google (Android, Chrome OS)
Both Android and Chrome OS initially started using pattern-init, but due
to each of: the performance characteristics, the binary size changes, and
the less robust security stance, both projects have recently committed
to switching to zero-init.
Microsoft
I'm repeating what Joe Bialek has told me, so he can clarify if I'm not
representing this correctly... While not using Clang/LLVM, Microsoft is
part of the larger C/C++ ecosystem and has implemented both zero-init
(for production builds) and pattern-init (for debug builds) in their
compiler too. They also chose zero-init for production expressly due
to the security benefits.
Some details of their work:
Upstream Linux kernel
Linus Torvalds has directly stated that he wants zero-init:
"So I'd like the zeroing of local variables to be a native compiler
"This, btw, is why I also think that the "initialize with poison" is
pointless and wrong."
Unsurprisingly, I strongly agree. ;)
GrapheneOS is using zero-init (rather than patching Clang as it used to, to get
the same result):
There's been mostly silence on the entire topic of automatic variable
initialization, though there have been patches proposed in the past for
I can't speak meaningfully here, but I've heard rumors that they are
depending on zero-init as well. Perhaps someone there can clarify how
they are using these features?
So, while I understand the earlier objections to zero-init from a
"language fork" concern, I think this isn't a position that can really
stand up to the reality of how many projects are using the feature (even
via non-Clang compilers). Given that so much code is going to be built
using zero-init, what's the best way for Clang to adapt here? I would
prefer to just drop the -enable... option entirely, but I think just
renaming it would be fine too.