[cfe-dev] [EXTERNAL] Re: making -ftrivial-auto-var-init=zero a first-class option

Tue Apr 21 16:34:52 PDT 2020

To add in, we (Microsoft) currently use zero initialization technology in Visual Studio in a large amount of production code we ship to customers (all kernel components, a number of user-mode components). This code is both C and C++.

We have already had multiple vulnerabilities killed because we shipped this technology in production. We received bug reports with repros that were run against both older versions of Windows without the mitigation and newer versions of Windows that have it. The new versions don't repro; the old ones do.

Using this sort of technology only in development (and not in production) is not sufficient. Some of these bugs will never produce crashes: the uninitialized data is copied across a trust boundary (e.g. from the kernel into an untrusted user-mode process). This never results in a crash but does result in a security issue. This is why shipping in production is a requirement even if you had perfect test coverage that exercises all code paths (which nobody has).
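
A minimal sketch of this kind of bug, in C. It is a simulation and not Windows code: `build_reply`, `leaked_bytes`, and the 8-byte reply layout are hypothetical names chosen for illustration, and the stale stack contents are modeled with an explicit fill so the example has defined behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte reply layout: 1 status byte, 3 padding bytes,
 * then a 4-byte length. The padding mimics what a compiler would insert. */
enum { REPLY_SIZE = 8, STATUS_OFF = 0, PAD_OFF = 1, PAD_LEN = 3, LENGTH_OFF = 4 };

/* Simulates a kernel handler that builds a reply in a reused stack slot and
 * copies it to the caller. `stale` stands in for whatever an earlier call
 * left in that slot; `zero_init` models the compiler zeroing the slot first. */
static void build_reply(unsigned char *user_out, unsigned char stale, int zero_init)
{
    unsigned char slot[REPLY_SIZE];
    uint32_t length = 42;

    assert(user_out != NULL);

    memset(slot, stale, sizeof slot);      /* leftover data from an earlier call */
    if (zero_init)
        memset(slot, 0, sizeof slot);      /* what automatic zero-init would do */

    slot[STATUS_OFF] = 1;                  /* the handler sets only the named fields */
    memcpy(slot + LENGTH_OFF, &length, sizeof length);

    memcpy(user_out, slot, sizeof slot);   /* the whole slot crosses the trust boundary */
}

/* Counts padding bytes in the reply that still carry the stale value. */
static int leaked_bytes(const unsigned char *reply, unsigned char stale)
{
    int n = 0;
    for (int i = PAD_OFF; i < PAD_OFF + PAD_LEN; i++)
        n += (reply[i] == stale);
    return n;
}
```

Neither variant crashes. Without the zeroing step the three padding bytes reach the caller with the stale value intact, which is why testing alone tends not to find this class of bug.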

We do enable pattern initialization for debug builds and internal retail builds (using a developer mode in the build environment). We do this to help prevent "forking of the language" and also to force non-determinism. If your code relies on the zero-init, it won't work when we do pattern init. If your code only works with a non-zero value but doesn't care what that value is (Booleans, certain bit tests, etc.), it won't work with zero-init. Developers cannot depend on the automatic initialization for program correctness.
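
To make the two failure modes concrete, here is a rough C sketch. It models the auto-init policy with an explicit fill byte instead of reading truly uninitialized memory; `auto_int`, `count_set_bits`, and `is_ready` are hypothetical, and 0xAA is an assumed pattern byte used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Fill bytes: 0x00 models zero-init; 0xAA is an assumed pattern byte. */
enum { FILL_ZERO = 0x00, FILL_PATTERN = 0xAA };

/* Models an automatic variable that the code never assigns on this path;
 * its value is whatever the auto-init policy put there. */
static int auto_int(unsigned char fill)
{
    int v;
    assert(fill == FILL_ZERO || fill == FILL_PATTERN);
    memset(&v, fill, sizeof v);
    return v;
}

/* Bug 1: relies on the counter starting at zero. */
static int count_set_bits(unsigned x, unsigned char fill)
{
    int count = auto_int(fill);   /* should have been `int count = 0;` */
    for (; x != 0; x >>= 1)
        count += (int)(x & 1u);
    return count;
}

/* Bug 2: `ready` is only used as a truth value and is never set, so any
 * non-zero garbage makes the check pass. */
static int is_ready(unsigned char fill)
{
    int ready = auto_int(fill);   /* should have been assigned explicitly */
    return ready != 0;
}
```

With the zero fill, `count_set_bits` happens to give the right answer and `is_ready` fails; with the pattern fill it is the other way around. Running both policies across build flavors is what exposes each bug.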

Joe

-----Original Message-----
From: Kees Cook <keescook at chromium.org> 
Sent: Tuesday, April 21, 2020 4:20 PM
To: Hubert Tong <hubert.reinterpretcast at gmail.com>
Cc: Clang Dev <cfe-dev at lists.llvm.org>; Joe Bialek <jobialek at microsoft.com>
Subject: [EXTERNAL] Re: [cfe-dev] making -ftrivial-auto-var-init=zero a first-class option

On Tue, Apr 21, 2020 at 06:29:07PM -0400, Hubert Tong wrote:
> On Tue, Apr 21, 2020 at 5:20 PM Kees Cook via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> > https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf
> > tl;dr: zero-init tended to be half the cost of pattern-init, though 
> > it varied based on workload, and binary size impact fell over 95% 
> > going from pattern-init to zero-init.
> >
> This does not seem to indicate why zero-init is preferred over a 
> default of using no explicit policy in production.

Do you mean "leaving things uninitialized" when you say "no explicit policy"? Maybe I've misunderstood. Google's goal of using auto-init is to eliminate uninitialized variables in production as a security defense. When examining zero-init vs pattern-init, there is a clear advantage on performance and size for zero-init.

> > as good. There are certainly exceptions here, but the bulk of the 
> > historical record on how "uninitialized" variables have been used in
> >
> Maybe an explanation of the scare quotes around "uninitialized" would 
> help clarify your position.

Ah, sorry, I always use quotes (they are not intended to scare but to clarify) when discussing uninitialized variables in real-world contexts, because they are, of course, not uninitialized in the sense of them not having a value. The RAM contents have a value. Many people without compiler backgrounds think of such variables as being uncontrollable or meaningless, when in fact they are usually highly controllable by an attacker, etc.

> > Google (Android, Chrome OS)
> >
> > Both Android and Chrome OS initially started using pattern-init, but 
> > due to each of: the performance characteristics, the binary size 
> > changes, and the less robust security stance, both projects have 
> > recently committed to switching to zero-init.
> >
> I'm not sure that this is clear in terms of whether the statements 
> apply to debug/development or production. I don't think pattern-init 
> is meant to be a tool for production builds, which leads me to think 
> that the above statement is about debug builds, at which point I'm 
> thinking that using zero-init only serves to hide problems.

The context for Google's use of zero-init was meant here to be about production builds.

> > Upstream Linux kernel
> >
> > Linus Torvalds has directly stated that he wants zero-init:
> > "So I'd like the zeroing of local variables to be a native compiler 
> > option..."
> > "This, btw, is why I also think that the "initialize with poison" is 
> > pointless and wrong."
> >
> > https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/
> > Unsurprisingly, I strongly agree. ;)
> >
> I don't see why claiming that pattern-init is bad helps make the case 
> for zero-init.

Perhaps I did not express it well enough, but both have meaningful and important uses. My goal here is to illustrate how zero-init is being used (or preferred) in many real situations, as an argument for why it should not be hidden behind what some might see as a scary enable flag.
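
For anyone who has not used the feature, a rough sketch of what this looks like in practice with Clang as of this thread. The file name `checksum.c`, the `checksum` helper, and the `NO_AUTO_INIT` macro are hypothetical; the flags and the `uninitialized` attribute are the existing Clang ones.

```c
/* Build with zero-init (Clang, at the time of this thread):
 *   clang -O2 -ftrivial-auto-var-init=zero \
 *     -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang \
 *     -c checksum.c
 * Pattern-init needs no extra enable flag:
 *   clang -O2 -ftrivial-auto-var-init=pattern -c checksum.c
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Opt a single variable out of auto-init where the attribute is available;
 * expands to nothing on compilers that do not know it. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define NO_AUTO_INIT __attribute__((uninitialized))
#  endif
#endif
#ifndef NO_AUTO_INIT
#  define NO_AUTO_INIT
#endif

enum { SCRATCH_SIZE = 256 };

/* Hypothetical hot-path helper: copies up to SCRATCH_SIZE bytes into a
 * scratch buffer and sums them. Only bytes written by memcpy are read,
 * so skipping auto-init for `scratch` is safe here. */
static unsigned checksum(const unsigned char *data, size_t len)
{
    unsigned char scratch[SCRATCH_SIZE] NO_AUTO_INIT;
    unsigned sum = 0;

    assert(data != NULL);
    if (len > sizeof scratch)
        len = sizeof scratch;
    memcpy(scratch, data, len);
    for (size_t i = 0; i < len; i++)
        sum += scratch[i];
    return sum;
}
```

The per-variable opt-out is what keeps the performance cost manageable when the option is on for a whole build: everything is initialized by default, and only audited buffers are exempted.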

> > Apple
> >
> > I can't speak meaningfully here, but I've heard rumors that they are 
> > depending on zero-init as well. Perhaps someone there can clarify 
> > how they are using these features?
> >
> There's a difference between "depending on zero-init" (as in, the 
> group in question is okay with relying on implicit zeroing on code 
> reviews, etc.) and the use of zero-init as some sort of 
> defence-in-depth approach. Are these rumours clear as to which?

My understanding was the latter, but I hope to find out for real via this thread! :) It's not clear to me either.

> > So, while I understand the earlier objections to zero-init from a 
> > "language fork" concern, I think this isn't a position that can 
> > really stand up to the reality of how many projects are using the 
> > feature (even via non-Clang compilers). Given that so much code is 
> > going to be built using zero-init, what's the best way for Clang to adapt here?
> 
> It happens that there is zero-init and it's at least close enough to 
> what these projects want, but is it actually what they need?

Yes, it's expressly what is desired from a security perspective. (And quite to the relief of that same community, comes with the least performance impact, which is an unfortunately uncommon scenario in security flaw mitigations.) I tried to detail that earlier in my email where it's directly what is indicated as a meaningful defense against the long history of real-world "uninitialized" variable attacks: setting everything to zero is the best defense for the entire class of flaws.

--
Kees Cook