[cfe-dev] making -ftrivial-auto-var-init=zero a first-class option

JF Bastien via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 21 20:26:46 PDT 2020



> On Apr 21, 2020, at 3:29 PM, Hubert Tong via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> On Tue, Apr 21, 2020 at 5:20 PM Kees Cook via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> Hi,
> 
> tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero an expressly
> supported option. To do this, I think we need to either entirely remove
> "-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"
> or rename it to something more directly reflecting the issue, like
> "-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".
> 
> This is currently open as https://bugs.llvm.org/show_bug.cgi?id=45497
> 
> Here is the situation: -ftrivial-auto-var-init=pattern is great for
> debugging, but -ftrivial-auto-var-init=zero is needed for production
> systems for mainly two reasons, each of which I will try to express context
> for:
> 
> 1) performance and size
> 
> As measured by various Google folks across a few projects and in
> various places, there's a fairly significant performance impact of
> using pattern-init over zero-init. I can let other folks chime in
> with their exact numbers, but I can at least share some measurements
> Alexander Potapenko made with the Linux kernel (see "Performance costs"):
> https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf
> tl;dr: zero-init tended to be half the cost of pattern-init, though it
> varied based on workload, and binary size impact fell over 95% going
> from pattern-init to zero-init.
> This does not seem to indicate why zero-init is preferred over a default of using no explicit policy in production.
>  
> 
> 2) security
> 
> Another driving factor (see below from various vendors/projects), is the
> security stance. Putting non-zero values into most variable types ends
> up making them arguably more dangerous than if they were zero-filled.
> Most notably, sizes and indexes are less likely to be used out of bounds
> if they are zero-initialized. The same holds for bool values that tend
> to indicate success instead of failing safe with a false value. While
> pointers in the non-canonical range are nice, zero tends to be just
> as good. There are certainly exceptions here, but the bulk of the
> historical record on how "uninitialized" variables have been used in
> Maybe an explanation of the scare quotes around "uninitialized" would help clarify your position.
>  
> real world exploitation involves their being non-zero, and analysis of
> those bugs supports that conclusion.
> 
> 
> Various positions from vendors and projects:
> 
> Google (Android, Chrome OS)
> 
> Both Android and Chrome OS initially started using pattern-init, but due
> to each of: the performance characteristics, the binary size changes, and
> the less robust security stance, both projects have recently committed
> to switching to zero-init.
> I'm not sure that this is clear in terms of whether the statements apply to debug/development or production. I don't think pattern-init is meant to be a tool for production builds, which leads me to think that the above statement is about debug builds, at which point I'm thinking that using zero-init only serves to hide problems.

The entire feature (including pattern init) is designed precisely to be a tool for production builds. It is in production at Apple as well as Google (not just their servers, but their devices as well), and I’ve heard the same (privately) from many others, big and small. I covered why in my LLVM dev meeting talk: https://www.youtube.com/watch?v=I-XUHPimq3o

I’ve also written this out in detail in prior discussions. I can repeat it here if needed, but I’m not sure more text is helpful at the moment :-)


> Microsoft (Windows)
> 
> I'm repeating what Joe Bialek has told me, so he can clarify if I'm not
> representing this correctly... While not using Clang/LLVM, Microsoft is
> part of the larger C/C++ ecosystem and has implemented both zero-init
> (for production builds) and pattern-init (for debug builds) in their
> compiler too. They also chose zero-init for production expressly due
> to the security benefits.
> 
> Some details of their work:
> https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf
> 
> 
> Upstream Linux kernel
> 
> Linus Torvalds has directly stated that he wants zero-init:
> "So I'd like the zeroing of local variables to be a native compiler
> option..."
> "This, btw, is why I also think that the "initialize with poison" is
> pointless and wrong."
> https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/
> Unsurprisingly, I strongly agree. ;)
> I don't see why claiming that pattern-init is bad helps make the case for zero-init.
>  
> 
> 
> GrapheneOS is using zero-init (rather than patching Clang as it used to, to get
> the same result):
> https://twitter.com/DanielMicay/status/1248384468181643272
> 
> 
> GCC
> There's been mostly silence on the entire topic of automatic variable
> initialization, though there have been patches proposed in the past for
> zero-init:
> https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html
> 
> 
> Apple
> 
> I can't speak meaningfully here, but I've heard rumors that they are
> depending on zero-init as well. Perhaps someone there can clarify how
> they are using these features?
> There's a difference between "depending on zero-init" (as in, the group in question is okay with relying on implicit zeroing on code reviews, etc.) and the use of zero-init as some sort of defence-in-depth approach. Are these rumours clear as to which?
>  
> 
> 
> 
> So, while I understand the earlier objections to zero-init from a
> "language fork" concern, I think this isn't a position that can really
> stand up to the reality of how many projects are using the feature (even
> via non-Clang compilers). Given that so much code is going to be built
> using zero-init, what's the best way for Clang to adapt here?
> It happens that there is zero-init and it's at least close enough to what these projects want, but is it actually what they need?
>  
> I would
> prefer to just drop the -enable... option entirely, but I think just
> renaming it would be fine too.
> 
> Thoughts/flames? ;)
> 
> -- 
> Kees Cook
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev