[cfe-dev] making -ftrivial-auto-var-init=zero a first-class option

Philip Reames via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 21 16:48:34 PDT 2020


On 4/21/20 3:11 PM, Richard Smith via cfe-dev wrote:
> What you're proposing is, without question, a language extension. Our 
> policy on language extensions is documented here: 
> http://clang.llvm.org/get_involved.html
>
> Right now, this fails at point 4. We do not want to create or 
> encourage the creation of language dialects and non-portable code, so 
> the place to have this discussion is in the C and C++ committees. Both 
> committees have processes for specifying optional features these days, 
> and they might be amenable to using those processes to standardize the 
> behavior you're asking for. (I mean, maybe not, but our policy 
> requires that you at least try.)
>
> However, there is a variant on what you're proposing that might fare 
> better: instead of guaranteeing zero-initialization, we could 
> guarantee that any observation of an uninitialized variable *either* 
> produces zero or results in a trap. That is: it's still
> undefined to read from uninitialized variables -- we still do not 
> guarantee what will happen if you do, and will warn on uninitialized 
> uses and so on -- but we would bound the damage that can result from 
> such accesses. You would get the security hardening benefits with the 
> modest binary size impact. That approach would not introduce the risk 
> of creating a language dialect (at least, not to the same extent), so 
> our policy on avoiding language extensions would not apply.

Richard, just to check here, it sounds to me like you're raising more a 
point of specification than of implementation, right?  That is, you're 
not stating that the actual implementation must sometimes trap (when 
producing a zero wouldn't), but that the specification of the flags and 
docs must leave that possibility open?

If I'm completely misinterpreting, please just say so.  I don't want to 
start a tangent discussion here; I just spotted what sounds like it could 
be a "quick fix" which lets the OP achieve their objective, and wanted to 
call it out if in fact I'd read it correctly.

Philip

>
> On Tue, 21 Apr 2020 at 14:21, Kees Cook via cfe-dev 
> <cfe-dev at lists.llvm.org> wrote:
>
>     Hi,
>
>     tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero an
>     expressly supported option. To do this, I think we need to either
>     entirely remove
>     "-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"
>     or rename it to something more directly reflecting the issue, like
>     "-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".
>
>     This is currently open as https://bugs.llvm.org/show_bug.cgi?id=45497
>
>     Here is the situation: -ftrivial-auto-var-init=pattern is great for
>     debugging, but -ftrivial-auto-var-init=zero is needed for production
>     systems for two main reasons, each of which I will try to give
>     context for:
>
>     1) performance and size
>
>     As measured by various Google folks across a few projects and in
>     various places, there's a fairly significant performance impact of
>     using pattern-init over zero-init. I can let other folks chime in
>     with their exact numbers, but I can at least share some measurements
>     Alexander Potapenko made with the Linux kernel (see "Performance
>     costs"):
>     https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf
>     tl;dr: zero-init tended to be half the cost of pattern-init, though it
>     varied based on workload, and binary size impact fell over 95% going
>     from pattern-init to zero-init.
>
>     2) security
>
>     Another driving factor (see below from various vendors/projects) is
>     the security stance. Putting non-zero values into most variable types
>     ends up making them arguably more dangerous than if they were
>     zero-filled. Most notably, sizes and indexes are less likely to be
>     used out of bounds if they are zero-initialized. The same holds for
>     bool values that tend to indicate success instead of failing safe
>     with a false value. While pointers in the non-canonical range are
>     nice, zero tends to be just as good. There are certainly exceptions
>     here, but the bulk of the historical record on how "uninitialized"
>     variables have been used in real-world exploitation involves their
>     being non-zero, and analysis of those bugs supports that conclusion.
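
As a rough illustration of the sizes/indexes and bool point above, here is
a minimal C sketch. It does not read real uninitialized memory (that would
be undefined behavior); instead it uses memset to model what a
never-assigned local would hold under each fill. The 0xAA fill byte,
TABLE_LEN, and the helper names are assumptions made for this example, not
something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TABLE_LEN 16 /* hypothetical array length for the bounds check */

/* Returns true when idx can safely index a TABLE_LEN-element array. */
static bool index_in_bounds(size_t idx) {
    return idx < TABLE_LEN;
}

/* Models the size_t a never-assigned local would hold if every byte were
 * pre-filled with `fill`: 0x00 stands for zero-init, 0xAA is an assumed
 * stand-in for pattern-init. */
static size_t filled_size(unsigned char fill) {
    size_t v;
    memset(&v, fill, sizeof v);
    return v;
}

/* Same idea for a one-byte status flag read as "nonzero means success". */
static bool filled_flag(unsigned char fill) {
    unsigned char raw;
    memset(&raw, fill, sizeof raw);
    return raw != 0;
}
```

With these helpers, index_in_bounds(filled_size(0x00)) is true while
index_in_bounds(filled_size(0xAA)) is false, and filled_flag(0xAA) reads
as success where filled_flag(0x00) fails safe.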
>
>
>     Various positions from vendors and projects:
>
>     Google (Android, Chrome OS)
>
>     Both Android and Chrome OS initially started using pattern-init, but
>     due to each of: the performance characteristics, the binary size
>     changes, and the less robust security stance, both projects have
>     recently committed to switching to zero-init.
>
>
>     Microsoft (Windows)
>
>     I'm repeating what Joe Bialek has told me, so he can clarify if I'm
>     not representing this correctly... While not using Clang/LLVM,
>     Microsoft is part of the larger C/C++ ecosystem and has implemented
>     both zero-init (for production builds) and pattern-init (for debug
>     builds) in their compiler too. They also chose zero-init for
>     production expressly due to the security benefits.
>
>     Some details of their work:
>     https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf
>
>
>     Upstream Linux kernel
>
>     Linus Torvalds has directly stated that he wants zero-init:
>     "So I'd like the zeroing of local variables to be a native compiler
>     option..."
>     "This, btw, is why I also think that the "initialize with poison" is
>     pointless and wrong."
>     https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/
>     Unsurprisingly, I strongly agree. ;)
>
>
>     GrapheneOS is using zero-init (rather than patching Clang, as it used
>     to, to get the same result):
>     https://twitter.com/DanielMicay/status/1248384468181643272
>
>
>     GCC
>     There's been mostly silence on the entire topic of automatic variable
>     initialization, though there have been patches proposed in the past
>     for zero-init:
>     https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html
>
>
>     Apple
>
>     I can't speak meaningfully here, but I've heard rumors that they are
>     depending on zero-init as well. Perhaps someone there can clarify how
>     they are using these features?
>
>
>
>     So, while I understand the earlier objections to zero-init from a
>     "language fork" concern, I think this isn't a position that can really
>     stand up to the reality of how many projects are using the feature
>     (even via non-Clang compilers). Given that so much code is going to be
>     built using zero-init, what's the best way for Clang to adapt here? I
>     would prefer to just drop the -enable... option entirely, but I think
>     just renaming it would be fine too.
>
>     Thoughts/flames? ;)
>
>     -- 
>     Kees Cook
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>