<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 22, 2020 at 9:54 AM Philip Reames via cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div>

    <p><br>

    </p>

    <div>On 4/21/20 4:59 PM, Richard Smith

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div dir="ltr">On Tue, 21 Apr 2020 at 16:49, Philip Reames via

          cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>&gt; wrote:<br>

        </div>

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div>

              <p>On 4/21/20 3:11 PM, Richard Smith via cfe-dev wrote:<br>

              </p>

              <blockquote type="cite">

                <div dir="ltr">What you're proposing is, without

                  question, a language extension. Our policy on language

                  extensions is documented here: <a href="http://clang.llvm.org/get_involved.html" target="_blank">http://clang.llvm.org/get_involved.html</a>

                  <div><br>

                  </div>

                  <div>Right now, this fails at point 4. We do not want

                    to create or encourage the creation of language

                    dialects and non-portable code, so the place to have

                    this discussion is in the C and C++ committees. Both

                    committees have processes for specifying optional

                    features these days, and they might be amenable to

                    using those processes to standardize the behavior

                    you're asking for. (I mean, maybe not, but our

                    policy requires that you at least try.)</div>

                  <div><br>

                  </div>

                  <div>However, there is a variant on what you're

                    proposing that might fare better: instead of

                    guaranteeing zero-initialization, we could guarantee

                    that any observation of an uninitialized variable

                    *either* produces zero or results in a trap.

                    That is: it's still undefined to read from

                    uninitialized variables -- we still do not guarantee

                    what will happen if you do, and will warn on

                    uninitialized uses and so on -- but we would bound

                    the damage that can result from such accesses. You

                    would get the security hardening benefits with the

                    modest binary size impact. That approach would not

                    introduce the risk of creating a language dialect

                    (at least, not to the same extent), so our policy on

                    avoiding language extensions would not apply.</div>
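                  <div><br>
                  </div>
                  <div>To make the distinction concrete, here is a minimal
                    sketch in C. The function names and the
                    SHOW_UNINITIALIZED_VARIANT macro are hypothetical, chosen
                    only for illustration. The first function (compiled out by
                    default) reads a variable that is never assigned on one
                    path; under the variant described above that read would
                    stay undefined, and a hardened build would be permitted
                    either to produce 0 or to trap. The second function writes
                    the zero in the source, which is the form that does not
                    depend on any compiler flag.</div>

```c
#include <stddef.h>

#ifdef SHOW_UNINITIALIZED_VARIANT
/* Hypothetical non-portable variant: 'count' has no initializer and is only
 * assigned when 'items' is non-NULL. Reading it on the NULL path is undefined
 * behavior; a "zero or trap" guarantee would only bound what can happen. */
size_t count_positive_unportable(const int *items, size_t n) {
    size_t count;
    if (items != NULL) {
        count = 0;
        for (size_t i = 0; i < n; ++i) {
            if (items[i] > 0) {
                ++count;
            }
        }
    }
    return count; /* uninitialized when items == NULL */
}
#endif

/* Portable variant: the initial value is explicit, so the result is the same
 * with or without any automatic-variable initialization option. */
size_t count_positive(const int *items, size_t n) {
    size_t count = 0;
    if (items != NULL) {
        for (size_t i = 0; i < n; ++i) {
            if (items[i] > 0) {
                ++count;
            }
        }
    }
    return count;
}
```

                  <div>Code written like the first function is what a
                    "zero or trap" specification would leave unsupported, even
                    if a given build happens to produce 0 for it.</div>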

                </div>

              </blockquote>

              <p>Richard, just to check here, it sounds to me like

                you're raising more a point of specification than of

                implementation, right?  That is, you're not stating that

                the actual implementation must sometimes trap (when

                producing a zero wouldn't), but that the specification

                of the flags and docs must leave that possibility

                open?</p>

            </div>

          </blockquote>

          <div>Well, I think it's not sufficient to merely say that we

            might do something like trap, if our intent is that we never

            will. We would need to reasonably agree that (for example)

            if someone came forward with a patch that actually

            implemented said trapping behavior and didn't introduce any

            significant code size or performance impact, that we would

            consider such a change to be a quality of implementation

            improvement. But I don't think we need anyone to have

            actually committed themselves to producing such a patch, or

            any timeline or expectation of when (or indeed whether) it

            would be done. Sorry if this is splitting a hair, but I

            think it's an important hair to split. <br>

          </div>

        </div>

      </div>

    </blockquote>

    Hair successfully split.  I agree it is a key distinction.<br></div></blockquote><div><br>Personally, I find that a bit too fine (though I wouldn't stand in the way of the decision) and would prefer at least a rudimentary trapping behavior, so that really obvious, intentional uses of zero-init would fail, making it hard for anyone to develop a coding convention or style around it. It wouldn't need to be fancy at all, that being the point: make it as easy and unintrusive to implement as possible while catching the most blatant uses of zero-init. Then, if someone tried to write code against a zero-init language fork, their most obvious and common code would fail, and they'd need enough contortions that it would be hard to argue it was an intentional, consistent way to write code: "well, we explicitly zero-init /these/ simple cases, but rely on compiler-derived zero-init in the more complicated cases where we can (currently) get away with it..." </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div>

              <p>If I'm completely misinterpreting, please just say so.  I

                don't want to start a tangent discussion here, I just

                spotted what sounds like it could be a "quick fix" which

                lets the OP achieve their objective and wanted to call it

                out if in fact I'd read correctly.</p>

              <p>Philip<br>

              </p>

              <blockquote type="cite"><br>

                <div class="gmail_quote">

                  <div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020

                    at 14:21, Kees Cook via cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>&gt;

                    wrote:<br>

                  </div>

                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>

                    <br>

                    tl;dr: I'd like to revisit making

                    -ftrivial-auto-var-init=zero an expressly<br>

                    supported option. To do this, I think we need to

                    either entirely remove<br>

"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>

                    or rename it to something more directly reflecting

                    the issue, like<br>

"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>

                    <br>

                    This is currently open as <a href="https://bugs.llvm.org/show_bug.cgi?id=45497" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>

                    <br>

                    Here is the situation:

                    -ftrivial-auto-var-init=pattern is great for<br>

                    debugging, but -ftrivial-auto-var-init=zero is

                    needed for production<br>

                    systems for mainly two reasons, each of which I will

                    try to express context<br>

                    for:<br>

                    <br>

                    1) performance and size<br>

                    <br>

                    As measured by various Google folks across a few

                    projects and in<br>

                    various places, there's a fairly significant

                    performance impact of<br>

                    using pattern-init over zero-init. I can let other

                    folks chime in<br>

                    with their exact numbers, but I can at least share

                    some measurements<br>

                    Alexander Potapenko made with the Linux kernel (see

                    "Performance costs"):<br>

                    <a href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf" rel="noreferrer" target="_blank">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>

                    tl;dr: zero-init tended to be half the cost of

                    pattern-init, though it<br>

                    varied based on workload, and binary size impact

                    fell over 95% going<br>

                    from pattern-init to zero-init.<br>

                    <br>

                    2) security<br>

                    <br>

                    Another driving factor (see below from various

                    vendors/projects), is the<br>

                    security stance. Putting non-zero values into most

                    variable types ends<br>

                    up making them arguably more dangerous than if they

                    were zero-filled.<br>

                    Most notably, sizes and indexes are less likely to

                    be used out of bounds<br>

                    if they are zero-initialized. The same holds for

                    bool values that tend<br>

                    to indicate success instead of failing safe with a

                    false value. While<br>

                    pointers in the non-canonical range are nice, zero

                    tends to be just<br>

                    as good. There are certainly exceptions here, but

                    the bulk of the<br>

                    historical record on how "uninitialized" variables

                    have been used in<br>

                    real-world exploitation involves their being

                    non-zero, and analysis of<br>

                    those bugs supports that conclusion.<br>
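                    <br>
                    As a rough illustration of the size/index point, the
                    following C sketch builds the value a size_t would hold if<br>
                    every byte of its storage were filled with a given byte,
                    then checks whether that value would be a safe copy length<br>
                    for a small buffer. The names (BUF_CAP, filled_size,
                    length_stays_in_bounds) are hypothetical, and 0xAA is only<br>
                    assumed here as a stand-in for a non-zero fill byte; the
                    actual pattern is an implementation detail of the compiler.<br>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer capacity used for the bounds check below. */
#define BUF_CAP ((size_t)16)

/* Return the size_t value obtained when every byte of the variable's storage
 * is set to 'fill'. 0x00 models zero-init; 0xAA is an assumed stand-in for a
 * non-zero pattern fill. */
size_t filled_size(unsigned char fill) {
    size_t value;
    memset(&value, fill, sizeof value);
    return value;
}

/* Return 1 if 'len' is a safe copy length for a BUF_CAP-byte buffer,
 * 0 otherwise. */
int length_stays_in_bounds(size_t len) {
    return len <= BUF_CAP;
}
```

                    With a 0x00 fill the length is 0 and stays in bounds; with
                    a repeated non-zero byte it is far larger than BUF_CAP.<br>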

                    <br>

                    <br>

                    Various positions from vendors and projects:<br>

                    <br>

                    Google (Android, Chrome OS)<br>

                    <br>

                    Both Android and Chrome OS initially started using

                    pattern-init, but due<br>

                    to each of: the performance characteristics, the

                    binary size changes, and<br>

                    the less robust security stance, both projects have

                    recently committed<br>

                    to switching to zero-init.<br>

                    <br>

                    <br>

                    Microsoft (Windows)<br>

                    <br>

                    I'm repeating what Joe Bialek has told me, so he can

                    clarify if I'm not<br>

                    representing this correctly... While not using

                    Clang/LLVM, Microsoft is<br>

                    part of the larger C/C++ ecosystem and has

                    implemented both zero-init<br>

                    (for production builds) and pattern-init (for debug

                    builds) in their<br>

                    compiler too. They also chose zero-init for

                    production expressly due<br>

                    to the security benefits.<br>

                    <br>

                    Some details of their work:<br>

                    <a href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf" rel="noreferrer" target="_blank">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>

                    <br>

                    <br>

                    Upstream Linux kernel<br>

                    <br>

                    Linus Torvalds has directly stated that he wants

                    zero-init:<br>

                    "So I'd like the zeroing of local variables to be a

                    native compiler<br>

                    option..."<br>

                    "This, btw, is why I also think that the "initialize

                    with poison" is<br>

                    pointless and wrong."<br>

                    <a href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>

                    Unsurprisingly, I strongly agree. ;)<br>

                    <br>

                    <br>

                    GrapheneOS is using zero-init (rather than patching

                    Clang as it used to, to get<br>

                    the same result):<br>

                    <a href="https://twitter.com/DanielMicay/status/1248384468181643272" rel="noreferrer" target="_blank">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>

                    <br>

                    <br>

                    GCC<br>

                    There's been mostly silence on the entire topic of

                    automatic variable<br>

                    initialization, though there have been patches

                    proposed in the past for<br>

                    zero-init:<br>

                    <a href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html" rel="noreferrer" target="_blank">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>

                    <br>

                    <br>

                    Apple<br>

                    <br>

                    I can't speak meaningfully here, but I've heard

                    rumors that they are<br>

                    depending on zero-init as well. Perhaps someone

                    there can clarify how<br>

                    they are using these features?<br>

                    <br>

                    <br>

                    <br>

                    So, while I understand the earlier objections to

                    zero-init from a<br>

                    "language fork" concern, I think this isn't a

                    position that can really<br>

                    stand up to the reality of how many projects are

                    using the feature (even<br>

                    via non-Clang compilers). Given that so much code is

                    going to be built<br>

                    using zero-init, what's the best way for Clang to

                    adapt here? I would<br>

                    prefer to just drop the -enable... option entirely,

                    but I think just<br>

                    renaming it would be fine too.<br>

                    <br>

                    Thoughts/flames? ;)<br>

                    <br>

                    -- <br>

                    Kees Cook<br>

                    _______________________________________________<br>

                    cfe-dev mailing list<br>

                    <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

                    <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

                  </blockquote>

                </div>

                <br>


              </blockquote>

            </div>


          </blockquote>

        </div>

      </div>

    </blockquote>

  </div>


</blockquote></div></div>