<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>


    <div class="moz-cite-prefix">On 4/21/20 3:11 PM, Richard Smith via

      cfe-dev wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">What you're proposing is, without question, a

        language extension. Our policy on language extensions is

        documented here: <a

          href="http://clang.llvm.org/get_involved.html"

          moz-do-not-send="true">http://clang.llvm.org/get_involved.html</a>

        <div><br>

        </div>

        <div>Right now, this fails at point 4. We do not want to create

          or encourage the creation of language dialects and

          non-portable code, so the place to have this discussion is in

          the C and C++ committees. Both committees have processes for

          specifying optional features these days, and they might be

          amenable to using those processes to standardize the behavior

          you're asking for. (I mean, maybe not, but our policy requires

          that you at least try.)</div>

        <div><br>

        </div>

        <div>However, there is a variant on what you're proposing that

          might fare better: instead of guaranteeing

          zero-initialization, we could guarantee that any observation

          of an uninitialized variable *either* produces zero or

          results in a trap. That is: it's still undefined to read from

          uninitialized variables -- we still do not guarantee what will

          happen if you do, and will warn on uninitialized uses and so

          on -- but we would bound the damage that can result from such

          accesses. You would get the security hardening benefits with

          the modest binary size impact. That approach would not

          introduce the risk of creating a language dialect (at least,

          not to the same extent), so our policy on avoiding language

          extensions would not apply.</div>

      </div>

    </blockquote>

    <p>Richard, just to check: it sounds to me like you're raising
      more a point of specification than of implementation, right? That
      is, you're not saying that the actual implementation must
      sometimes trap (where producing a zero wouldn't), only that the
      specification of the flags and the docs must leave that
      possibility open?</p>

    <p>If I'm completely misinterpreting, please just say so. I don't
      want to start a tangent here; I just spotted what sounds
      like it could be a "quick fix" that lets the OP achieve their
      objective, and wanted to call it out in case I'd read it correctly.</p>

    <p>Philip<br>

    </p>

    <blockquote type="cite"

cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com"><br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020 at 14:21,

          Kees Cook via cfe-dev &lt;<a
            href="mailto:cfe-dev@lists.llvm.org" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>

          <br>

          tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero

          an expressly<br>

          supported option. To do this, I think we need to either

          entirely remove<br>

"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>

          or rename it to something more directly reflecting the issue,

          like<br>

"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>

          <br>

          This is currently open as <a

            href="https://bugs.llvm.org/show_bug.cgi?id=45497"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>

          <br>

          Here is the situation: -ftrivial-auto-var-init=pattern is

          great for<br>

          debugging, but -ftrivial-auto-var-init=zero is needed for

          production<br>

          systems for mainly two reasons, each of which I will try to

          express context<br>

          for:<br>

          <br>

          1) performance and size<br>

          <br>

          As measured by various Google folks across a few projects and

          in<br>

          various places, there's a fairly significant performance

          impact of<br>

          using pattern-init over zero-init. I can let other folks chime

          in<br>

          with their exact numbers, but I can at least share some

          measurements<br>

          Alexander Potapenko made with the Linux kernel (see

          "Performance costs"):<br>

          <a

href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>

          tl;dr: zero-init tended to be half the cost of pattern-init,

          though it<br>

          varied based on workload, and binary size impact fell over 95%

          going<br>

          from pattern-init to zero-init.<br>

          <br>

          2) security<br>

          <br>

          Another driving factor (see below from various

          vendors/projects), is the<br>

          security stance. Putting non-zero values into most variable

          types ends<br>

          up making them arguably more dangerous than if they were

          zero-filled.<br>

          Most notably, sizes and indexes are less likely to be used out

          of bounds<br>

          if they are zero-initialized. The same holds for bool values

          that tend<br>

          to indicate success instead of failing safe with a false

          value. While<br>

          pointers in the non-canonical range are nice, zero tends to be

          just<br>

          as good. There are certainly exceptions here, but the bulk of

          the<br>

          historical record on how "uninitialized" variables have been

          used in<br>

          real-world exploitation involves their being non-zero, and
          analysis of<br>
          those bugs supports that conclusion.<br>

          <br>

          <br>

          Various positions from vendors and projects:<br>

          <br>

          Google (Android, Chrome OS)<br>

          <br>

          Both Android and Chrome OS initially started using

          pattern-init, but due<br>

          to each of: the performance characteristics, the binary size

          changes, and<br>

          the less robust security stance, both projects have recently

          committed<br>

          to switching to zero-init.<br>

          <br>

          <br>

          Microsoft (Windows)<br>

          <br>

          I'm repeating what Joe Bialek has told me, so he can clarify

          if I'm not<br>

          representing this correctly... While not using Clang/LLVM,

          Microsoft is<br>

          part of the larger C/C++ ecosystem and has implemented both

          zero-init<br>

          (for production builds) and pattern-init (for debug builds) in

          their<br>

          compiler too. They also chose zero-init for production

          expressly due<br>

          to the security benefits.<br>

          <br>

          Some details of their work:<br>

          <a

href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>

          <br>

          <br>

          Upstream Linux kernel<br>

          <br>

          Linus Torvalds has directly stated that he wants zero-init:<br>

          "So I'd like the zeroing of local variables to be a native

          compiler<br>

          option..."<br>

          "This, btw, is why I also think that the "initialize with

          poison" is<br>

          pointless and wrong."<br>

          <a

href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>

          Unsurprisingly, I strongly agree. ;)<br>

          <br>

          <br>

          GrapheneOS is using zero-init (rather than patching Clang as

          it used to, to get<br>

          the same result):<br>

          <a

            href="https://twitter.com/DanielMicay/status/1248384468181643272"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>

          <br>

          <br>

          GCC<br>

          There's been mostly silence on the entire topic of automatic

          variable<br>

          initialization, though there have been patches proposed in the

          past for<br>

          zero-init:<br>

          <a

            href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>

          <br>

          <br>

          Apple<br>

          <br>

          I can't speak meaningfully here, but I've heard rumors that

          they are<br>

          depending on zero-init as well. Perhaps someone there can

          clarify how<br>

          they are using these features?<br>

          <br>

          <br>

          <br>

          So, while I understand the earlier objections to zero-init

          from a<br>

          "language fork" concern, I think this isn't a position that

          can really<br>

          stand up to the reality of how many projects are using the

          feature (even<br>

          via non-Clang compilers). Given that so much code is going to

          be built<br>

          using zero-init, what's the best way for Clang to adapt here?

          I would<br>

          prefer to just drop the -enable... option entirely, but I

          think just<br>

          renaming it would be fine too.<br>

          <br>

          Thoughts/flames? ;)<br>

          <br>

          -- <br>

          Kees Cook<br>

          _______________________________________________<br>

          cfe-dev mailing list<br>

          <a href="mailto:cfe-dev@lists.llvm.org" target="_blank"

            moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>

          <a

            href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

        </blockquote>

      </div>

      <br>


    </blockquote>

  </body>

</html>