<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 4/21/20 3:11 PM, Richard Smith via
      cfe-dev wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com">
      <div dir="ltr">What you're proposing is, without question, a
        language extension. Our policy on language extensions is
        documented here: <a
          href="http://clang.llvm.org/get_involved.html"
          moz-do-not-send="true">http://clang.llvm.org/get_involved.html</a>
        <div><br>
        </div>
        <div>Right now, this fails at point 4. We do not want to create
          or encourage the creation of language dialects and
          non-portable code, so the place to have this discussion is in
          the C and C++ committees. Both committees have processes for
          specifying optional features these days, and they might be
          amenable to using those processes to standardize the behavior
          you're asking for. (I mean, maybe not, but our policy requires
          that you at least try.)</div>
        <div><br>
        </div>
        <div>However, there is a variant on what you're proposing that
          might fare better: instead of guaranteeing
          zero-initialization, we could guarantee that any observation
          of an uninitialized variable *either* produces zero or
          results in a trap. That is: it's still undefined to read from
          uninitialized variables -- we still do not guarantee what will
          happen if you do, and will warn on uninitialized uses and so
          on -- but we would bound the damage that can result from such
          accesses. You would get the security hardening benefits with
          the modest binary size impact. That approach would not
          introduce the risk of creating a language dialect (at least,
          not to the same extent), so our policy on avoiding language
          extensions would not apply.</div>
      </div>
    </blockquote>
    <p>Richard, just to check: it sounds to me like you're raising
      more a point of specification than of implementation, right?  That
      is, you're not saying that the actual implementation must
      sometimes trap (in cases where producing a zero wouldn't), but
      that the specification of the flags and the docs must leave that
      possibility open?</p>
    <p>If I'm completely misinterpreting, please just say so.  I don't
      want to start a tangent here; I just spotted what sounds
      like it could be a "quick fix" that lets the OP achieve their
      objective, and wanted to call it out if I had in fact read it correctly.</p>
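    <p>To make the distinction concrete, here is a minimal C sketch. It
      is only an illustration under my own assumptions, not code from
      Richard or Kees: <code>payload_len</code> and its parameters are
      hypothetical names. The function has a path on which a local is
      read without ever being assigned, which is the case the
      "zero or trap" wording would cover.</p>

```c
/* Hypothetical example: payload_len and its parameters are made-up names
 * used only to illustrate the wording discussed in this thread. */
unsigned payload_len(int have_header, unsigned header_len)
{
    unsigned len;            /* deliberately left uninitialized */

    if (have_header)
        len = header_len;    /* the only assignment to len */

    /* When have_header == 0 this reads an uninitialized variable.
     * That read stays undefined under the variant Richard describes;
     * the documentation would only promise that the observable result
     * is either the value zero or a trap. An implementation that
     * always produces zero (as -ftrivial-auto-var-init=zero does
     * today) would satisfy that contract without ever trapping. */
    return len;
}
```

    <p>Only calls with a nonzero <code>have_header</code> are well
      defined, so portable code still cannot rely on getting zero back
      from the other path, and the compiler would still be free to warn
      about it. As I read it, that is what would keep this from
      becoming a language dialect.</p>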
    <p>Philip<br>
    </p>
    <blockquote type="cite"
cite="mid:CAOfiQqng0WgkKdoa_oFsSyW3a5cqa_0gC6YAo9P1+3EZNsaOQQ@mail.gmail.com"><br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020 at 14:21,
          Kees Cook via cfe-dev &lt;<a
            href="mailto:cfe-dev@lists.llvm.org" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
          <br>
          tl;dr: I'd like to revisit making -ftrivial-auto-var-init=zero
          an expressly<br>
          supported option. To do this, I think we need to either
          entirely remove<br>
"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>
          or rename it to something more directly reflecting the issue,
          like<br>
"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>
          <br>
          This is currently open as <a
            href="https://bugs.llvm.org/show_bug.cgi?id=45497"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>
          <br>
          Here is the situation: -ftrivial-auto-var-init=pattern is
          great for<br>
          debugging, but -ftrivial-auto-var-init=zero is needed for
          production<br>
          systems for mainly two reasons, each of which I will try to
          express context<br>
          for:<br>
          <br>
          1) performance and size<br>
          <br>
          As measured by various Google folks across a few projects and
          in<br>
          various places, there's a fairly significant performance
          impact of<br>
          using pattern-init over zero-init. I can let other folks chime
          in<br>
          with their exact numbers, but I can at least share some
          measurements<br>
          Alexander Potapenko made with the Linux kernel (see
          "Performance costs"):<br>
          <a
href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>
          tl;dr: zero-init tended to be half the cost of pattern-init,
          though it<br>
          varied based on workload, and binary size impact fell over 95%
          going<br>
          from pattern-init to zero-init.<br>
          <br>
          2) security<br>
          <br>
          Another driving factor (see below from various
          vendors/projects) is the<br>
          security stance. Putting non-zero values into most variable
          types ends<br>
          up making them arguably more dangerous than if they were
          zero-filled.<br>
          Most notably, sizes and indexes are less likely to be used out
          of bounds<br>
          if they are zero-initialized. The same holds for bool values
          that tend<br>
          to indicate success instead of failing safe with a false
          value. While<br>
          pointers in the non-canonical range are nice, zero tends to be
          just<br>
          as good. There are certainly exceptions here, but the bulk of
          the<br>
          historical record on how "uninitialized" variables have been
          used in<br>
          real-world exploitation involves their being non-zero, and
          analysis of<br>
          those bugs supports that conclusion.<br>
          <br>
          <br>
          Various positions from vendors and projects:<br>
          <br>
          Google (Android, Chrome OS)<br>
          <br>
          Both Android and Chrome OS initially started using
          pattern-init, but due<br>
          to each of: the performance characteristics, the binary size
          changes, and<br>
          the less robust security stance, both projects have recently
          committed<br>
          to switching to zero-init.<br>
          <br>
          <br>
          Microsoft (Windows)<br>
          <br>
          I'm repeating what Joe Bialek has told me, so he can clarify
          if I'm not<br>
          representing this correctly... While not using Clang/LLVM,
          Microsoft is<br>
          part of the larger C/C++ ecosystem and has implemented both
          zero-init<br>
          (for production builds) and pattern-init (for debug builds) in
          their<br>
          compiler too. They also chose zero-init for production
          expressly due<br>
          to the security benefits.<br>
          <br>
          Some details of their work:<br>
          <a
href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>
          <br>
          <br>
          Upstream Linux kernel<br>
          <br>
          Linus Torvalds has directly stated that he wants zero-init:<br>
          "So I'd like the zeroing of local variables to be a native
          compiler<br>
          option..."<br>
          "This, btw, is why I also think that the "initialize with
          poison" is<br>
          pointless and wrong."<br>
          <a
href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>
          Unsurprisingly, I strongly agree. ;)<br>
          <br>
          <br>
          GrapheneOS is using zero-init (rather than patching Clang as
          it used to, to get<br>
          the same result):<br>
          <a
            href="https://twitter.com/DanielMicay/status/1248384468181643272"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>
          <br>
          <br>
          GCC<br>
          There's been mostly silence on the entire topic of automatic
          variable<br>
          initialization, though there have been patches proposed in the
          past for<br>
          zero-init:<br>
          <a
            href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>
          <br>
          <br>
          Apple<br>
          <br>
          I can't speak meaningfully here, but I've heard rumors that
          they are<br>
          depending on zero-init as well. Perhaps someone there can
          clarify how<br>
          they are using these features?<br>
          <br>
          <br>
          <br>
          So, while I understand the earlier objections to zero-init
          from a<br>
          "language fork" concern, I think this isn't a position that
          can really<br>
          stand up to the reality of how many projects are using the
          feature (even<br>
          via non-Clang compilers). Given that so much code is going to
          be built<br>
          using zero-init, what's the best way for Clang to adapt here?
          I would<br>
          prefer to just drop the -enable... option entirely, but I
          think just<br>
          renaming it would be fine too.<br>
          <br>
          Thoughts/flames? ;)<br>
          <br>
          -- <br>
          Kees Cook<br>
          _______________________________________________<br>
          cfe-dev mailing list<br>
          <a href="mailto:cfe-dev@lists.llvm.org" target="_blank"
            moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>
          <a
            href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
            rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
        </blockquote>
      </div>
      <br>
    </blockquote>
  </body>
</html>