<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 22, 2020 at 9:54 AM Philip Reames via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p><br>
    </p>
    <div>On 4/21/20 4:59 PM, Richard Smith
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div dir="ltr">On Tue, 21 Apr 2020 at 16:49, Philip Reames via
          cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>
        </div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <p>On 4/21/20 3:11 PM, Richard Smith via cfe-dev wrote:<br>
              </p>
              <blockquote type="cite">
                <div dir="ltr">What you're proposing is, without
                  question, a language extension. Our policy on language
                  extensions is documented here: <a href="http://clang.llvm.org/get_involved.html" target="_blank">http://clang.llvm.org/get_involved.html</a>
                  <div><br>
                  </div>
                  <div>Right now, this fails at point 4. We do not want
                    to create or encourage the creation of language
                    dialects and non-portable code, so the place to have
                    this discussion is in the C and C++ committees. Both
                    committees have processes for specifying optional
                    features these days, and they might be amenable to
                    using those processes to standardize the behavior
                    you're asking for. (I mean, maybe not, but our
                    policy requires that you at least try.)</div>
                  <div><br>
                  </div>
                  <div>However, there is a variant on what you're
                    proposing that might fare better: instead of
                    guaranteeing zero-initialization, we could guarantee
                    that any observation of an uninitialized variable
                    *either* produces zero or results in a trap.
                    That is: it's still undefined to read from
                    uninitialized variables -- we still do not guarantee
                    what will happen if you do, and will warn on
                    uninitialized uses and so on -- but we would bound
                    the damage that can result from such accesses. You
                    would get the security hardening benefits with the
                    modest binary size impact. That approach would not
                    introduce the risk of creating a language dialect
                    (at least, not to the same extent), so our policy on
                    avoiding language extensions would not apply.</div>
                </div>
              </blockquote>
              <p>Richard, just to check here: it sounds to me like
                you're raising more a point of specification than of
                implementation, right?  That is, you're not stating that
                the actual implementation must sometimes trap (when
                producing a zero wouldn't), but that the specification
                of the flags and docs must leave that possibility
                open?</p>
            </div>
          </blockquote>
          <div>Well, I think it's not sufficient to merely say that we
            might do something like trap, if our intent is that we never
            will. We would need to reasonably agree that (for example)
            if someone came forward with a patch that actually
            implemented said trapping behavior and didn't introduce any
            significant code size or performance impact, that we would
            consider such a change to be a quality of implementation
            improvement. But I don't think we need anyone to have
            actually committed themselves to producing such a patch, or
            any timeline or expectation of when (or indeed whether) it
            would be done. Sorry if this is splitting a hair, but I
            think it's an important hair to split. <br>
          </div>
        </div>
      </div>
    </blockquote>
    Hair successfully split.  I agree it is a key distinction.<br></div></blockquote><div><br>Personally, I find that a bit too fine (though I wouldn't stand in the way of the decision) and would prefer at least a rough, most-basic trapping behavior, so that really obvious intentional use of zero init would fail, making it hard for anyone to develop a coding convention or style around it. It wouldn't need to be fancy at all, that being the point: make it as easy and unintrusive to implement as possible while still catching the most blatant uses of zero init. Then, if someone tried to write code against a zero-init language fork, their most obvious and common code would fail, and the contortions needed to avoid that would make it hard to argue that it was an intentional, consistent way to write code: "well, we explicitly zero init /these/ simple cases, but rely on compiler-derived zero init in the more complicated cases where we can (currently) get away with it..." </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <p>If I'm completely misinterpreting, please just say so.  I
                don't want to start a tangent discussion here; I just
                spotted what sounds like it could be a "quick fix" that
                lets the OP achieve their objective, and wanted to call it
                out if in fact I'd read correctly.</p>
              <p>Philip<br>
              </p>
              <blockquote type="cite"><br>
                <div class="gmail_quote">
                  <div dir="ltr" class="gmail_attr">On Tue, 21 Apr 2020
                    at 14:21, Kees Cook via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>
                    wrote:<br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
                    <br>
                    tl;dr: I'd like to revisit making
                    -ftrivial-auto-var-init=zero an expressly<br>
                    supported option. To do this, I think we need to
                    either entirely remove<br>
"-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang"<br>
                    or rename it to something more directly reflecting
                    the issue, like<br>
"-enable-trivial-auto-var-init-zero-knowing-it-forks-the-language".<br>
                    <br>
                    This is currently open as <a href="https://bugs.llvm.org/show_bug.cgi?id=45497" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=45497</a><br>
                    <br>
                    Here is the situation:
                    -ftrivial-auto-var-init=pattern is great for<br>
                    debugging, but -ftrivial-auto-var-init=zero is
                    needed for production<br>
                    systems for mainly two reasons, each of which I will
                    try to express context<br>
                    for:<br>
                    <br>
                    1) performance and size<br>
                    <br>
                    As measured by various Google folks across a few
                    projects and in<br>
                    various places, there's a fairly significant
                    performance impact of<br>
                    using pattern-init over zero-init. I can let other
                    folks chime in<br>
                    with their exact numbers, but I can at least share
                    some measurements<br>
                    Alexander Potapenko made with the Linux kernel (see
                    "Performance costs"):<br>
                    <a href="https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf" rel="noreferrer" target="_blank">https://clangbuiltlinux.github.io/CBL-meetup-2020-slides/glider/Fighting_uninitialized_memory_%40_CBL_Meetup_2020.pdf</a><br>
                    tl;dr: zero-init tended to be half the cost of
                    pattern-init, though it<br>
                    varied based on workload, and binary size impact
                    fell over 95% going<br>
                    from pattern-init to zero-init.<br>
                    <br>
                    2) security<br>
                    <br>
                    Another driving factor (see below from various
                    vendors/projects), is the<br>
                    security stance. Putting non-zero values into most
                    variable types ends<br>
                    up making them arguably more dangerous than if they
                    were zero-filled.<br>
                    Most notably, sizes and indexes are less likely to
                    be used out of bounds<br>
                    if they are zero-initialized. The same holds for
                    bool values that tend<br>
                    to indicate success instead of failing safe with a
                    false value. While<br>
                    pointers in the non-canonical range are nice, zero
                    tends to be just<br>
                    as good. There are certainly exceptions here, but
                    the bulk of the<br>
                    historical record on how "uninitialized" variables
                    have been used in<br>
                    real-world exploitation involves their being
                    non-zero, and analysis of<br>
                    those bugs supports that conclusion.<br>
                    <br>
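                    A rough sketch of the sizes/indexes and bool points above, in C. It only<br>
                    models a compiler-inserted fill by writing the bytes explicitly; it never<br>
                    reads genuinely uninitialized memory. The 0xAA fill byte, the 16-entry table,<br>
                    and all helper names are assumptions made for illustration, not taken from<br>
                    Clang or from any of the projects mentioned here:<br>
<pre>
```c
/* Sketch only: models the fill a compiler might insert into a stack slot.
   It does not read uninitialized memory, which would be undefined behavior. */

#define TABLE_LEN 16u        /* hypothetical table size */
#define PATTERN_BYTE 0xAAu   /* assumed pattern-init fill byte */

/* Hypothetical helper: an unsigned index whose every byte equals `fill`. */
static unsigned int modeled_index(unsigned char fill) {
    unsigned int idx = 0;
    unsigned char *p = (unsigned char *)&idx;
    for (unsigned int i = 0; i < sizeof idx; i++) {
        p[i] = fill;
    }
    return idx;
}

/* Returns 1 if an index left at its filled value stays inside the table. */
static int index_in_bounds(unsigned char fill) {
    return modeled_index(fill) < TABLE_LEN;
}

/* Returns 1 if a one-byte success flag left at its filled value reads as true. */
static int flag_reads_true(unsigned char fill) {
    unsigned char flag = fill;
    return flag != 0;
}
```
</pre>
                    With a zero fill, the modeled index stays in bounds and the flag reads as<br>
                    false; with the assumed pattern byte, the index lands far outside the table<br>
                    and the flag reads as true.<br>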
                    <br>
                    Various positions from vendors and projects:<br>
                    <br>
                    Google (Android, Chrome OS)<br>
                    <br>
                    Both Android and Chrome OS initially started using
                    pattern-init, but due<br>
                    to each of: the performance characteristics, the
                    binary size changes, and<br>
                    the less robust security stance, both projects have
                    recently committed<br>
                    to switching to zero-init.<br>
                    <br>
                    <br>
                    Microsoft (Windows)<br>
                    <br>
                    I'm repeating what Joe Bialek has told me, so he can
                    clarify if I'm not<br>
                    representing this correctly... While not using
                    Clang/LLVM, Microsoft is<br>
                    part of the larger C/C++ ecosystem and has
                    implemented both zero-init<br>
                    (for production builds) and pattern-init (for debug
                    builds) in their<br>
                    compiler too. They also chose zero-init for
                    production expressly due<br>
                    to the security benefits.<br>
                    <br>
                    Some details of their work:<br>
                    <a href="https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf" rel="noreferrer" target="_blank">https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2019_09_CppCon/CppCon2019%20-%20Killing%20Uninitialized%20Memory.pdf</a><br>
                    <br>
                    <br>
                    Upstream Linux kernel<br>
                    <br>
                    Linus Torvalds has directly stated that he wants
                    zero-init:<br>
                    "So I'd like the zeroing of local variables to be a
                    native compiler<br>
                    option..."<br>
                    "This, btw, is why I also think that the "initialize
                    with poison" is<br>
                    pointless and wrong."<br>
                    <a href="https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/CAHk-=wgTM+cN7zyUZacGQDv3DuuoA4LORNPWgb1Y_Z1p4iedNQ@mail.gmail.com/</a><br>
                    Unsurprisingly, I strongly agree. ;)<br>
                    <br>
                    <br>
                    GrapheneOS is using zero-init (rather than patching
                    Clang as it used to, to get<br>
                    the same result):<br>
                    <a href="https://twitter.com/DanielMicay/status/1248384468181643272" rel="noreferrer" target="_blank">https://twitter.com/DanielMicay/status/1248384468181643272</a><br>
                    <br>
                    <br>
                    GCC<br>
                    There's been mostly silence on the entire topic of
                    automatic variable<br>
                    initialization, though there have been patches
                    proposed in the past for<br>
                    zero-init:<br>
                    <a href="https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html" rel="noreferrer" target="_blank">https://gcc.gnu.org/legacy-ml/gcc-patches/2014-06/msg00615.html</a><br>
                    <br>
                    <br>
                    Apple<br>
                    <br>
                    I can't speak meaningfully here, but I've heard
                    rumors that they are<br>
                    depending on zero-init as well. Perhaps someone
                    there can clarify how<br>
                    they are using these features?<br>
                    <br>
                    <br>
                    <br>
                    So, while I understand the earlier objections to
                    zero-init from a<br>
                    "language fork" concern, I think this isn't a
                    position that can really<br>
                    stand up to the reality of how many projects are
                    using the feature (even<br>
                    via non-Clang compilers). Given that so much code is
                    going to be built<br>
                    using zero-init, what's the best way for Clang to
                    adapt here? I would<br>
                    prefer to just drop the -enable... option entirely,
                    but I think just<br>
                    renaming it would be fine too.<br>
                    <br>
                    Thoughts/flames? ;)<br>
                    <br>
                    -- <br>
                    Kees Cook<br>
                    _______________________________________________<br>
                    cfe-dev mailing list<br>
                    <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
                    <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
                  </blockquote>
                </div>
              </blockquote>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
  </div>

</blockquote></div></div>