<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 17, 2015 at 5:30 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">
    On 07/17/2015 04:56 PM, Richard Smith wrote:<br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">On Fri, Jul 17, 2015 at 3:23 PM, John
            McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div style="word-wrap:break-word">
                <div>
                  <div>
                    <div>
                      <blockquote type="cite">
                        <div>On Jul 17, 2015, at 2:49 PM, Richard Smith
                          <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>>
                          wrote:</div>
                        <div>
                          <div dir="ltr">
                            <div class="gmail_extra">
                              <div class="gmail_quote">On Fri, Jul 17,
                                2015 at 2:05 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>
                                wrote:<br>
                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                  <div bgcolor="#FFFFFF" text="#000000">
                                    <div>
                                      <div> <br>
                                        <br>
                                        <div>On 07/16/2015 02:38 PM,
                                          Richard Smith wrote:<br>
                                        </div>
                                        <blockquote type="cite">
                                          <div dir="ltr">
                                            <div class="gmail_extra">
                                              <div class="gmail_quote">On
                                                Thu, Jul 16, 2015 at
                                                2:03 PM, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>
                                                wrote:<br>
                                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                  <div style="word-wrap:break-word">
                                                    <div>
                                                      <div>
                                                        <div>
                                                          <blockquote type="cite">
                                                          <div>On Jul
                                                          16, 2015, at
                                                          11:46 AM,
                                                          Richard Smith
                                                          <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>>

                                                          wrote:</div>
                                                          <div>
                                                          <div dir="ltr">
                                                          <div class="gmail_extra">
                                                          <div class="gmail_quote">On
                                                          Thu, Jul 16,
                                                          2015 at 11:29
                                                          AM, John
                                                          McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>>
                                                          On Jul 15,
                                                          2015, at 10:11
                                                          PM, Hal Finkel
                                                          <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
                                                          wrote:<br>
                                                          ><br>
                                                          > Hi
                                                          everyone,<br>
                                                          ><br>
                                                          > C++11
                                                          added features
                                                          that allow for
                                                          certain parts
                                                          of the class
                                                          hierarchy to
                                                          be closed,
                                                          specifically
                                                          the 'final'
                                                          keyword and
                                                          the semantics
                                                          of anonymous
                                                          namespaces,
                                                          and I think we
                                                          take advantage
                                                          of these to
                                                          enhance our
                                                          ability to
                                                          perform
                                                          devirtualization.
                                                          For example,
                                                          given this
                                                          situation:<br>
                                                          ><br>
                                                          > struct
                                                          Base {<br>
                                                          >  virtual
                                                          void foo() =
                                                          0;<br>
                                                          > };<br>
                                                          ><br>
                                                          > void
                                                          external();<br>
                                                          > struct
                                                          Final final :
                                                          Base {<br>
                                                          >  void
                                                          foo() {<br>
                                                          >   
                                                          external();<br>
                                                          >  }<br>
                                                          > };<br>
                                                          ><br>
                                                          > void
                                                          dispatch(Base
                                                          *B) {<br>
                                                          > 
                                                          B->foo();<br>
                                                          > }<br>
                                                          ><br>
                                                          > void
                                                          opportunity(Final
                                                          *F) {<br>
                                                          > 
                                                          dispatch(F);<br>
                                                          > }<br>
                                                          ><br>
                                                          > When we
                                                          optimize this
                                                          code, we do
                                                          the expected
                                                          thing and
                                                          inline
                                                          'dispatch'
                                                          into
                                                          'opportunity'
                                                          but we don't
                                                          devirtualize
                                                          the call to
                                                          foo(). The
                                                          fact that we
                                                          know what the
                                                          vtable of F is
                                                          at that
                                                          callsite is
                                                          not exploited.
                                                          To a lesser
                                                          extent, we can
                                                          do similar
                                                          things for
                                                          final virtual
                                                          methods, and
                                                          derived
                                                          classes in
                                                          anonymous
                                                          namespaces
                                                          (because Clang
                                                          could
                                                          determine
                                                          whether or not
                                                          a class (or
                                                          method) there
                                                          is effectively
                                                          final).<br>
                                                          ><br>
                                                          > One
                                                          possibility
                                                          might be to
                                                          @llvm.assume
                                                          to say
                                                          something
                                                          about what the
                                                          vtable ptr of
                                                          F might
                                                          be/contain
                                                          should it be
                                                          needed later
                                                          when we emit
                                                          the initial IR
                                                          for
                                                          'opportunity'
                                                          (and then
                                                          teach the
                                                          optimizer to
                                                          use that
                                                          information),
                                                          but I'm not at
                                                          all sure
                                                          that's the
                                                          best solution.
                                                          Thoughts?<br>
                                                          <br>
                                                          </span>The
                                                          problem with
                                                          any sort of
                                                          @llvm.assume-encoded
                                                          information
                                                          about memory
                                                          contents is
                                                          that C++ does
                                                          actually allow
                                                          you to replace
                                                          objects in
                                                          memory, up to
                                                          and including
                                                          stuff like:<br>
                                                          <br>
                                                          {<br>
                                                            MyClass c;<br>
                                                          <br>
                                                            // Reuse the
                                                          storage
                                                          temporarily. 
                                                          UB to access
                                                          the object
                                                          through ‘c’
                                                          now.<br>
                                                           
                                                          c.~MyClass();<br>
                                                            auto c2 =
                                                          new (&c)
                                                          MyOtherClass();<br>
                                                          <br>
                                                            // The
                                                          storage has to
                                                          contain a
                                                          ‘MyClass’ when
                                                          it goes out of
                                                          scope.<br>
                                                           
                                                          c2->~MyOtherClass();<br>
                                                            new (&c)
                                                          MyClass();<br>
                                                          }<br>
                                                          <br>
                                                          The standard
                                                          frontend
                                                          devirtualization
                                                          optimizations
                                                          are permitted
                                                          under a couple
                                                          of different
                                                          language
                                                          rules,
                                                          specifically
                                                          that:<br>
                                                          1. If you
                                                          access an
                                                          object through
                                                          an l-value of
                                                          a type, it has
                                                          to dynamically
                                                          be an object
                                                          of that type
                                                          (potentially a
                                                          subobject).<br>
                                                          2. Object
                                                          replacement as
                                                          above only
                                                          “forwards”
                                                          existing
                                                          formal
                                                          references
                                                          under specific
                                                          conditions,
                                                          e.g. the
                                                          dynamic type
                                                          has to be the
                                                          same, ‘const’
                                                          members have
                                                          to have the
                                                          same value,
                                                          etc.  Using an
                                                          unforwarded
                                                          reference
                                                          (like the name
                                                          of the local
                                                          variable ‘c’
                                                          above) doesn’t
                                                          formally refer
                                                          to a valid
                                                          object and
                                                          thus has
                                                          undefined
                                                          behavior.<br>
                                                          <br>
                                                          You can apply
                                                          those rules
                                                          much more
                                                          broadly than
                                                          the frontend
                                                          does, of
                                                          course; but
                                                          those are the
                                                          language tools
                                                          you get.</blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>Right.
                                                          Our current
                                                          plan for
                                                          modelling this
                                                          is:</div>
                                                          <div><br>
                                                          </div>
                                                          <div>1) Change
                                                          the meaning of
                                                          the existing
                                                          !invariant.load
                                                          metadata (or
                                                          add another
                                                          parallel
                                                          metadata kind)
                                                          so that it
                                                          allows
                                                          load-load
                                                          forwarding
                                                          (even if the
                                                          memory is not
                                                          known to be
                                                          unmodified
                                                          between the
                                                          loads) if:</div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                      invariant.load
                                                      currently allows
                                                      the load to be
                                                      reordered pretty
                                                      aggressively, so I
                                                      think you need a
                                                      new metadata.</div>
                                                  </div>
                                                </blockquote>
                                                <div><br>
                                                </div>
                                                <div>Our thoughts were:</div>
                                                <div>1) The existing
                                                  !invariant.load is
                                                  redundant because it's
                                                  exactly equivalent to
                                                  a call to
                                                  @llvm.invariant.start
                                                  and a load.</div>
                                                <div>2) The new
                                                  semantics are a more
                                                  strict form of the old
                                                  semantics, so no
                                                  special action is
                                                  required to upgrade
                                                  old IR.</div>
                                                <div>... so changing the
                                                  meaning of the
                                                  existing metadata
                                                  seemed preferable to
                                                  adding a new,
                                                  similar-but-not-quite-identical,
                                                  form of the metadata.
                                                  But either way seems
                                                  fine.</div>
                                              </div>
                                            </div>
                                          </div>
                                        </blockquote>
                                      </div>
                                    </div>
                                    I'm going to argue pretty strongly
                                    in favour of the new form of
                                    metadata.  We've spent a lot of time
                                    getting !invariant.load working well
                                    for use cases like the "length"
                                    field in a Java array and I'd really
                                    hate to give that up.<br>
                                    <br>
                                    (One way of framing this is that the
                                    current !invariant.load gives a
                                    guarantee that there can't be a
                                    @llvm.invariant.end call anywhere in
                                    the program and that any
                                    @llvm.invariant.start occurs outside
                                    the visible scope of the compilation
                                    unit (Module, LTO, what have you)
                                    and must have executed before any
                                    code contained in said module which
                                    can describe the memory location can
                                    execute.  FYI, that last bit of
                                    strange wording is to allow
                                    initialization inside a malloc like
                                    function which returns a noalias
                                    pointer.)<br>
                                  </div>
                                </blockquote>
                                <div><br>
                                </div>
                                <div>I had overlooked that
                                  !invariant.load also applies for loads
                                  /before/ the invariant load. I agree
                                  that this is different both from what
                                  we're proposing and from what you can
                                  achieve with @llvm.invariant.start. I
                                  would expect that you can use our
                                  metadata for the length in a Java
                                  array -- it seems like it'd be
                                  straightforward for you to arrange
                                  that all loads of the array field have
                                  the metadata (and that you use the
                                  same operand on all of them) -- but
                                  there's no real motivation behind
                                  reusing the existing metadata besides
                                  simplicity and cleanliness.</div>
                                <div><br>
                                </div>
                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                  <div bgcolor="#FFFFFF" text="#000000">
                                    I'm definitely open to working
                                    together on a revised version of a
                                    more general invariant mechanism. 
                                    In particular, we don't have a good
                                    way of modelling Java's "final"
                                    fields* in the IR today since the
                                    initialization logic may be visible
                                    to the compiler.  Coming up with
                                    something which supports both use
                                    cases would be really useful.<br>
                                  </div>
                                </blockquote>
                                <div><br>
                                </div>
                                <div>This seems like something that our
                                  proposed mechanism may be able to
                                  support; we intend to use it for const
                                  and reference data members in C++,
                                  though the semantics of those are not
                                  quite the same.</div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                    </div>
                  </div>
                  ObjC (and Swift, and probably a number of other
                  languages) has a optimization opportunity where
                  there’s a global variable that’s known to be constant
                  after its initialization.  (For the initiated, I’m
                  talking here primarily about ivar offset variables.)
                   However, that initialization is run lazily, and it’s
                  only at specific points within the program that we can
                  guarantee that it’s already been performed.  (Namely,
                  before ivar accesses or after message sends to the
                  class (but not to instances, because of nil).)  Those
                  points usually guarantee the initialization of more
                  than one variable, and contrariwise, there are often
                  several such points that would each individually
                  suffice to establish the guarantee for a particular
                  load, allowing it to be hoisted/reordered/combined at
                  will.</div>
                <div><br>
                </div>
                <div>So e.g.</div>
                <div><br>
                </div>
                <div>  if (cond) {</div>
                <div>    // Here there’s an operation that proves to us
                  that A, B, and C are initialized.</div>
                <div>  } else {</div>
                <div>    // Here there’s an operation that proves it for
                  just A and B.</div>
                <div>  }</div>
                <div><br>
                </div>
                <div>  for (;;) {</div>
                <div>    // Here we load A.  This should be hoist able
                  out of this loop, independently of whatever else
                  happens in this loop.</div>
                <div>  }</div>
                <div><br>
                </div>
                <div>This is actually the situation where ObjC currently
                  uses !invariant.load, except that we can only safely
                  use it in specific functions (ObjC method
                  implementations) that guarantee initialization before
                  entry and which can never be inlined.</div>
                <div><br>
                </div>
                <div>Now, I think something like invariant.start would
                  help with this, except that I’m concerned that we’d
                  have to eagerly emit what might be dozens of
                  invariant.starts at every point that established the
                  guarantee, which would be pretty wasteful even for
                  optimized builds.  If we’re designing new metadata
                  anyway, or generalizing existing metadata, can we try
                  to make this more scalable, so that e.g. I can use a
                  single intrinsic with a list of the invariants it
                  establishes, ideally in a way that’s sharable between
                  calls?</div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>It seems we have three different use cases:</div>
            <div><br>
            </div>
            <div>1) This invariant applies to this load and all future
              loads of this pointer (ObjC / Swift constants, Java final
              members)</div>
            <div>2) This invariant applies to this load and all past and
              future loads of this pointer (Java array length)</div>
          </div>
        </div>
      </div>
    </blockquote></div></div>
    Slight tweak here: it's not "all past" *loads*.  It's all (past,
    present, future) instruction *positions* at which this location
    was/is dereferenceable.  There doesn't actually have to be a load
    there (yet).  <br>
    <br>
    It's specifically that property which allows aggressive hoisting
    (e.g. in LICM).  <br><span class="">
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>3) This invariant applies to this load and all (past
              and) future loads of this pointer that also have metadata
              with the same operand (C++ vptr, const members, reference
              members)</div>
          </div>
        </div>
      </div>
    </blockquote></span>
    (I'm assuming that by "this pointer" you actually mean "this
    abstract memory location".)  <br></div></blockquote><div><br></div><div>No; that would remove the ability to have something like @llvm.invariant.barrier. This is a property of the pointer value, not a property of the storage location. In this regard, it seems fundamentally different to the existing !invariant.load.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
    For my use cases, I can replace (1) with (3) as long as I give all
    such loads the same operand, I still need (2) as a distinct notion
    though.  Definition (1) and (3) don't include the ability to do
    aggressive hoisting.  If we could define it in a way it did, we
    might be able to combine all three.  <br><span class="">
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>We could use (1) or (2) for C++, but that would require
              us to insert a lot more invariant barriers (for all
              dynamic type changes, not just for a change to a type with
              a constant member), and would be much less forgiving of
              user errors. How general a system are you imagining?</div>
          </div>
        </div>
      </div>
    </blockquote></span>
    (2) today works specifically because there is no such thing as a
    start/end for the invariantness.  If there were, we'd have to solve
    a potentially hard IPO problem for basic hoisting in LICM.  <br>
    <br>
    I feel we're going to end up needing something which supports the
    following combinations:<br>
    - No start, no end (2 above)<br>
    - Possible start, no end (1 above)<br>
    - Possible start, possible end, possible start, possible end, ....
    (what @llvm.invariant.start/end try to be and fail at)<br>
    <br>
    (Note that for all of these, dereferenceability is orthogonal to the
    invariantness of the memory location.)<br>
    <br>
    Whether those are all the same mechanism or not, who knows.<br>
    <br>
    To throw yet another wrinkle in, the backend has a separate notion
    of invarantness.  It assumes (2), plus derefenceability within the
    *entire* function.  It's for this reason that many !invariant.loads
    aren't marked invariant in the backend.  <br>
    <br>
    p.s. Before this goes too much further, we should move this to it's
    own thread, and CC llvmdev rather than just cfe-dev.  Many
    interested folks (i.e. other frontends) might not be subscribed to
    cfe-dev.</div></blockquote><div><br></div><div>A proper proposal is on the way (which will get its own thread on llvmdev); we were only responding here because Hal happened to bring up the topic before we were ready with our proposal :-)</div></div></div></div>