<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 17, 2015 at 5:30 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">

    On 07/17/2015 04:56 PM, Richard Smith wrote:<br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Fri, Jul 17, 2015 at 3:23 PM, John

            McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div style="word-wrap:break-word">

                <div>

                  <div>

                    <div>

                      <blockquote type="cite">

                        <div>On Jul 17, 2015, at 2:49 PM, Richard Smith

                          <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>>

                          wrote:</div>

                        <div>

                          <div dir="ltr">

                            <div class="gmail_extra">

                              <div class="gmail_quote">On Fri, Jul 17,

                                2015 at 2:05 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>

                                wrote:<br>

                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

                                  <div bgcolor="#FFFFFF" text="#000000">

                                    <div>

                                      <div> <br>

                                        <br>

                                        <div>On 07/16/2015 02:38 PM,

                                          Richard Smith wrote:<br>

                                        </div>

                                        <blockquote type="cite">

                                          <div dir="ltr">

                                            <div class="gmail_extra">

                                              <div class="gmail_quote">On

                                                Thu, Jul 16, 2015 at

                                                2:03 PM, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>

                                                wrote:<br>

                                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

                                                  <div style="word-wrap:break-word">

                                                    <div>

                                                      <div>

                                                        <div>

                                                          <blockquote type="cite">

                                                          <div>On Jul

                                                          16, 2015, at

                                                          11:46 AM,

                                                          Richard Smith

                                                          <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>>

                                                          wrote:</div>

                                                          <div>

                                                          <div dir="ltr">

                                                          <div class="gmail_extra">

                                                          <div class="gmail_quote">On

                                                          Thu, Jul 16,

                                                          2015 at 11:29

                                                          AM, John

                                                          McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span>

                                                          wrote:<br>

                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>>

                                                          On Jul 15,

                                                          2015, at 10:11

                                                          PM, Hal Finkel

                                                          <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>

                                                          wrote:<br>

                                                          ><br>

                                                          > Hi

                                                          everyone,<br>

                                                          ><br>

                                                          > C++11

                                                          added features

                                                          that allow for

                                                          certain parts

                                                          of the class

                                                          hierarchy to

                                                          be closed,

                                                          specifically

                                                          the 'final'

                                                          keyword and

                                                          the semantics

                                                          of anonymous

                                                          namespaces,

                                                          and I think we can

                                                          take advantage

                                                          of these to

                                                          enhance our

                                                          ability to

                                                          perform

                                                          devirtualization.

                                                          For example,

                                                          given this

                                                          situation:<br>

                                                          ><br>

                                                          > struct

                                                          Base {<br>

                                                          >  virtual

                                                          void foo() =

                                                          0;<br>

                                                          > };<br>

                                                          ><br>

                                                          > void

                                                          external();<br>

                                                          > struct

                                                          Final final :

                                                          Base {<br>

                                                          >  void

                                                          foo() {<br>

                                                          >   

                                                          external();<br>

                                                          >  }<br>

                                                          > };<br>

                                                          ><br>

                                                          > void

                                                          dispatch(Base

                                                          *B) {<br>

                                                          > 

                                                          B->foo();<br>

                                                          > }<br>

                                                          ><br>

                                                          > void

                                                          opportunity(Final

                                                          *F) {<br>

                                                          > 

                                                          dispatch(F);<br>

                                                          > }<br>

                                                          ><br>

                                                          > When we

                                                          optimize this

                                                          code, we do

                                                          the expected

                                                          thing and

                                                          inline

                                                          'dispatch'

                                                          into

                                                          'opportunity'

                                                          but we don't

                                                          devirtualize

                                                          the call to

                                                          foo(). The

                                                          fact that we

                                                          know what the

                                                          vtable of F is

                                                          at that

                                                          callsite is

                                                          not exploited.

                                                          To a lesser

                                                          extent, we can

                                                          do similar

                                                          things for

                                                          final virtual

                                                          methods, and

                                                          derived

                                                          classes in

                                                          anonymous

                                                          namespaces

                                                          (because Clang

                                                          could

                                                          determine

                                                          whether or not

                                                          a class (or

                                                          method) there

                                                          is effectively

                                                          final).<br>
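                                                          ><br>
                                                          > As a rough sketch of the transformation being asked for, the example above can be made runnable and compared against a hand-devirtualized call. The return values, the function 'opportunity_devirtualized', and the helper 'same_result' are assumptions added for illustration; they are not part of the original code.<br>
                                                          ><br>
```cpp
// Runnable variant of the example above. The int return values are added
// purely so the two call paths can be compared.
struct Base {
  virtual int foo() = 0;
};

struct Final final : Base {
  int foo() override { return 42; }
};

// Virtual call through the base pointer, as in 'dispatch' above.
int dispatch(Base *B) { return B->foo(); }

// After inlining 'dispatch', this is still written as a virtual call.
int opportunity(Final *F) { return dispatch(F); }

// Hand-devirtualized form: the qualified call bypasses the vtable.
// Because Final is declared 'final', no further override can exist.
int opportunity_devirtualized(Final *F) { return F->Final::foo(); }

// Hypothetical helper: both paths should produce the same value.
bool same_result() {
  Final f;
  return opportunity(&f) == 42 && opportunity_devirtualized(&f) == 42;
}
```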

                                                          ><br>

                                                          > One

                                                          possibility

                                                          might be to use

                                                          @llvm.assume

                                                          to say

                                                          something

                                                          about what the

                                                          vtable ptr of

                                                          F might

                                                          be/contain

                                                          should it be

                                                          needed later

                                                          when we emit

                                                          the initial IR

                                                          for

                                                          'opportunity'

                                                          (and then

                                                          teach the

                                                          optimizer to

                                                          use that

                                                          information),

                                                          but I'm not at

                                                          all sure

                                                          that's the

                                                          best solution.

                                                          Thoughts?<br>

                                                          <br>

                                                          </span>The

                                                          problem with

                                                          any sort of

                                                          @llvm.assume-encoded

                                                          information

                                                          about memory

                                                          contents is

                                                          that C++ does

                                                          actually allow

                                                          you to replace

                                                          objects in

                                                          memory, up to

                                                          and including

                                                          stuff like:<br>

                                                          <br>

                                                          {<br>

                                                            MyClass c;<br>

                                                          <br>

                                                            // Reuse the

                                                          storage

                                                          temporarily. 

                                                          UB to access

                                                          the object

                                                          through ‘c’

                                                          now.<br>

                                                          c.~MyClass();<br>

                                                            auto c2 =

                                                          new (&c)

                                                          MyOtherClass();<br>

                                                          <br>

                                                            // The

                                                          storage has to

                                                          contain a

                                                          ‘MyClass’ when

                                                          it goes out of

                                                          scope.<br>

                                                          c2->~MyOtherClass();<br>

                                                            new (&c)

                                                          MyClass();<br>

                                                          }<br>
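                                                          <br>
                                                          A self-contained sketch of the same idea, assuming a raw buffer instead of a named local so that every access goes through the pointer returned by placement new. The types 'Base', 'First', 'Second' and the function 'ids_seen_in_one_buffer' are hypothetical names chosen for illustration.<br>
                                                          <br>
```cpp
#include <cstddef>
#include <new>

struct Base {
  virtual int id() const = 0;
  virtual ~Base() = default;
};
struct First final : Base {
  int id() const override { return 1; }
};
struct Second final : Base {
  int id() const override { return 2; }
};

// Buffer large enough for either dynamic type.
constexpr std::size_t kSize =
    sizeof(First) > sizeof(Second) ? sizeof(First) : sizeof(Second);

// The same bytes hold two objects of different dynamic type over time, so
// the vtable pointer stored at that address changes between the two calls.
int ids_seen_in_one_buffer() {
  alignas(std::max_align_t) unsigned char storage[kSize];

  Base *a = new (storage) First();
  int first = a->id();
  a->~Base();  // ends the lifetime of the First object

  Base *b = new (storage) Second();  // reuse the storage
  int second = b->id();
  b->~Base();

  return first * 10 + second;
}
```
                                                          <br>
                                                          Here the two virtual calls at the same address resolve to different functions, which is why a fact about the memory contents alone is not enough; the guarantee has to be tied to the pointer or reference being used.<br>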

                                                          <br>

                                                          The standard

                                                          frontend

                                                          devirtualization

                                                          optimizations

                                                          are permitted

                                                          under a couple

                                                          of different

                                                          language

                                                          rules,

                                                          specifically

                                                          that:<br>

                                                          1. If you

                                                          access an

                                                          object through

                                                          an l-value of

                                                          a type, it has

                                                          to dynamically

                                                          be an object

                                                          of that type

                                                          (potentially a

                                                          subobject).<br>

                                                          2. Object

                                                          replacement as

                                                          above only

                                                          “forwards”

                                                          existing

                                                          formal

                                                          references

                                                          under specific

                                                          conditions,

                                                          e.g. the

                                                          dynamic type

                                                          has to be the

                                                          same, ‘const’

                                                          members have

                                                          to have the

                                                          same value,

                                                          etc.  Using an

                                                          unforwarded

                                                          reference

                                                          (like the name

                                                          of the local

                                                          variable ‘c’

                                                          above) doesn’t

                                                          formally refer

                                                          to a valid

                                                          object and

                                                          thus has

                                                          undefined

                                                          behavior.<br>

                                                          <br>

                                                          You can apply

                                                          those rules

                                                          much more

                                                          broadly than

                                                          the frontend

                                                          does, of

                                                          course; but

                                                          those are the

                                                          language tools

                                                          you get.</blockquote>

                                                          <div><br>

                                                          </div>

                                                          <div>Right.

                                                          Our current

                                                          plan for

                                                          modelling this

                                                          is:</div>

                                                          <div><br>

                                                          </div>

                                                          <div>1) Change

                                                          the meaning of

                                                          the existing

                                                          !invariant.load

                                                          metadata (or

                                                          add another

                                                          parallel

                                                          metadata kind)

                                                          so that it

                                                          allows

                                                          load-load

                                                          forwarding

                                                          (even if the

                                                          memory is not

                                                          known to be

                                                          unmodified

                                                          between the

                                                          loads) if:</div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </blockquote>

                                                          <div><br>

                                                          </div>

                                                        </div>

                                                      </div>

                                                      invariant.load

                                                      currently allows

                                                      the load to be

                                                      reordered pretty

                                                      aggressively, so I

                                                      think you need a

                                                      new metadata.</div>

                                                  </div>

                                                </blockquote>

                                                <div><br>

                                                </div>

                                                <div>Our thoughts were:</div>

                                                <div>1) The existing

                                                  !invariant.load is

                                                  redundant because it's

                                                  exactly equivalent to

                                                  a call to

                                                  @llvm.invariant.start

                                                  and a load.</div>

                                                <div>2) The new

                                                  semantics are a more

                                                  strict form of the old

                                                  semantics, so no

                                                  special action is

                                                  required to upgrade

                                                  old IR.</div>

                                                <div>... so changing the

                                                  meaning of the

                                                  existing metadata

                                                  seemed preferable to

                                                  adding a new,

                                                  similar-but-not-quite-identical,

                                                  form of the metadata.

                                                  But either way seems

                                                  fine.</div>

                                              </div>

                                            </div>

                                          </div>

                                        </blockquote>

                                      </div>

                                    </div>

                                    I'm going to argue pretty strongly

                                    in favour of the new form of

                                    metadata.  We've spent a lot of time

                                    getting !invariant.load working well

                                    for use cases like the "length"

                                    field in a Java array and I'd really

                                    hate to give that up.<br>

                                    <br>

                                    (One way of framing this is that the

                                    current !invariant.load gives a

                                    guarantee that there can't be a

                                    @llvm.invariant.end call anywhere in

                                    the program and that any

                                    @llvm.invariant.start occurs outside

                                    the visible scope of the compilation

                                    unit (Module, LTO, what have you)

                                    and must have executed before any

                                    code contained in said module which

                                    can describe the memory location can

                                    execute.  FYI, that last bit of

                                    strange wording is to allow

                                    initialization inside a malloc-like

                                    function which returns a noalias

                                    pointer.)<br>

                                  </div>

                                </blockquote>

                                <div><br>

                                </div>

                                <div>I had overlooked that

                                  !invariant.load also applies for loads

                                  /before/ the invariant load. I agree

                                  that this is different both from what

                                  we're proposing and from what you can

                                  achieve with @llvm.invariant.start. I

                                  would expect that you can use our

                                  metadata for the length in a Java

                                  array -- it seems like it'd be

                                  straightforward for you to arrange

                                  that all loads of the array length field have

                                  the metadata (and that you use the

                                  same operand on all of them) -- but

                                  there's no real motivation behind

                                  reusing the existing metadata besides

                                  simplicity and cleanliness.</div>

                                <div><br>

                                </div>

                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

                                  <div bgcolor="#FFFFFF" text="#000000">

                                    I'm definitely open to working

                                    together on a revised version of a

                                    more general invariant mechanism. 

                                    In particular, we don't have a good

                                    way of modelling Java's "final"

                                    fields* in the IR today since the

                                    initialization logic may be visible

                                    to the compiler.  Coming up with

                                    something which supports both use

                                    cases would be really useful.<br>

                                  </div>

                                </blockquote>

                                <div><br>

                                </div>

                                <div>This seems like something that our

                                  proposed mechanism may be able to

                                  support; we intend to use it for const

                                  and reference data members in C++,

                                  though the semantics of those are not

                                  quite the same.</div>

                              </div>

                            </div>

                          </div>

                        </div>

                      </blockquote>

                      <div><br>

                      </div>

                    </div>

                  </div>

                  ObjC (and Swift, and probably a number of other

                  languages) has an optimization opportunity where

                  there’s a global variable that’s known to be constant

                  after its initialization.  (For the initiated, I’m

                  talking here primarily about ivar offset variables.)

                   However, that initialization is run lazily, and it’s

                  only at specific points within the program that we can

                  guarantee that it’s already been performed.  (Namely,

                  before ivar accesses or after message sends to the

                  class (but not to instances, because of nil).)  Those

                  points usually guarantee the initialization of more

                  than one variable, and contrariwise, there are often

                  several such points that would each individually

                  suffice to establish the guarantee for a particular

                  load, allowing it to be hoisted/reordered/combined at

                  will.</div>

                <div><br>

                </div>

                <div>So e.g.</div>

                <div><br>

                </div>

                <div>  if (cond) {</div>

                <div>    // Here there’s an operation that proves to us

                  that A, B, and C are initialized.</div>

                <div>  } else {</div>

                <div>    // Here there’s an operation that proves it for

                  just A and B.</div>

                <div>  }</div>

                <div><br>

                </div>

                <div>  for (;;) {</div>

                <div>    // Here we load A.  This should be hoistable

                  out of this loop, independently of whatever else

                  happens in this loop.</div>

                <div>  }</div>
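                <div>A minimal sketch of the hoisting described in the pseudocode above, written as plain C++ rather than ObjC. All names (`ivar_offset_a`, `ensure_initialized_a`, `sum_field`) are hypothetical stand-ins, and the hoist is done by hand here to show the transformation an optimizer would be allowed to make once initialization is known to have happened:</div>

```cpp
// Hypothetical stand-in for a lazily initialized global, such as an ivar
// offset variable, that never changes once it has been initialized.
static int ivar_offset_a = 0;
static bool a_initialized = false;

// Hypothetical stand-in for an operation that proves A is initialized.
void ensure_initialized_a() {
    if (!a_initialized) {
        ivar_offset_a = 16;  // arbitrary illustrative value
        a_initialized = true;
    }
}

// Adds the byte found at A's offset to a running total, n times.
int sum_field(const unsigned char *object, int n) {
    ensure_initialized_a();            // after this point, A is constant
    const int offset = ivar_offset_a;  // the load of A, hoisted out of the loop
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += object[offset];       // the loop body no longer reloads A
    }
    return total;
}
```

                <div>The hoist is only valid because the call that establishes the guarantee comes before the loop; without such a point, the load would have to stay where the source put it.</div>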

                <div><br>

                </div>

                <div>This is actually the situation where ObjC currently

                  uses !invariant.load, except that we can only safely

                  use it in specific functions (ObjC method

                  implementations) that guarantee initialization before

                  entry and which can never be inlined.</div>

                <div><br>

                </div>

                <div>Now, I think something like invariant.start would

                  help with this, except that I’m concerned that we’d

                  have to eagerly emit what might be dozens of

                  invariant.starts at every point that established the

                  guarantee, which would be pretty wasteful even for

                  optimized builds.  If we’re designing new metadata

                  anyway, or generalizing existing metadata, can we try

                  to make this more scalable, so that e.g. I can use a

                  single intrinsic with a list of the invariants it

                  establishes, ideally in a way that’s sharable between

                  calls?</div>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>It seems we have three different use cases:</div>

            <div><br>

            </div>

            <div>1) This invariant applies to this load and all future

              loads of this pointer (ObjC / Swift constants, Java final

              members)</div>

            <div>2) This invariant applies to this load and all past and

              future loads of this pointer (Java array length)</div>

          </div>

        </div>

      </div>

    </blockquote></div></div>

    Slight tweak here: it's not "all past" *loads*.  It's all (past,

    present, future) instruction *positions* at which this location

    was/is dereferenceable.  There doesn't actually have to be a load

    there (yet).  <br>

    <br>

    It's specifically that property which allows aggressive hoisting

    (e.g. in LICM).  <br><span class="">

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>3) This invariant applies to this load and all (past

              and) future loads of this pointer that also have metadata

              with the same operand (C++ vptr, const members, reference

              members)</div>

          </div>

        </div>

      </div>

    </blockquote></span>

    (I'm assuming that by "this pointer" you actually mean "this

    abstract memory location".)  <br></div></blockquote><div><br></div><div>No; that would remove the ability to have something like @llvm.invariant.barrier. This is a property of the pointer value, not a property of the storage location. In this regard, it seems fundamentally different to the existing !invariant.load.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    For my use cases, I can replace (1) with (3) as long as I give all

    such loads the same operand; I still need (2) as a distinct notion

    though.  Definitions (1) and (3) don't include the ability to do

    aggressive hoisting.  If we could define it in a way it did, we

    might be able to combine all three.  <br><span class="">

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>We could use (1) or (2) for C++, but that would require

              us to insert a lot more invariant barriers (for all

              dynamic type changes, not just for a change to a type with

              a constant member), and would be much less forgiving of

              user errors. How general a system are you imagining?</div>

          </div>

        </div>

      </div>

    </blockquote></span>

    (2) today works specifically because there is no such thing as a

    start/end for the invariantness.  If there were, we'd have to solve

    a potentially hard IPO problem for basic hoisting in LICM.  <br>

    <br>

    I feel we're going to end up needing something which supports the

    following combinations:<br>

    - No start, no end (2 above)<br>

    - Possible start, no end (1 above)<br>

    - Possible start, possible end, possible start, possible end, ....

    (what @llvm.invariant.start/end try to be and fail at)<br>

    <br>

    (Note that for all of these, dereferenceability is orthogonal to the

    invariantness of the memory location.)<br>

    <br>

    Whether those are all the same mechanism or not, who knows.<br>

    <br>

    To throw yet another wrinkle in, the backend has a separate notion

    of invariantness.  It assumes (2), plus dereferenceability within the

    *entire* function.  It's for this reason that many !invariant.loads

    aren't marked invariant in the backend.  <br>

    <br>

    p.s. Before this goes too much further, we should move this to its

    own thread, and CC llvmdev rather than just cfe-dev.  Many

    interested folks (i.e. other frontends) might not be subscribed to

    cfe-dev.</div></blockquote><div><br></div><div>A proper proposal is on the way (which will get its own thread on llvmdev); we were only responding here because Hal happened to bring up the topic before we were ready with our proposal :-)</div></div></div></div>
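<div>As a rough illustration of why the C++ case is framed above as a property of the pointer value rather than of the storage location, the sketch below reuses one piece of storage for two objects with a const member. The type `Config` and the function `replace_and_read` are hypothetical names chosen for this example; it assumes only standard placement new, and it is not taken from the proposal itself:</div>

```cpp
#include <new>

// Hypothetical type for illustration: `limit` is a const data member.
struct Config {
    const int limit;
};

// Creates two objects in turn in the same storage and reads `limit`
// through the pointer that placement new returned for each of them.
// Returns before * 10 + after so that both reads are observable.
int replace_and_read() {
    alignas(Config) unsigned char storage[sizeof(Config)];

    Config *first = new (storage) Config{1};
    int before = first->limit;   // invariant for loads through `first`

    first->~Config();            // end the first object's lifetime
    Config *second = new (storage) Config{2};
    int after = second->limit;   // a different invariant, same address

    return before * 10 + after;
}
```

<div>The storage location holds 1 and later 2, so the location itself is not invariant; what stays fixed is the value seen through each individual pointer for as long as that pointer's object is alive.</div>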