<br><br><div class="gmail_quote">On Tue Dec 02 2014 at 10:03:23 AM Andrew Trick <<a href="mailto:atrick@apple.com">atrick@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite"><div>On Dec 1, 2014, at 3:22 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:</div><br><div>

  
  <div bgcolor="#FFFFFF" text="#000000">

    <br>

    <div>On 12/01/2014 02:42 PM, Andrew Trick

      wrote:<br>

    </div>

    <blockquote type="cite">

      
      <br>

      <div>

        <blockquote type="cite">

          <div>On Dec 1, 2014, at 2:21 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>

            wrote:</div>

          <br>

          <div>

            
            <div bgcolor="#FFFFFF" text="#000000"> <br>

              <div>On 12/01/2014 11:14 AM,

                Andrew Trick wrote:<br>

              </div>

              <blockquote type="cite">

                
                <br>

                <div>

                  <blockquote type="cite">

                    <div>On Oct 21, 2014, at 4:03 PM, Philip

                      Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>


                      wrote:</div>

                    <br>

                    <div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">Sanjoy

                        made a good point.  We don't actually need a new

                        variant of "invariant.start".  Simply using an

                        invariant.start with no uses gives us a notion

                        of an invariant region with no end.  (Since the

                        result doesn't escape, there can be no end

                        hidden inside a function call.)   This seems

                        like a simple notion to exploit and is strict

                        step forward from where we are, without a new

                        intrinsic.</span></div>

                  </blockquote>

                </div>

                <br>

                <div>

                  <div>I'm coming back to this because it is a

                    sticking point for me in all of the proposals that

                    were floated informally on the commits list. I would

                    really like this to work, but can't prove that it's

                    safe.</div>

                </div>

              </blockquote>

              As I said in the other thread, !invariant.load and

              llvm.invariant.* are not the same.  This thread is

              discussing the later, not the former.  The other thread is

              discussing the former, but not the later.  <br>

              <blockquote type="cite">

                <div>

                  <div><br>

                  </div>

                  <div>Given:</div>

                  <div><br>

                  </div>

                  <div>%a = newArray()</div>

                  <div>%pLen = gep(%a, <length offset>)</div>

                  <div>invariant.start(<size>, %pLen)</div>

                  <div>%len = load %pLen</div>

                  <div><br>

                  </div>

                  <div>%o = foo() // may deallocate %a</div>

                  <div>store %something</div>

                  <div>load %o</div>

                  <div><br>

                  </div>

                  <div>I want to claim that the LLVM optimizer

                    cannot do the following (but don't have a complete

                    line of reasoning yet):</div>

                  <div><br>

                  </div>

                  <div>%a = newArray()</div>

                  <div>%pLen = gep(%a, <length offset>)</div>

                  <div>invariant.start(<size>, %pLen)</div>

                  <div>%len = load %pLen</div>

                  <div><br>

                  </div>

                  <div>%o = foo() // may deallocate %a</div>

                  <div>if ptrtoint(%o) == ptrtoint(%pLen)</div>

                  <div>  load %pLen</div>

                  <div>  store %something // Oops... this might

                    alias</div>

                  <div>else</div>

                  <div>  store %something</div>

                  <div>  load %o</div>

                </div>

              </blockquote>

              Just to make sure we're on the same page, it *would* be

              legal for the optimizer to construct:<br>

              <div>%a = newArray()</div>

              <div>%pLen = gep(%a, <length offset>)</div>

              <div>invariant.start(<size>, %pLen)</div>

              <div>%len = load %pLen</div>

              <div><br>

              </div>

              <div>%o = foo() // may deallocate %a</div>

              <div>if ptrtoint(%o) == ptrtoint(%pLen)</div>

              <div>  store %something // Oops... this might

                alias<br>

                  load %pLen <--- This is now after the store<br>

              </div>

              <div>else</div>

              <div>  store %something</div>

              <div>  load %o<br>

                <br>

                Are we in agreement here?  <br>

              </div>

            </div>

          </div>

        </blockquote>

        <div><br>

        </div>

        It’s fine in the sense that memory access has not been

        reordered… yet.</div>

    </blockquote>

    I think I've actually convinced myself my example is not okay, but

    for a completely different reason.  :)  See my last comment below.<br>

    <blockquote type="cite">

      <div><br>

      </div>

      <div>But is there an implicit invariant.end(%pLen) at the call to

        foo(), at which point the array can be freed and another object

        allocated in its place? I don’t think so. </div>

    </blockquote>

    Given the current semantics, if the result of the invariant.start is

    not passed to foo, we can assume foo does not contain an

    invariant.end.  However, for foo to allocate a new object in the

    same memory, it must write to that memory.  Doing so without ending

    the previous invariant region is ill defined.  As such, your example

    isn't a valid test.  <br>

    <br>

    We have two cases:<br>

    a) The invariant region hasn't ended.  As a result, the input IR is

    ill defined and the problematic reordering can happen.  <br>

    b) The invariant region ended (and we either encountered a

    llvm.invariant.end, or saw the result of invariant.start passed to

    foo). In this case, the location is no longer invariant and the

    reordering isn't legal and wouldn't happen.<br>

    <br>

    As I think came up before, this would imply that an invariant.end

    can't be dropped without dropping it's corresponding

    invariant.start.  That's problematic.  <br>

    <br>

    <blockquote type="cite">

      <div>The way I interpreted the conclusion of this email thread is

        that invariant.start can be used without invariant.end. </div>

    </blockquote>

    We had a proposal for this, but I don't believe it's actually been

    implemented yet.  I believe this proposal is still consistent with

    everything I said above though.  <br>

    <blockquote type="cite">

      <div>Since the token returned by the intrinsic never escapes, you

        can just assume the memory is invariant at all uses. The problem

        here is that the optimizer could (in theory) introduce new uses

        of the pointer after the object lifetime. This is the same

        problem that you yourself have raised. I hoped we could sidestep

        the problem because it crops up with any representation that

        attaches invariance to a pointer value. If we have an answer for

        this, then we can easily debate other representations like

        broadening !invariant.load metadata semantics and introducing

        new intrinsics.</div>

    </blockquote>

    I agree there is a general issue here.  However, I don't actually

    think it's an issue of invariance.  Instead, I see this as being a

    problem of *derefenceability*.  Regardless of the 'invariant-ness'

    of the memory in question, it's problematic for the optimizer to

    insert uses after a deallocation because the memory transitions from

    dereferenceable to non-dereferenceable.  (i.e. Consider a malloc

    routine allocates one page per object, and a free which protects the

    page freed.)  Any transformation which inserts such a load is

    dangerous.  <br>

    <br>

    What your example really shows is that we can have two names for the

    same *memory address*.  One of those names can be dereferenceable

    while the other is not.  <br>

    <br>

    The dynamic check of the pointer values essentially allows the

    optimizer to chose which of the two names for the same physical

    address it wants to use.  I think that is valid, but it's odd to say

    the least.  The particular example you've shown isn't legal since

    the optimizer is choosing to insert a load to a non-dereferenceable

    location and thus could introduce a fault.</div></div></blockquote><div><br></div></div></div><div style="word-wrap:break-word"><div>Sorry, I didn’t follow your response. I was showing that the optimizer can theoretically substitute the use of one pointer value (%o) with another pointer value (%pLen) after checking value equivalence. Doing that creates multiple uses of the same pointer value where not all uses dereference the same memory object. To me, that means we cannot associate a pointer value/address with an object-specific attribute, like invariance, without tying it to a closed lifetime.</div></div></blockquote><div><br></div><div><br></div><div>Agreed.</div><div><br></div><div><br>This is actually one of the nice things memory ssa provides/provided GCC.  You got versioning of the global memory state for free, and lifetime info embedded directly into the IR in the normal SSA way.</div><div><br></div><div>For a given instruction, the state of the virtual uses for the instruction was sufficient to identify whether the memory state was the same, so value substitution was easy.</div><div><br></div><div>You could also do the inverse, and prove a better result than memory ssa already had given you - that the virtual uses were really the same due to non-aliasing.</div><div><br></div><div>This made value numbering loads/stores and substitution very easy.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>I wish we could prevent the optimizer from doing this.</div><div><br></div><div>-Andy</div></div><div style="word-wrap:break-word"><div><br></div></div>

</blockquote></div>