<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <span dir="ltr">&lt;<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;</span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
I'm wondering if there's a problematic interaction with CSE here.
Consider this example in pseudo LLVM IR:<br>
v1 = load i64, %p, !invariant.group !Type1<br>
; I called destructor/placement new for the same type, but that
optimized entirely away<br>
p2 = invariant.group.barrier(p1)<br>
if (p1 != p2) return.<br>
store i64 0, %p2, !invariant.group !Type1<br>
v2 = load i64, %p2, !invariant.group !Type1<br>
ret i64 v1 - v2<br>
<br>
(Assume that !Type1 is used to describe a write-once integer field
within some class. Not all instances have the same integer value.)<br>
<br>
Having CSE turn this into:<br>
v1 = load i64, %p, !invariant.group !Type1<br>
p2 = invariant.group.barrier(p1)<br>
if (p1 != p2) return.<br>
store i64 0, %p1, !invariant.group !Type1<br>
v2 = load i64, %p1, !invariant.group !Type1<br>
ret i64 v1 - v2<br>
<br>
And then GVN turn this into:<br>
v1 = load i64, %p, !invariant.group !Type1<br>
p2 = invariant.group.barrier(p1)<br>
if (p1 != p2) return.<br>
ret i64 v1 - v1 (-> 0)<br>
<br>
This doesn't seem like the result I'd expect. Is there something
about my initial IR which is wrong/invalid in some way? Is the
invariant.group required to be specific to a single bitpattern
across all usages within a function/module/context? That would be
reasonable, but I don't think it is explicitly stated right now. It also
makes !invariant.group effectively useless for describing constant
fields which are constant per instance rather than per-class. <br></div></blockquote><div><br></div><div>Yes, this family of examples scares me. :) It seems we've discovered a new device for testing IR soundness. We used it to build a test case that shows that 'readonly' on arguments without 'nocapture' doesn't let you forward stores across such a call.</div><div><br></div><div>Consider this pseudo-IR and some possible transforms that I would expect to be semantics-preserving:</div><div><br></div><div>void f(i32* readonly %a, i32* %b) {</div><div> llvm.assume(%a == %b)</div><div> store i32 42, i32* %b<br>}</div><div> ...</div> %p = alloca i32<div> store i32 13, i32* %p</div><div> call f(i32* readonly %p, i32* %p)</div><div> %r = load i32, i32* %p</div><div><br></div><div><div>; Propagate llvm.assume info</div><div>void f(i32* readonly %a, i32* %b) {</div><div><div> store i32 42, i32* %a<br></div></div><div>}<br></div></div><div><div> ...</div> %p = alloca i32<div> store i32 13, i32* %p</div><div> call f(i32* readonly %p, i32* %p)</div><div> %r = load i32, i32* %p</div></div><div><br></div><div>; Delete dead args</div><div>void f(i32* readonly %a) {</div><div> store i32 42, i32* %a</div><div>}</div><div><div> ...</div><div> %p = alloca i32</div><div> store i32 13, i32* %p</div><div> call f(i32* readonly %p)</div><div> %r = load i32, i32* %p</div></div><div><br></div><div>; Forward store %p to load %p, since the only use of %p is readonly</div><div><div>void f(i32* readonly %a) {</div><div> store i32 42, i32* %a</div><div>}</div><div><div> ...</div> %p = alloca i32<div> call f(i32* readonly %p)</div><div> %r = i32 13</div></div></div><div><br></div><div>Today LLVM will not do the final transform because it requires readonly on the entire function, or nocapture on the argument. nocapture cannot be inferred due to the assume comparison.</div></div></div></div>
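<div><br></div><div>For anyone who wants to poke at the starting program outside of IR, here is a rough C++ analogue. It is only a sketch under some loud assumptions: 'const int*' is a loose stand-in for 'readonly' (it is not the same guarantee), the 'if' stands in for llvm.assume (which would instead make the unequal case undefined), and the names f and run_example are made up for illustration. It shows that the value read after the call is the one stored through the aliasing pointer, so forwarding the earlier store of 13 would change the result.</div>

```cpp
// Hypothetical analogue of the pseudo-IR function f. Nothing is written
// through 'a' itself, but 'b' may alias it, and the write goes through 'b'.
void f(const int* a, int* b) {
    if (a == b) {   // stand-in for llvm.assume(%a == %b)
        *b = 42;    // store i32 42, i32* %b
    }
}

// Mirrors the caller: store 13, call f with both arguments aliased, reload.
int run_example() {
    int p = 13;     // store i32 13, i32* %p
    f(&p, &p);      // call f(i32* readonly %p, i32* %p)
    return p;       // %r = load i32, i32* %p
}
```

<div>Calling run_example() observes the store made inside f, so it yields 42 rather than 13.</div>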