<div dir="ltr">this is definitely a bug in AA.<div><br></div><div><div>  225 <span class="" style="white-space:pre">   </span>      for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; ++I) {</div><div>   226 <span class="" style="white-space:pre"> </span>        const Value *Arg = *I;</div><div>   227 <span class="" style="white-space:pre">        </span>        if (!Arg->getType()->isPointerTy())</div><div>-> 228 <span class="" style="white-space:pre">    </span>          continue;</div><div>   229 <span class="" style="white-space:pre">  </span>        unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);</div><div>   230 <span class="" style="white-space:pre">       </span>        auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI);</div></div><div><br></div><div> AliasAnalysis.cpp:228<br></div><div><br></div><div>It ignores every argument because they are vectors of pointers, not pointers.</div><div><br></div><div><br></div><div>I'm surprised this has not broken anything before. It will never say two intrinsics with vectors of pointers mod/ref each other.</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <span dir="ltr"><<a href="mailto:davide@freebsd.org" target="_blank">davide@freebsd.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">+ Daniel Berlin<br>

<div><div class="h5"><br>

On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev<br>

<<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

> Hello everyone,<br>

><br>

> I think I have found an gvn / alias analysis related bug, but before opening<br>

> an issue on the tracker I wanted to see if I am missing something. I have<br>

> the following testcase:<br>

><br>

>> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out)<br>

>> {<br>

>> entry:<br>

>>   ; Just some temporary storage<br>

>>   %tmp.0 = alloca i32<br>

>>   %tmp.1 = alloca i32<br>

>>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0<br>

>>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1<br>

>>   ; Read from in1 and in2<br>

>>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   ; Store in1 to the allocas<br>

>>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> %tmp,<br>

>> i32 1, <2 x i1> <i1 true, i1 true>);<br>

>>   ; Read in1 from the allocas<br>

>>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   ; Store in2 to the allocas<br>

>>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> %tmp,<br>

>> i32 1, <2 x i1> <i1 true, i1 true>);<br>

>>   ; Read in2 from the allocas<br>

>>   %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   ; Store in2 to out for good measure<br>

>>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0<br>

>>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1<br>

>>   store i32 %tmp.v.1.0, i32* %out<br>

>>   %out.1 = getelementptr i32, i32* %out, i32 1<br>

>>   store i32 %tmp.v.1.1, i32* %out.1<br>

>>   ret void<br>

>> }<br>

><br>

><br>

> It uses a masked scatter operation to store a value to the two allocas and<br>

> then uses a masked gather operation to read that same value. This is done<br>

> twice in a row, with two different values. If I run this code through the<br>

> GVN pass, the second gather (%tmp.v.1) will be deemed to be the same as the<br>

> first gather (%tmp.v.0) and it will be removed. After some debugging, I<br>

> realized that this is happening because the Memory Dependence Analysis<br>

> returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the scatter<br>

> call in between changes the value stored at %tmp. This, in turn, is<br>

> happening because the alias analysis is returning NoModRef for the %tmp.v.1<br>

> and second scatter callsites. The resulting IR produces the wrong result:<br>

><br>

>> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out)<br>

>> {<br>

>> entry:<br>

>>   %tmp.0 = alloca i32<br>

>>   %tmp.1 = alloca i32<br>

>>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0<br>

>>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1<br>

>>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> %tmp,<br>

>> i32 1, <2 x i1> <i1 true, i1 true>)<br>

>>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32<br>

>> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1<br>

>>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> %tmp,<br>

>> i32 1, <2 x i1> <i1 true, i1 true>)<br>

>>   ; The call to masked.gather is gone and now we are using the old value<br>

>> (%tmp.v.0)<br>

>>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0<br>

>>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1<br>

>>   store i32 %tmp.v.1.0, i32* %out<br>

>>   %out.1 = getelementptr i32, i32* %out, i32 1<br>

>>   store i32 %tmp.v.1.1, i32* %out.1<br>

>>   ret void<br>

>> }<br>

><br>

><br>

> The old value read from %tmp is used, instead of the new one. I tested this<br>

> code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) with the<br>

> same result.<br>

><br>

> I should mention that if I replace the second scatter with stores, the issue<br>

> persists. The only way to avoid it is to replace all scatter/gather<br>

> intrinsics with equivalent sets of store/load, in which case the MemDep<br>

> returns the correct dependencies and the GVN pass will not remove the second<br>

> set of loads.<br>

><br>

> So, my question is, is this a bug or am I doing something that I shouldn't<br>

> be in the IR? And if it is a bug, is it the AA analyses that return the<br>

> wrong result (I presume so) or should GVN handle such cases differently?<br>

><br>

> Best regards,<br>

><br>

> Chris<br>

><br>

</div></div>> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

><br>

<span class="HOEnZb"><font color="#888888"><br>

<br>

<br>

--<br>

Davide<br>

<br>

"There are no solved problems; there are only problems that are more<br>

or less solved" -- Henri Poincare<br>

</font></span></blockquote></div><br></div>