<div dir="ltr">Okay, so then it sounds like, for now, the right fix is to stop marking masked.gather and masked.scatter with intrarg* options.<div><br><div class="gmail_quote"><div dir="ltr">On Mon, Aug 29, 2016, 1:26 PM Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>We might have specification bug here, but we appear to implement
what we specified. argmemonly is specified as only considering
pointer typed arguments. It's behavior on vector-of-pointers is
unspecified, but would seem to fall into the same case as inttoptr
of an integer argument (i.e. explicitly undefined). We could
consider changing that.<br>
</p></div><div bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<br>
<div>On 08/29/2016 11:59 AM, Daniel Berlin
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">+ a few others.
<div><br>
</div>
<div>After following this rabbit hole a bit, there are a lot of
mutually recursive calls, etc, that may or may not do the
right thing with vectors of pointers.</div>
<div>I can fix *this* particular bug with the attached patch.</div>
<div><br>
</div>
<div>However, it's mostly papering over stuff. Nothing seems to
know what to do with a memorylocation that is a vector of
pointers. They all expect memorylocation to be a single
pointer location.</div>
<div><br>
I would chalk it up to "luck" that this patch fixes the bug.</div>
<div><br>
</div>
<div>It's pretty clear that MemoryLocation doesn't fit the needs
of a lot of stuff anymore (we hacked AA nodes into it, and
lots of stuff now tries to figure out the invariantess of the
locations, blah blah blah), but it seems like a big job to
figure out what to replace it with that will work for these
cases.</div>
<div><br>
</div>
<div>(I'm pretty positive if we just make it MemoryLocations,
and have everything loop over the locations, the compiler will
get a lot larger and a lot slower)</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Aug 29, 2016 at 9:10 AM, Daniel
Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">it would also, for that matter, say the same
about an array of pointers.
<div><br>
</div>
<div>It's not clear to me what will break if you change
this to isPtrOrPtrVectorTy.</div>
<div><br>
</div>
<div>In fact, i know it doesn't fix this bug.</div>
<div>It's a pretty deep rabbit hole of things not quite
prepared to understand vectors of pointers.</div>
<div><br>
</div>
<div>(we prepare memorylocations of them, but memory
locations expect to be one thing, not a group of things,
etc).</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Aug 29, 2016 at 8:58
AM, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">this is definitely a bug in AA.
<div><br>
</div>
<div>
<div> 225 <span style="white-space:pre-wrap"> </span>
for (auto I = CS2.arg_begin(), E =
CS2.arg_end(); I != E; ++I) {</div>
<div> 226 <span style="white-space:pre-wrap"> </span>
const Value *Arg = *I;</div>
<div> 227 <span style="white-space:pre-wrap"> </span>
if
(!Arg->getType()->isPointerTy(<wbr>))</div>
<div>-> 228 <span style="white-space:pre-wrap"> </span>
continue;</div>
<div> 229 <span style="white-space:pre-wrap"> </span>
unsigned CS2ArgIdx =
std::distance(CS2.arg_begin(), I);</div>
<div> 230 <span style="white-space:pre-wrap"> </span>
auto CS2ArgLoc =
MemoryLocation::<wbr>getForArgument(CS2,
CS2ArgIdx, TLI);</div>
</div>
<div><br>
</div>
<div> AliasAnalysis.cpp:228<br>
</div>
<div><br>
</div>
<div>It ignores every argument because they are
vectors of pointers, not pointers.</div>
<div><br>
</div>
<div><br>
</div>
<div>I'm surprised this has not broken anything
before. It will never say two intrinsics with
vectors of pointers mod/ref each other.</div>
<div><br>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Aug 29,
2016 at 7:36 AM, Davide Italiano <span dir="ltr"><<a href="mailto:davide@freebsd.org" target="_blank">davide@freebsd.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">+ Daniel
Berlin<br>
<div>
<div><br>
On Mon, Aug 29, 2016 at 3:42 PM,
Chris Sakalis via llvm-dev<br>
<<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
> Hello everyone,<br>
><br>
> I think I have found an gvn /
alias analysis related bug, but
before opening<br>
> an issue on the tracker I
wanted to see if I am missing
something. I have<br>
> the following testcase:<br>
><br>
>> define spir_kernel void
@test(<2 x i32*> %in1, <2 x
i32*> %in2, i32* %out)<br>
>> {<br>
>> entry:<br>
>> ; Just some temporary
storage<br>
>> %tmp.0 = alloca i32<br>
>> %tmp.1 = alloca i32<br>
>> %tmp.i = insertelement
<2 x i32*> undef, i32* %tmp.0,
i32 0<br>
>> %tmp = insertelement
<2 x i32*> %tmp.i, i32*
%tmp.1, i32 1<br>
>> ; Read from in1 and in2<br>
>> %in1.v = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %in1, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> %in2.v = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %in2, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> ; Store in1 to the
allocas<br>
>> call void
@llvm.masked.scatter.v2i32(<2 x
i32> %in1.v, <2 x i32*>
%tmp,<br>
>> i32 1, <2 x i1>
<i1 true, i1 true>);<br>
>> ; Read in1 from the
allocas<br>
>> %tmp.v.0 = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %tmp, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> ; Store in2 to the
allocas<br>
>> call void
@llvm.masked.scatter.v2i32(<2 x
i32> %in2.v, <2 x i32*>
%tmp,<br>
>> i32 1, <2 x i1>
<i1 true, i1 true>);<br>
>> ; Read in2 from the
allocas<br>
>> %tmp.v.1 = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %tmp, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> ; Store in2 to out for
good measure<br>
>> %tmp.v.1.0 =
extractelement <2 x i32>
%tmp.v.1, i32 0<br>
>> %tmp.v.1.1 =
extractelement <2 x i32>
%tmp.v.1, i32 1<br>
>> store i32 %tmp.v.1.0,
i32* %out<br>
>> %out.1 = getelementptr
i32, i32* %out, i32 1<br>
>> store i32 %tmp.v.1.1,
i32* %out.1<br>
>> ret void<br>
>> }<br>
><br>
><br>
> It uses a masked scatter
operation to store a value to the
two allocas and<br>
> then uses a masked gather
operation to read that same value.
This is done<br>
> twice in a row, with two
different values. If I run this code
through the<br>
> GVN pass, the second gather
(%tmp.v.1) will be deemed to be the
same as the<br>
> first gather (%tmp.v.0) and it
will be removed. After some
debugging, I<br>
> realized that this is happening
because the Memory Dependence
Analysis<br>
> returns %tmp.v.0 as the Def
dependency for %tmp.v.1, even though
the scatter<br>
> call in between changes the
value stored at %tmp. This, in turn,
is<br>
> happening because the alias
analysis is returning NoModRef for
the %tmp.v.1<br>
> and second scatter callsites.
The resulting IR produces the wrong
result:<br>
><br>
>> define spir_kernel void
@test(<2 x i32*> %in1, <2 x
i32*> %in2, i32* %out)<br>
>> {<br>
>> entry:<br>
>> %tmp.0 = alloca i32<br>
>> %tmp.1 = alloca i32<br>
>> %tmp.i = insertelement
<2 x i32*> undef, i32* %tmp.0,
i32 0<br>
>> %tmp = insertelement
<2 x i32*> %tmp.i, i32*
%tmp.1, i32 1<br>
>> %in1.v = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %in1, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> %in2.v = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %in2, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> call void
@llvm.masked.scatter.v2i32(<2 x
i32> %in1.v, <2 x i32*>
%tmp,<br>
>> i32 1, <2 x i1>
<i1 true, i1 true>)<br>
>> %tmp.v.0 = call <2 x
i32>
@llvm.masked.gather.v2i32(<2 x
i32*> %tmp, i32<br>
>> 1, <2 x i1> <i1
true, i1 true>, <2 x i32>
undef) #1<br>
>> call void
@llvm.masked.scatter.v2i32(<2 x
i32> %in2.v, <2 x i32*>
%tmp,<br>
>> i32 1, <2 x i1>
<i1 true, i1 true>)<br>
>> ; The call to
masked.gather is gone and now we are
using the old value<br>
>> (%tmp.v.0)<br>
>> %tmp.v.1.0 =
extractelement <2 x i32>
%tmp.v.0, i32 0<br>
>> %tmp.v.1.1 =
extractelement <2 x i32>
%tmp.v.0, i32 1<br>
>> store i32 %tmp.v.1.0,
i32* %out<br>
>> %out.1 = getelementptr
i32, i32* %out, i32 1<br>
>> store i32 %tmp.v.1.1,
i32* %out.1<br>
>> ret void<br>
>> }<br>
><br>
><br>
> The old value read from %tmp is
used, instead of the new one. I
tested this<br>
> code using `opt -gvn`, with
LLVM 3.8.1. I also tried tip
(84cb7f4) with the<br>
> same result.<br>
><br>
> I should mention that if I
replace the second scatter with
stores, the issue<br>
> persists. The only way to avoid
it is to replace all scatter/gather<br>
> intrinsics with equivalent sets
of store/load, in which case the
MemDep<br>
> returns the correct
dependencies and the GVN pass will
not remove the second<br>
> set of loads.<br>
><br>
> So, my question is, is this a
bug or am I doing something that I
shouldn't<br>
> be in the IR? And if it is a
bug, is it the AA analyses that
return the<br>
> wrong result (I presume so) or
should GVN handle such cases
differently?<br>
><br>
> Best regards,<br>
><br>
> Chris<br>
><br>
</div>
</div>
> ______________________________<wbr>_________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
><br>
<span><font color="#888888"><br>
<br>
<br>
--<br>
Davide<br>
<br>
"There are no solved problems; there
are only problems that are more<br>
or less solved" -- Henri Poincare<br>
</font></span></blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></blockquote></div></div></div>