[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 29 13:26:48 PDT 2016


We might have specification bug here, but we appear to implement what we 
specified.  argmemonly is specified as only considering pointer typed 
arguments.  It's behavior on vector-of-pointers is unspecified, but 
would seem to fall into the same case as inttoptr of an integer argument 
(i.e. explicitly undefined).  We could consider changing that.



On 08/29/2016 11:59 AM, Daniel Berlin wrote:
> + a few others.
>
> After following this rabbit hole a bit, there are a lot of mutually 
> recursive calls, etc, that may or may not do the right thing with 
> vectors of pointers.
> I can fix *this* particular bug with the attached patch.
>
> However, it's mostly papering over stuff.  Nothing seems to know what 
> to do with a memorylocation that is a vector of pointers. They all 
> expect memorylocation to be a single pointer location.
>
> I would chalk it up to "luck" that this patch fixes the bug.
>
> It's pretty clear that MemoryLocation doesn't fit the needs of a lot 
> of stuff anymore (we hacked AA nodes into it, and lots of stuff now 
> tries to figure out the invariantess of the locations, blah blah 
> blah), but it seems like a big job to figure out what to replace it 
> with that will work for these cases.
>
> (I'm pretty positive if we just make it MemoryLocations, and have 
> everything loop over the locations, the compiler will get a lot larger 
> and a lot slower)
>
> On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org 
> <mailto:dberlin at dberlin.org>> wrote:
>
>     it would also, for that matter, say the same about an array of
>     pointers.
>
>     It's not clear to me what will break if you change this to
>     isPtrOrPtrVectorTy.
>
>     In fact, i know it doesn't fix this bug.
>     It's a pretty deep rabbit hole of things not quite prepared to
>     understand vectors of pointers.
>
>     (we prepare memorylocations of them, but memory locations expect
>     to be one thing, not a group of things, etc).
>
>
>
>
>     On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin
>     <dberlin at dberlin.org <mailto:dberlin at dberlin.org>> wrote:
>
>         this is definitely a bug in AA.
>
>           225      for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I
>         != E; ++I) {
>            226        const Value *Arg = *I;
>            227        if (!Arg->getType()->isPointerTy())
>         -> 228          continue;
>            229        unsigned CS2ArgIdx =
>         std::distance(CS2.arg_begin(), I);
>            230        auto CS2ArgLoc =
>         MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI);
>
>          AliasAnalysis.cpp:228
>
>         It ignores every argument because they are vectors of
>         pointers, not pointers.
>
>
>         I'm surprised this has not broken anything before. It will
>         never say two intrinsics with vectors of pointers mod/ref each
>         other.
>
>
>         On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano
>         <davide at freebsd.org <mailto:davide at freebsd.org>> wrote:
>
>             + Daniel Berlin
>
>             On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev
>             <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
>             wrote:
>             > Hello everyone,
>             >
>             > I think I have found an gvn / alias analysis related
>             bug, but before opening
>             > an issue on the tracker I wanted to see if I am missing
>             something. I have
>             > the following testcase:
>             >
>             >> define spir_kernel void @test(<2 x i32*> %in1, <2 x
>             i32*> %in2, i32* %out)
>             >> {
>             >> entry:
>             >>   ; Just some temporary storage
>             >>   %tmp.0 = alloca i32
>             >>   %tmp.1 = alloca i32
>             >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0,
>             i32 0
>             >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1,
>             i32 1
>             >>   ; Read from in1 and in2
>             >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2
>             x i32*> %in1, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2
>             x i32*> %in2, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   ; Store in1 to the allocas
>             >>   call void @llvm.masked.scatter.v2i32(<2 x i32>
>             %in1.v, <2 x i32*> %tmp,
>             >> i32 1, <2 x i1> <i1 true, i1 true>);
>             >>   ; Read in1 from the allocas
>             >>   %tmp.v.0 = call <2 x i32>
>             @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   ; Store in2 to the allocas
>             >>   call void @llvm.masked.scatter.v2i32(<2 x i32>
>             %in2.v, <2 x i32*> %tmp,
>             >> i32 1, <2 x i1> <i1 true, i1 true>);
>             >>   ; Read in2 from the allocas
>             >>   %tmp.v.1 = call <2 x i32>
>             @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   ; Store in2 to out for good measure
>             >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0
>             >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1
>             >>   store i32 %tmp.v.1.0, i32* %out
>             >>   %out.1 = getelementptr i32, i32* %out, i32 1
>             >>   store i32 %tmp.v.1.1, i32* %out.1
>             >>   ret void
>             >> }
>             >
>             >
>             > It uses a masked scatter operation to store a value to
>             the two allocas and
>             > then uses a masked gather operation to read that same
>             value. This is done
>             > twice in a row, with two different values. If I run this
>             code through the
>             > GVN pass, the second gather (%tmp.v.1) will be deemed to
>             be the same as the
>             > first gather (%tmp.v.0) and it will be removed. After
>             some debugging, I
>             > realized that this is happening because the Memory
>             Dependence Analysis
>             > returns %tmp.v.0 as the Def dependency for %tmp.v.1,
>             even though the scatter
>             > call in between changes the value stored at %tmp. This,
>             in turn, is
>             > happening because the alias analysis is returning
>             NoModRef for the %tmp.v.1
>             > and second scatter callsites. The resulting IR produces
>             the wrong result:
>             >
>             >> define spir_kernel void @test(<2 x i32*> %in1, <2 x
>             i32*> %in2, i32* %out)
>             >> {
>             >> entry:
>             >>   %tmp.0 = alloca i32
>             >>   %tmp.1 = alloca i32
>             >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0,
>             i32 0
>             >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1,
>             i32 1
>             >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2
>             x i32*> %in1, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2
>             x i32*> %in2, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   call void @llvm.masked.scatter.v2i32(<2 x i32>
>             %in1.v, <2 x i32*> %tmp,
>             >> i32 1, <2 x i1> <i1 true, i1 true>)
>             >>   %tmp.v.0 = call <2 x i32>
>             @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32
>             >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>             >>   call void @llvm.masked.scatter.v2i32(<2 x i32>
>             %in2.v, <2 x i32*> %tmp,
>             >> i32 1, <2 x i1> <i1 true, i1 true>)
>             >>   ; The call to masked.gather is gone and now we are
>             using the old value
>             >> (%tmp.v.0)
>             >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0
>             >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1
>             >>   store i32 %tmp.v.1.0, i32* %out
>             >>   %out.1 = getelementptr i32, i32* %out, i32 1
>             >>   store i32 %tmp.v.1.1, i32* %out.1
>             >>   ret void
>             >> }
>             >
>             >
>             > The old value read from %tmp is used, instead of the new
>             one. I tested this
>             > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip
>             (84cb7f4) with the
>             > same result.
>             >
>             > I should mention that if I replace the second scatter
>             with stores, the issue
>             > persists. The only way to avoid it is to replace all
>             scatter/gather
>             > intrinsics with equivalent sets of store/load, in which
>             case the MemDep
>             > returns the correct dependencies and the GVN pass will
>             not remove the second
>             > set of loads.
>             >
>             > So, my question is, is this a bug or am I doing
>             something that I shouldn't
>             > be in the IR? And if it is a bug, is it the AA analyses
>             that return the
>             > wrong result (I presume so) or should GVN handle such
>             cases differently?
>             >
>             > Best regards,
>             >
>             > Chris
>             >
>             > _______________________________________________
>             > LLVM Developers mailing list
>             > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>             >
>
>
>
>             --
>             Davide
>
>             "There are no solved problems; there are only problems
>             that are more
>             or less solved" -- Henri Poincare
>
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/6d72b852/attachment-0001.html>


More information about the llvm-dev mailing list