[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Daniel Berlin via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 29 13:59:49 PDT 2016


Okay, so then it sounds like, for now, the right fix is to stop marking
masked.gather and masked.scatter with intrarg* options.

On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com>
wrote:

> We might have specification bug here, but we appear to implement what we
> specified.  argmemonly is specified as only considering pointer typed
> arguments.  It's behavior on vector-of-pointers is unspecified, but would
> seem to fall into the same case as inttoptr of an integer argument (i.e.
> explicitly undefined).  We could consider changing that.
>
>
>
> On 08/29/2016 11:59 AM, Daniel Berlin wrote:
>
> + a few others.
>
> After following this rabbit hole a bit, there are a lot of mutually
> recursive calls, etc, that may or may not do the right thing with vectors
> of pointers.
> I can fix *this* particular bug with the attached patch.
>
> However, it's mostly papering over stuff.  Nothing seems to know what to
> do with a memorylocation that is a vector of pointers. They all expect
> memorylocation to be a single pointer location.
>
> I would chalk it up to "luck" that this patch fixes the bug.
>
> It's pretty clear that MemoryLocation doesn't fit the needs of a lot of
> stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to
> figure out the invariantess of the locations, blah blah blah), but it seems
> like a big job to figure out what to replace it with that will work for
> these cases.
>
> (I'm pretty positive if we just make it MemoryLocations, and have
> everything loop over the locations, the compiler will get a lot larger and
> a lot slower)
>
> On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org>
> wrote:
>
>> it would also, for that matter, say the same about an array of pointers.
>>
>> It's not clear to me what will break if you change this to
>> isPtrOrPtrVectorTy.
>>
>> In fact, i know it doesn't fix this bug.
>> It's a pretty deep rabbit hole of things not quite prepared to understand
>> vectors of pointers.
>>
>> (we prepare memorylocations of them, but memory locations expect to be
>> one thing, not a group of things, etc).
>>
>>
>>
>>
>> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org>
>> wrote:
>>
>>> this is definitely a bug in AA.
>>>
>>>   225      for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E;
>>> ++I) {
>>>    226        const Value *Arg = *I;
>>>    227        if (!Arg->getType()->isPointerTy())
>>> -> 228          continue;
>>>    229        unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);
>>>    230        auto CS2ArgLoc = MemoryLocation::getForArgument(CS2,
>>> CS2ArgIdx, TLI);
>>>
>>>  AliasAnalysis.cpp:228
>>>
>>> It ignores every argument because they are vectors of pointers, not
>>> pointers.
>>>
>>>
>>> I'm surprised this has not broken anything before. It will never say two
>>> intrinsics with vectors of pointers mod/ref each other.
>>>
>>>
>>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org>
>>> wrote:
>>>
>>>> + Daniel Berlin
>>>>
>>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>> > Hello everyone,
>>>> >
>>>> > I think I have found an gvn / alias analysis related bug, but before
>>>> opening
>>>> > an issue on the tracker I wanted to see if I am missing something. I
>>>> have
>>>> > the following testcase:
>>>> >
>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32*
>>>> %out)
>>>> >> {
>>>> >> entry:
>>>> >>   ; Just some temporary storage
>>>> >>   %tmp.0 = alloca i32
>>>> >>   %tmp.1 = alloca i32
>>>> >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>> >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>> >>   ; Read from in1 and in2
>>>> >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1,
>>>> i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2,
>>>> i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   ; Store in1 to the allocas
>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*>
>>>> %tmp,
>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>> >>   ; Read in1 from the allocas
>>>> >>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>> %tmp, i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   ; Store in2 to the allocas
>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*>
>>>> %tmp,
>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>> >>   ; Read in2 from the allocas
>>>> >>   %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>> %tmp, i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   ; Store in2 to out for good measure
>>>> >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0
>>>> >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1
>>>> >>   store i32 %tmp.v.1.0, i32* %out
>>>> >>   %out.1 = getelementptr i32, i32* %out, i32 1
>>>> >>   store i32 %tmp.v.1.1, i32* %out.1
>>>> >>   ret void
>>>> >> }
>>>> >
>>>> >
>>>> > It uses a masked scatter operation to store a value to the two
>>>> allocas and
>>>> > then uses a masked gather operation to read that same value. This is
>>>> done
>>>> > twice in a row, with two different values. If I run this code through
>>>> the
>>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the same
>>>> as the
>>>> > first gather (%tmp.v.0) and it will be removed. After some debugging,
>>>> I
>>>> > realized that this is happening because the Memory Dependence Analysis
>>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the
>>>> scatter
>>>> > call in between changes the value stored at %tmp. This, in turn, is
>>>> > happening because the alias analysis is returning NoModRef for the
>>>> %tmp.v.1
>>>> > and second scatter callsites. The resulting IR produces the wrong
>>>> result:
>>>> >
>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32*
>>>> %out)
>>>> >> {
>>>> >> entry:
>>>> >>   %tmp.0 = alloca i32
>>>> >>   %tmp.1 = alloca i32
>>>> >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>> >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>> >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1,
>>>> i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2,
>>>> i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*>
>>>> %tmp,
>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>> >>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>> %tmp, i32
>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*>
>>>> %tmp,
>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>> >>   ; The call to masked.gather is gone and now we are using the old
>>>> value
>>>> >> (%tmp.v.0)
>>>> >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0
>>>> >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1
>>>> >>   store i32 %tmp.v.1.0, i32* %out
>>>> >>   %out.1 = getelementptr i32, i32* %out, i32 1
>>>> >>   store i32 %tmp.v.1.1, i32* %out.1
>>>> >>   ret void
>>>> >> }
>>>> >
>>>> >
>>>> > The old value read from %tmp is used, instead of the new one. I
>>>> tested this
>>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4)
>>>> with the
>>>> > same result.
>>>> >
>>>> > I should mention that if I replace the second scatter with stores,
>>>> the issue
>>>> > persists. The only way to avoid it is to replace all scatter/gather
>>>> > intrinsics with equivalent sets of store/load, in which case the
>>>> MemDep
>>>> > returns the correct dependencies and the GVN pass will not remove the
>>>> second
>>>> > set of loads.
>>>> >
>>>> > So, my question is, is this a bug or am I doing something that I
>>>> shouldn't
>>>> > be in the IR? And if it is a bug, is it the AA analyses that return
>>>> the
>>>> > wrong result (I presume so) or should GVN handle such cases
>>>> differently?
>>>> >
>>>> > Best regards,
>>>> >
>>>> > Chris
>>>> >
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Davide
>>>>
>>>> "There are no solved problems; there are only problems that are more
>>>> or less solved" -- Henri Poincare
>>>>
>>>
>>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/0db527f5/attachment.html>


More information about the llvm-dev mailing list