[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics

Chris Sakalis via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 31 01:58:56 PDT 2016


Thank you for the quick fix, I can no longer reproduce the issue. As far a
releases go, I am guessing that this is going to be in 4.0?

Best,

Chris

On Tue, Aug 30, 2016 at 9:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

> Yeah, i just hope it doesn't regress scatter/gather vector code badly.
> But at least it's correct now?
>
>
> On Tue, Aug 30, 2016 at 1:11 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>>
>> ------------------------------
>>
>> *From: *"Daniel Berlin" <dberlin at dberlin.org>
>> *To: *"Philip Reames" <listmail at philipreames.com>, "Davide Italiano" <
>> davide at freebsd.org>, "Chandler Carruth" <chandlerc at gmail.com>
>> *Cc: *"Chris Sakalis" <chrissakalis at gmail.com>, "David Majnemer" <
>> david.majnemer at gmail.com>, "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <
>> llvm-dev at lists.llvm.org>
>> *Sent: *Tuesday, August 30, 2016 3:07:01 PM
>> *Subject: *Re: [llvm-dev] GVN / Alias Analysis issue with
>> llvm.masked.scatter/gather intrinsics
>>
>> This is now committed and a test added to GVN.
>>
>> Thanks!
>>
>> I suspect that, in practice, we'll get little benefit from handling this
>> until our AA passes learn how to deal with (i.e. look back through) pointer
>> vectors.
>>
>>  -Hal
>>
>>
>>
>> On Mon, Aug 29, 2016 at 1:59 PM, Daniel Berlin <dberlin at dberlin.org>
>> wrote:
>>
>>> Okay, so then it sounds like, for now, the right fix is to stop marking
>>> masked.gather and masked.scatter with intrarg* options.
>>>
>>> On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com>
>>> wrote:
>>>
>>>> We might have specification bug here, but we appear to implement what
>>>> we specified.  argmemonly is specified as only considering pointer typed
>>>> arguments.  It's behavior on vector-of-pointers is unspecified, but would
>>>> seem to fall into the same case as inttoptr of an integer argument (i.e.
>>>> explicitly undefined).  We could consider changing that.
>>>>
>>>>
>>>>
>>>> On 08/29/2016 11:59 AM, Daniel Berlin wrote:
>>>>
>>>> + a few others.
>>>>
>>>> After following this rabbit hole a bit, there are a lot of mutually
>>>> recursive calls, etc, that may or may not do the right thing with vectors
>>>> of pointers.
>>>> I can fix *this* particular bug with the attached patch.
>>>>
>>>> However, it's mostly papering over stuff.  Nothing seems to know what
>>>> to do with a memorylocation that is a vector of pointers. They all expect
>>>> memorylocation to be a single pointer location.
>>>>
>>>> I would chalk it up to "luck" that this patch fixes the bug.
>>>>
>>>> It's pretty clear that MemoryLocation doesn't fit the needs of a lot of
>>>> stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to
>>>> figure out the invariantess of the locations, blah blah blah), but it seems
>>>> like a big job to figure out what to replace it with that will work for
>>>> these cases.
>>>>
>>>> (I'm pretty positive if we just make it MemoryLocations, and have
>>>> everything loop over the locations, the compiler will get a lot larger and
>>>> a lot slower)
>>>>
>>>> On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org>
>>>> wrote:
>>>>
>>>>> it would also, for that matter, say the same about an array of
>>>>> pointers.
>>>>>
>>>>> It's not clear to me what will break if you change this to
>>>>> isPtrOrPtrVectorTy.
>>>>>
>>>>> In fact, i know it doesn't fix this bug.
>>>>> It's a pretty deep rabbit hole of things not quite prepared to
>>>>> understand vectors of pointers.
>>>>>
>>>>> (we prepare memorylocations of them, but memory locations expect to be
>>>>> one thing, not a group of things, etc).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org>
>>>>> wrote:
>>>>>
>>>>>> this is definitely a bug in AA.
>>>>>>
>>>>>>   225      for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E;
>>>>>> ++I) {
>>>>>>    226        const Value *Arg = *I;
>>>>>>    227        if (!Arg->getType()->isPointerTy())
>>>>>> -> 228          continue;
>>>>>>    229        unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);
>>>>>>    230        auto CS2ArgLoc = MemoryLocation::getForArgument(CS2,
>>>>>> CS2ArgIdx, TLI);
>>>>>>
>>>>>>  AliasAnalysis.cpp:228
>>>>>>
>>>>>> It ignores every argument because they are vectors of pointers, not
>>>>>> pointers.
>>>>>>
>>>>>>
>>>>>> I'm surprised this has not broken anything before. It will never say
>>>>>> two intrinsics with vectors of pointers mod/ref each other.
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org>
>>>>>> wrote:
>>>>>>
>>>>>>> + Daniel Berlin
>>>>>>>
>>>>>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev
>>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>>> > Hello everyone,
>>>>>>> >
>>>>>>> > I think I have found an gvn / alias analysis related bug, but
>>>>>>> before opening
>>>>>>> > an issue on the tracker I wanted to see if I am missing something.
>>>>>>> I have
>>>>>>> > the following testcase:
>>>>>>> >
>>>>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2,
>>>>>>> i32* %out)
>>>>>>> >> {
>>>>>>> >> entry:
>>>>>>> >>   ; Just some temporary storage
>>>>>>> >>   %tmp.0 = alloca i32
>>>>>>> >>   %tmp.1 = alloca i32
>>>>>>> >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>>>>> >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>>>>> >>   ; Read from in1 and in2
>>>>>>> >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %in1, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %in2, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   ; Store in1 to the allocas
>>>>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x
>>>>>>> i32*> %tmp,
>>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>>>>> >>   ; Read in1 from the allocas
>>>>>>> >>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %tmp, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   ; Store in2 to the allocas
>>>>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x
>>>>>>> i32*> %tmp,
>>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>>>>> >>   ; Read in2 from the allocas
>>>>>>> >>   %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %tmp, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   ; Store in2 to out for good measure
>>>>>>> >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0
>>>>>>> >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1
>>>>>>> >>   store i32 %tmp.v.1.0, i32* %out
>>>>>>> >>   %out.1 = getelementptr i32, i32* %out, i32 1
>>>>>>> >>   store i32 %tmp.v.1.1, i32* %out.1
>>>>>>> >>   ret void
>>>>>>> >> }
>>>>>>> >
>>>>>>> >
>>>>>>> > It uses a masked scatter operation to store a value to the two
>>>>>>> allocas and
>>>>>>> > then uses a masked gather operation to read that same value. This
>>>>>>> is done
>>>>>>> > twice in a row, with two different values. If I run this code
>>>>>>> through the
>>>>>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the
>>>>>>> same as the
>>>>>>> > first gather (%tmp.v.0) and it will be removed. After some
>>>>>>> debugging, I
>>>>>>> > realized that this is happening because the Memory Dependence
>>>>>>> Analysis
>>>>>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though
>>>>>>> the scatter
>>>>>>> > call in between changes the value stored at %tmp. This, in turn, is
>>>>>>> > happening because the alias analysis is returning NoModRef for the
>>>>>>> %tmp.v.1
>>>>>>> > and second scatter callsites. The resulting IR produces the wrong
>>>>>>> result:
>>>>>>> >
>>>>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2,
>>>>>>> i32* %out)
>>>>>>> >> {
>>>>>>> >> entry:
>>>>>>> >>   %tmp.0 = alloca i32
>>>>>>> >>   %tmp.1 = alloca i32
>>>>>>> >>   %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>>>>> >>   %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>>>>> >>   %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %in1, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %in2, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x
>>>>>>> i32*> %tmp,
>>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>>>>> >>   %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>>> %tmp, i32
>>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>>> >>   call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x
>>>>>>> i32*> %tmp,
>>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>>>>> >>   ; The call to masked.gather is gone and now we are using the
>>>>>>> old value
>>>>>>> >> (%tmp.v.0)
>>>>>>> >>   %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0
>>>>>>> >>   %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1
>>>>>>> >>   store i32 %tmp.v.1.0, i32* %out
>>>>>>> >>   %out.1 = getelementptr i32, i32* %out, i32 1
>>>>>>> >>   store i32 %tmp.v.1.1, i32* %out.1
>>>>>>> >>   ret void
>>>>>>> >> }
>>>>>>> >
>>>>>>> >
>>>>>>> > The old value read from %tmp is used, instead of the new one. I
>>>>>>> tested this
>>>>>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4)
>>>>>>> with the
>>>>>>> > same result.
>>>>>>> >
>>>>>>> > I should mention that if I replace the second scatter with stores,
>>>>>>> the issue
>>>>>>> > persists. The only way to avoid it is to replace all scatter/gather
>>>>>>> > intrinsics with equivalent sets of store/load, in which case the
>>>>>>> MemDep
>>>>>>> > returns the correct dependencies and the GVN pass will not remove
>>>>>>> the second
>>>>>>> > set of loads.
>>>>>>> >
>>>>>>> > So, my question is, is this a bug or am I doing something that I
>>>>>>> shouldn't
>>>>>>> > be in the IR? And if it is a bug, is it the AA analyses that
>>>>>>> return the
>>>>>>> > wrong result (I presume so) or should GVN handle such cases
>>>>>>> differently?
>>>>>>> >
>>>>>>> > Best regards,
>>>>>>> >
>>>>>>> > Chris
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > LLVM Developers mailing list
>>>>>>> > llvm-dev at lists.llvm.org
>>>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Davide
>>>>>>>
>>>>>>> "There are no solved problems; there are only problems that are more
>>>>>>> or less solved" -- Henri Poincare
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>
>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160831/e40f5814/attachment.html>


More information about the llvm-dev mailing list