[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Daniel Berlin via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 30 13:26:25 PDT 2016
Yeah, i just hope it doesn't regress scatter/gather vector code badly.
But at least it's correct now?
On Tue, Aug 30, 2016 at 1:11 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ------------------------------
>
> *From: *"Daniel Berlin" <dberlin at dberlin.org>
> *To: *"Philip Reames" <listmail at philipreames.com>, "Davide Italiano" <
> davide at freebsd.org>, "Chandler Carruth" <chandlerc at gmail.com>
> *Cc: *"Chris Sakalis" <chrissakalis at gmail.com>, "David Majnemer" <
> david.majnemer at gmail.com>, "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <
> llvm-dev at lists.llvm.org>
> *Sent: *Tuesday, August 30, 2016 3:07:01 PM
> *Subject: *Re: [llvm-dev] GVN / Alias Analysis issue with
> llvm.masked.scatter/gather intrinsics
>
> This is now committed and a test added to GVN.
>
> Thanks!
>
> I suspect that, in practice, we'll get little benefit from handling this
> until our AA passes learn how to deal with (i.e. look back through) pointer
> vectors.
>
> -Hal
>
>
>
> On Mon, Aug 29, 2016 at 1:59 PM, Daniel Berlin <dberlin at dberlin.org>
> wrote:
>
>> Okay, so then it sounds like, for now, the right fix is to stop marking
>> masked.gather and masked.scatter with intrarg* options.
>>
>> On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com>
>> wrote:
>>
>>> We might have specification bug here, but we appear to implement what we
>>> specified. argmemonly is specified as only considering pointer typed
>>> arguments. It's behavior on vector-of-pointers is unspecified, but would
>>> seem to fall into the same case as inttoptr of an integer argument (i.e.
>>> explicitly undefined). We could consider changing that.
>>>
>>>
>>>
>>> On 08/29/2016 11:59 AM, Daniel Berlin wrote:
>>>
>>> + a few others.
>>>
>>> After following this rabbit hole a bit, there are a lot of mutually
>>> recursive calls, etc, that may or may not do the right thing with vectors
>>> of pointers.
>>> I can fix *this* particular bug with the attached patch.
>>>
>>> However, it's mostly papering over stuff. Nothing seems to know what to
>>> do with a memorylocation that is a vector of pointers. They all expect
>>> memorylocation to be a single pointer location.
>>>
>>> I would chalk it up to "luck" that this patch fixes the bug.
>>>
>>> It's pretty clear that MemoryLocation doesn't fit the needs of a lot of
>>> stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to
>>> figure out the invariantess of the locations, blah blah blah), but it seems
>>> like a big job to figure out what to replace it with that will work for
>>> these cases.
>>>
>>> (I'm pretty positive if we just make it MemoryLocations, and have
>>> everything loop over the locations, the compiler will get a lot larger and
>>> a lot slower)
>>>
>>> On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org>
>>> wrote:
>>>
>>>> it would also, for that matter, say the same about an array of
>>>> pointers.
>>>>
>>>> It's not clear to me what will break if you change this to
>>>> isPtrOrPtrVectorTy.
>>>>
>>>> In fact, i know it doesn't fix this bug.
>>>> It's a pretty deep rabbit hole of things not quite prepared to
>>>> understand vectors of pointers.
>>>>
>>>> (we prepare memorylocations of them, but memory locations expect to be
>>>> one thing, not a group of things, etc).
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org>
>>>> wrote:
>>>>
>>>>> this is definitely a bug in AA.
>>>>>
>>>>> 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E;
>>>>> ++I) {
>>>>> 226 const Value *Arg = *I;
>>>>> 227 if (!Arg->getType()->isPointerTy())
>>>>> -> 228 continue;
>>>>> 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);
>>>>> 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2,
>>>>> CS2ArgIdx, TLI);
>>>>>
>>>>> AliasAnalysis.cpp:228
>>>>>
>>>>> It ignores every argument because they are vectors of pointers, not
>>>>> pointers.
>>>>>
>>>>>
>>>>> I'm surprised this has not broken anything before. It will never say
>>>>> two intrinsics with vectors of pointers mod/ref each other.
>>>>>
>>>>>
>>>>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org>
>>>>> wrote:
>>>>>
>>>>>> + Daniel Berlin
>>>>>>
>>>>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> > Hello everyone,
>>>>>> >
>>>>>> > I think I have found an gvn / alias analysis related bug, but
>>>>>> before opening
>>>>>> > an issue on the tracker I wanted to see if I am missing something.
>>>>>> I have
>>>>>> > the following testcase:
>>>>>> >
>>>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2,
>>>>>> i32* %out)
>>>>>> >> {
>>>>>> >> entry:
>>>>>> >> ; Just some temporary storage
>>>>>> >> %tmp.0 = alloca i32
>>>>>> >> %tmp.1 = alloca i32
>>>>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>>>> >> ; Read from in1 and in2
>>>>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %in1, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %in2, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> ; Store in1 to the allocas
>>>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x
>>>>>> i32*> %tmp,
>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>>>> >> ; Read in1 from the allocas
>>>>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %tmp, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> ; Store in2 to the allocas
>>>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x
>>>>>> i32*> %tmp,
>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>);
>>>>>> >> ; Read in2 from the allocas
>>>>>> >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %tmp, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> ; Store in2 to out for good measure
>>>>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0
>>>>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1
>>>>>> >> store i32 %tmp.v.1.0, i32* %out
>>>>>> >> %out.1 = getelementptr i32, i32* %out, i32 1
>>>>>> >> store i32 %tmp.v.1.1, i32* %out.1
>>>>>> >> ret void
>>>>>> >> }
>>>>>> >
>>>>>> >
>>>>>> > It uses a masked scatter operation to store a value to the two
>>>>>> allocas and
>>>>>> > then uses a masked gather operation to read that same value. This
>>>>>> is done
>>>>>> > twice in a row, with two different values. If I run this code
>>>>>> through the
>>>>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the
>>>>>> same as the
>>>>>> > first gather (%tmp.v.0) and it will be removed. After some
>>>>>> debugging, I
>>>>>> > realized that this is happening because the Memory Dependence
>>>>>> Analysis
>>>>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though
>>>>>> the scatter
>>>>>> > call in between changes the value stored at %tmp. This, in turn, is
>>>>>> > happening because the alias analysis is returning NoModRef for the
>>>>>> %tmp.v.1
>>>>>> > and second scatter callsites. The resulting IR produces the wrong
>>>>>> result:
>>>>>> >
>>>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2,
>>>>>> i32* %out)
>>>>>> >> {
>>>>>> >> entry:
>>>>>> >> %tmp.0 = alloca i32
>>>>>> >> %tmp.1 = alloca i32
>>>>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
>>>>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
>>>>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %in1, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %in2, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x
>>>>>> i32*> %tmp,
>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>
>>>>>> %tmp, i32
>>>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
>>>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x
>>>>>> i32*> %tmp,
>>>>>> >> i32 1, <2 x i1> <i1 true, i1 true>)
>>>>>> >> ; The call to masked.gather is gone and now we are using the old
>>>>>> value
>>>>>> >> (%tmp.v.0)
>>>>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0
>>>>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1
>>>>>> >> store i32 %tmp.v.1.0, i32* %out
>>>>>> >> %out.1 = getelementptr i32, i32* %out, i32 1
>>>>>> >> store i32 %tmp.v.1.1, i32* %out.1
>>>>>> >> ret void
>>>>>> >> }
>>>>>> >
>>>>>> >
>>>>>> > The old value read from %tmp is used, instead of the new one. I
>>>>>> tested this
>>>>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4)
>>>>>> with the
>>>>>> > same result.
>>>>>> >
>>>>>> > I should mention that if I replace the second scatter with stores,
>>>>>> the issue
>>>>>> > persists. The only way to avoid it is to replace all scatter/gather
>>>>>> > intrinsics with equivalent sets of store/load, in which case the
>>>>>> MemDep
>>>>>> > returns the correct dependencies and the GVN pass will not remove
>>>>>> the second
>>>>>> > set of loads.
>>>>>> >
>>>>>> > So, my question is, is this a bug or am I doing something that I
>>>>>> shouldn't
>>>>>> > be in the IR? And if it is a bug, is it the AA analyses that return
>>>>>> the
>>>>>> > wrong result (I presume so) or should GVN handle such cases
>>>>>> differently?
>>>>>> >
>>>>>> > Best regards,
>>>>>> >
>>>>>> > Chris
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > LLVM Developers mailing list
>>>>>> > llvm-dev at lists.llvm.org
>>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Davide
>>>>>>
>>>>>> "There are no solved problems; there are only problems that are more
>>>>>> or less solved" -- Henri Poincare
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160830/c4b87893/attachment.html>
More information about the llvm-dev
mailing list