[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Chris Sakalis via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 29 06:42:20 PDT 2016
Hello everyone,
I think I have found a GVN / alias analysis related bug, but before
opening an issue on the tracker I wanted to check whether I am missing something.
I have the following testcase:
define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
entry:
  ; Just some temporary storage
  %tmp.0 = alloca i32
  %tmp.1 = alloca i32
  %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
  %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
  ; Read from in1 and in2
  %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  ; Store in1 to the allocas
  call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>)
  ; Read in1 from the allocas
  %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  ; Store in2 to the allocas
  call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>)
  ; Read in2 from the allocas
  %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  ; Store in2 to out for good measure
  %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0
  %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1
  store i32 %tmp.v.1.0, i32* %out
  %out.1 = getelementptr i32, i32* %out, i32 1
  store i32 %tmp.v.1.1, i32* %out.1
  ret void
}
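A note for anyone reproducing this: the test case above leaves out the intrinsic declarations. A sketch of what they would look like, assuming the LLVM 3.8 signatures (value or pointer vector, i32 alignment, mask, plus a pass-through vector for the gather). The #1 attribute group is not defined anywhere in this post, so it is not reproduced here; it would have to be defined, or the #1 references dropped, before the file parses.

```llvm
; Assumed LLVM 3.8 signatures; newer releases mangle and type these differently.
declare <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*>, i32, <2 x i1>, <2 x i32>)
declare void @llvm.masked.scatter.v2i32(<2 x i32>, <2 x i32*>, i32, <2 x i1>)
```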
It uses a masked scatter operation to store a value to the two allocas and
then uses a masked gather operation to read that same value. This is done
twice in a row, with two different values. If I run this code through the
GVN pass, the second gather (%tmp.v.1) will be deemed to be the same as the
first gather (%tmp.v.0) and it will be removed. After some debugging, I
realized that this is happening because the Memory Dependence Analysis
returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the
scatter call in between changes the values stored through %tmp. This, in turn, is
happening because alias analysis returns NoModRef when queried about the
%tmp.v.1 gather and the second scatter call site. The resulting IR produces the wrong result:
define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
entry:
  %tmp.0 = alloca i32
  %tmp.1 = alloca i32
  %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0
  %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1
  %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>)
  %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1
  call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> %tmp, i32 1, <2 x i1> <i1 true, i1 true>)
  ; The call to masked.gather is gone and now we are using the old value (%tmp.v.0)
  %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0
  %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1
  store i32 %tmp.v.1.0, i32* %out
  %out.1 = getelementptr i32, i32* %out, i32 1
  store i32 %tmp.v.1.1, i32* %out.1
  ret void
}
The old value read from %tmp is used, instead of the new one. I tested this
code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) with the
same result.
I should mention that if I replace only the second scatter with stores, the
issue persists. The only way to avoid it is to replace all scatter/gather
intrinsics with equivalent sets of stores/loads, in which case MemDep
returns the correct dependencies and GVN does not remove the
second set of loads.
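For reference, the fully scalarised variant could look roughly like the sketch below. This is a reconstruction rather than the exact file that was tested: the function name @test_scalar and the per-lane value names are made up for illustration, and it assumes the same typed-pointer IR syntax as the test case above. Each gather becomes two extractelement/load pairs and each scatter becomes two stores.

```llvm
; Hypothetical scalarised version of @test (names are illustrative only).
define spir_kernel void @test_scalar(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
entry:
  %tmp.0 = alloca i32
  %tmp.1 = alloca i32
  ; Scalarised gathers from in1 and in2
  %in1.p0 = extractelement <2 x i32*> %in1, i32 0
  %in1.p1 = extractelement <2 x i32*> %in1, i32 1
  %in2.p0 = extractelement <2 x i32*> %in2, i32 0
  %in2.p1 = extractelement <2 x i32*> %in2, i32 1
  %in1.0 = load i32, i32* %in1.p0, align 1
  %in1.1 = load i32, i32* %in1.p1, align 1
  %in2.0 = load i32, i32* %in2.p0, align 1
  %in2.1 = load i32, i32* %in2.p1, align 1
  ; First scatter: store in1 to the allocas
  store i32 %in1.0, i32* %tmp.0, align 1
  store i32 %in1.1, i32* %tmp.1, align 1
  ; First gather: read in1 back (unused, as in the original test)
  %tmp.v.0.0 = load i32, i32* %tmp.0, align 1
  %tmp.v.0.1 = load i32, i32* %tmp.1, align 1
  ; Second scatter: store in2 to the allocas
  store i32 %in2.0, i32* %tmp.0, align 1
  store i32 %in2.1, i32* %tmp.1, align 1
  ; Second gather: these loads must observe the in2 values
  %tmp.v.1.0 = load i32, i32* %tmp.0, align 1
  %tmp.v.1.1 = load i32, i32* %tmp.1, align 1
  store i32 %tmp.v.1.0, i32* %out
  %out.1 = getelementptr i32, i32* %out, i32 1
  store i32 %tmp.v.1.1, i32* %out.1
  ret void
}
```

With plain loads and stores, the second pair of stores is visible to MemDep as a clobber of %tmp.0 and %tmp.1, so the final loads cannot be replaced by the first pair of loads.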
So, my question is: is this a bug, or am I doing something in the IR that I
shouldn't be? And if it is a bug, are the alias analyses returning the
wrong result (I presume so), or should GVN handle such cases differently?
Best regards,
Chris