[LLVMdev] alloca scalarization with dynamic indexing into vectors
Duncan Sands
baldrick at free.fr
Thu Feb 7 02:39:56 PST 2013
Hi Scott, this seems like an SROA bug to me; please open a bug report.
Ciao, Duncan.
On 07/02/13 03:26, Scott Pillow wrote:
> Hi all,
>
> I have a question regarding dynamic indexing into a vector with GEP. In the
> ScalarReplAggregates pass in the LLVM 3.2 release, SROA::isSafeGEP() now
> allows alloca scalarization when a GEP index into a vector isn't a constant.
> My question is: what is the expected behavior when the index is out of bounds
> of the vector? Is it undefined? I have an example .ll file with an alloca
> that can potentially be scalarized, where the index into the vector is a
> function argument and could be set to any value.
>
> (scalar_repl_store_delete.ll):
>
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"
>
> define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
> entry:
>   %sPrivateStorage = alloca [3 x <2 x i32>], align 8
>   %0 = load <2 x i32>* %src, align 8, !tbaa !9
>   %arrayidx1 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0
>   store <2 x i32> %0, <2 x i32>* %arrayidx1, align 8, !tbaa !9
>   %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
>   %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !9
>   %arrayidx3 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 1
>   store <2 x i32> %1, <2 x i32>* %arrayidx3, align 8, !tbaa !9
>   %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
>   %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !9
>   %arrayidx5 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 2
>   store <2 x i32> %2, <2 x i32>* %arrayidx5, align 8, !tbaa !9
>   %idx.ext = zext i32 %alignmentOffsets to i64
>   %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext
>   %3 = load i32* %add.ptr, align 4, !tbaa !11
>   %4 = insertelement <2 x i32> undef, i32 %3, i32 0
>   %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
>   store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !9
>   ret void
> }
>
> !9 = metadata !{metadata !"omnipotent char", metadata !10}
> !10 = metadata !{metadata !"Simple C/C++ TBAA", null}
> !11 = metadata !{metadata !"int", metadata !9}
>
> In this example, the stores copy the data from %src into %sPrivateStorage.
> The GEP of interest is:
>
>   %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext
>
> After running the command:
>
> opt.exe -scalarrepl scalar_repl_store_delete.ll -o=scalar_repl_store_delete_after.bc
>
> We get:
>
> (scalar_repl_store_delete_after.ll):
>
> ; ModuleID = 'scalar_repl_store_delete_after.bc'
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"
>
> define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
> entry:
>   %sPrivateStorage.0 = alloca <2 x i32>, align 8
>   %0 = load <2 x i32>* %src, align 8, !tbaa !0
>   store <2 x i32> %0, <2 x i32>* %sPrivateStorage.0, align 8, !tbaa !0
>   %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
>   %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !0
>   %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
>   %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !0
>   %idx.ext = zext i32 %alignmentOffsets to i64
>   %add.ptr = getelementptr inbounds <2 x i32>* %sPrivateStorage.0, i32 0, i64 %idx.ext
>   %3 = load i32* %add.ptr, align 4, !tbaa !2
>   %4 = insertelement <2 x i32> undef, i32 %3, i32 0
>   %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
>   store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !0
>   ret void
> }
>
> !0 = metadata !{metadata !"omnipotent char", metadata !1}
> !1 = metadata !{metadata !"Simple C/C++ TBAA", null}
> !2 = metadata !{metadata !"int", metadata !0}
>
> The second and third stores are deleted because they appear to be dead, even
> though that data can actually be reached by the out-of-bounds vector index in
> the GEP. What is expected in this case?
>
> Thanks,
>
> Scott
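
The difference Scott describes can be sketched outside LLVM. The Python below is a hypothetical model, not LLVM code: it assumes the alloca is a flat run of six i32 slots and that the GEP is plain address arithmetic over those slots. The function names and the use of None for "no defined value" are illustrative choices. Whether an index past the first vector is defined behavior in the IR is exactly the question being asked, so the model shows only what each version of the function can observe.

```python
# Hypothetical model of the two versions of @test_fn.
# Assumption: memory is a flat list of i32 slots and the GEP is address arithmetic.

def load_before_scalarrepl(src, idx):
    """Original function: one [3 x <2 x i32>] alloca, i.e. six i32 slots."""
    storage = [None] * 6         # %sPrivateStorage, initially uninitialized
    storage[0:2] = src[0:2]      # store to array element 0
    storage[2:4] = src[2:4]      # store to array element 1
    storage[4:6] = src[4:6]      # store to array element 2
    return storage[idx]          # load i32 through %add.ptr


def load_after_scalarrepl(src, idx):
    """Transformed function: only %sPrivateStorage.0 (two i32 slots) remains."""
    storage0 = list(src[0:2])    # the single surviving store
    if idx < len(storage0):
        return storage0[idx]
    return None                  # outside the remaining alloca: no defined value


src = [10, 11, 20, 21, 30, 31]   # three <2 x i32> vectors read from %src
for idx in range(6):
    print(idx, load_before_scalarrepl(src, idx), load_after_scalarrepl(src, idx))
```

For indices 0 and 1 the two functions agree. For indices 2 through 5 the first still reads data written by the second and third stores, while the second has nothing left to read, which is the discrepancy reported above.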