[LLVMdev] alloca scalarization with dynamic indexing into vectors

Thu Feb 7 02:39:56 PST 2013

Hi Scott, this looks like an SROA bug to me. Please open a bug report.

Ciao, Duncan.

On 07/02/13 03:26, Scott Pillow wrote:
> Hi all,
>
> I have a question about dynamic indexing into a vector with GEP.  In the LLVM
> 3.2 release, SROA::isSafeGEP() in the ScalarReplAggregates pass now allows
> alloca scalarization when a GEP index into a vector isn’t a constant.  My
> question is: what is the expected behavior when the index is out of bounds of
> the vector?  Is it undefined?  I have an example .ll in which an alloca can
> potentially be scalarized, and the index into the vector is a function argument
> that could be set to any value.
>
> (scalar_repl_store_delete.ll):
>
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"
>
> define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
> entry:
>    %sPrivateStorage = alloca [3 x <2 x i32>], align 8
>    %0 = load <2 x i32>* %src, align 8, !tbaa !9
>    %arrayidx1 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0
>    store <2 x i32> %0, <2 x i32>* %arrayidx1, align 8, !tbaa !9
>    %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
>    %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !9
>    %arrayidx3 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 1
>    store <2 x i32> %1, <2 x i32>* %arrayidx3, align 8, !tbaa !9
>    %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
>    %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !9
>    %arrayidx5 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 2
>    store <2 x i32> %2, <2 x i32>* %arrayidx5, align 8, !tbaa !9
>    %idx.ext = zext i32 %alignmentOffsets to i64
>    %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext
>    %3 = load i32* %add.ptr, align 4, !tbaa !11
>    %4 = insertelement <2 x i32> undef, i32 %3, i32 0
>    %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
>    store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !9
>    ret void
> }
>
> !9 = metadata !{metadata !"omnipotent char", metadata !10}
> !10 = metadata !{metadata !"Simple C/C++ TBAA", null}
> !11 = metadata !{metadata !"int", metadata !9}
>
> In this example, the three stores copy the data from %src into
> %sPrivateStorage, and the GEP of interest is:
>
>    %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext
>
> After running the command:
>
> opt.exe -scalarrepl scalar_repl_store_delete.ll -o=scalar_repl_store_delete_after.bc
>
> We get:
>
> (scalar_repl_store_delete_after.ll):
>
> ; ModuleID = 'scalar_repl_store_delete_after.bc'
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"
>
> define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
> entry:
>    %sPrivateStorage.0 = alloca <2 x i32>, align 8
>    %0 = load <2 x i32>* %src, align 8, !tbaa !0
>    store <2 x i32> %0, <2 x i32>* %sPrivateStorage.0, align 8, !tbaa !0
>    %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
>    %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !0
>    %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
>    %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !0
>    %idx.ext = zext i32 %alignmentOffsets to i64
>    %add.ptr = getelementptr inbounds <2 x i32>* %sPrivateStorage.0, i32 0, i64 %idx.ext
>    %3 = load i32* %add.ptr, align 4, !tbaa !2
>    %4 = insertelement <2 x i32> undef, i32 %3, i32 0
>    %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
>    store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !0
>    ret void
> }
>
> !0 = metadata !{metadata !"omnipotent char", metadata !1}
> !1 = metadata !{metadata !"Simple C/C++ TBAA", null}
> !2 = metadata !{metadata !"int", metadata !0}
>
> The last two stores are deleted because they appear to be dead, even though
> the data they write can still be reached through an out-of-bounds vector index
> in the GEP.  What is the expected behavior in this case?
>
> Thanks,
>
> Scott
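
The effect described in the quoted message can be illustrated with a small
Python toy model. This is only a sketch under a loud assumption: the alloca is
treated as six contiguous i32 slots, which is a simplification and not a
statement of LLVM's actual semantics for out-of-bounds vector indices. The
names original_load, scalarized_load, and UNINIT are hypothetical and do not
come from LLVM.

```python
# Toy model (an assumption for illustration, not LLVM's real semantics):
# the alloca [3 x <2 x i32>] is treated as six contiguous i32 slots.

UNINIT = None  # marks a slot that no store has written


def original_load(src, idx):
    """Model of the IR before -scalarrepl: all three vector stores happen."""
    storage = [UNINIT] * 6      # %sPrivateStorage as six i32 slots
    storage[0:2] = src[0]       # store to array element 0
    storage[2:4] = src[1]       # store to array element 1
    storage[4:6] = src[2]       # store to array element 2
    return storage[idx]         # GEP 0, 0, %idx.ext followed by load i32


def scalarized_load(src, idx):
    """Model of the IR after -scalarrepl: only element 0 is kept."""
    storage0 = [UNINIT] * 2     # %sPrivateStorage.0 = alloca <2 x i32>
    storage0[0:2] = src[0]      # the only store that survives
    if idx >= len(storage0):
        return UNINIT           # the index points past the 2-element alloca
    return storage0[idx]


src = [(10, 11), (20, 21), (30, 31)]
for idx in range(6):
    print(idx, original_load(src, idx), scalarized_load(src, idx))
```

In this model the two versions agree for indices 0 and 1 and diverge for
indices 2 through 5, which is where the deleted stores would have been
observed.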