[LLVMdev] Crash in SLP for vector data type as function argument.
Shahid, Asghar-ahmad
Asghar-ahmad.Shahid at amd.com
Wed Jan 7 08:55:21 PST 2015
Hi Suyog,
Since CanReuseExtract(E->Scalars) already checks whether operand zero of the
"extractelement" can be reused, the following code in emitReduction may help
resolve this issue.
emitReduction(...) {
  ...
  Instruction *ValToReduce = dyn_cast<Instruction>(VectorizedValue);
  ================================================
  // Fall back to the vectorized value itself when it is a vector but not
  // an instruction (e.g. a function argument).
  Value *TmpVec = ValToReduce;
  if (!ValToReduce && VectorizedValue->getType()->isVectorTy())
    TmpVec = VectorizedValue;
  =================================================
  ....
}
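
To make the intent of the snippet easier to check outside of LLVM, here is a minimal, self-contained sketch of the same selection logic. It is only a toy model: the Value, Instruction and Argument structs and the pickValueToReduce() helper are hypothetical stand-ins written for this illustration, not the real LLVM classes, and dynamic_cast is used in place of dyn_cast.

```cpp
#include <cassert>

// Toy stand-ins for LLVM's Value hierarchy (assumption: NOT the real classes).
struct Value {
  bool IsVector;  // models getType()->isVectorTy()
  explicit Value(bool IsVector) : IsVector(IsVector) {}
  virtual ~Value() = default;
};
struct Instruction : Value { using Value::Value; };  // e.g. the load %0
struct Argument : Value { using Value::Value; };     // e.g. the argument %a

// Hypothetical helper mirroring the suggested selection: prefer the
// instruction; otherwise accept a non-instruction value only if it has
// vector type. Returns nullptr when neither condition holds.
const Value *pickValueToReduce(const Value *VectorizedValue) {
  const Instruction *ValToReduce =
      dynamic_cast<const Instruction *>(VectorizedValue);
  if (ValToReduce)
    return ValToReduce;
  if (VectorizedValue->IsVector)
    return VectorizedValue;
  return nullptr;
}
```

In this model a vector-typed Argument is returned unchanged, whereas a plain cast to Instruction would have produced a null pointer. A caller still has to handle the nullptr result for a value that is neither an instruction nor a vector.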
Regards,
Shahid
> -----Original Message-----
> From: Suyog Kamal Sarda [mailto:suyog.sarda at samsung.com]
> Sent: Wednesday, January 07, 2015 5:05 PM
> To: Shahid, Asghar-ahmad; nrotem at apple.com; aschwaighofer at apple.com;
> mzolotukhin at apple.com; james.molloy at arm.com
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: RE: [LLVMdev] Crash in SLP for vector data type as function
> argument.
>
> Hi Shahid,
>
> Thanks for the reply.
>
> Actually, yes, emitReduction() takes the vectorized value, which is the leaf
> of the tree.
> I got confused by the name of the argument passed when calling
> emitReduction():
>
> Value *ReducedSubTree = emitReduction(VectorizedRoot, Builder);
>
> Anyway, that should hardly matter.
>
> The test case I mentioned is:
>
> int foo(uint32x4_t a) {
> return a[0] + a[1] + a[2] + a[3];
> }
>
> LLVM IR :
>
> define i32 @hadd(<4 x i32> %a) {
> entry:
> %vecext = extractelement <4 x i32> %a, i32 0
> %vecext1 = extractelement <4 x i32> %a, i32 1
> %add = add i32 %vecext, %vecext1
> %vecext2 = extractelement <4 x i32> %a, i32 2
> %add3 = add i32 %add, %vecext2
> %vecext4 = extractelement <4 x i32> %a, i32 3
> %add5 = add i32 %add3, %vecext4
> ret i32 %add5
> }
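
For readers less familiar with horizontal reductions, the sketch below shows in plain C++ what the scalar add chain in this IR computes, next to one common pairwise formulation that maps naturally onto vector operations. It is an illustration under assumptions only: the function names are hypothetical, and it does not claim to reproduce the exact instruction sequence that emitReduction() generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar chain, as in the IR: ((a0 + a1) + a2) + a3.
std::uint32_t reduceSequential(const std::array<std::uint32_t, 4> &A) {
  return A[0] + A[1] + A[2] + A[3];
}

// Pairwise form: add the upper half onto the lower half, then repeat.
// 4 lanes -> 2 lanes -> 1 lane, i.e. two halving steps for 4 elements.
std::uint32_t reducePairwise(std::array<std::uint32_t, 4> A) {
  for (std::size_t Width = A.size() / 2; Width >= 1; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      A[I] += A[I + Width];
  return A[0];
}
```

Because unsigned integer addition is associative and commutative (including on wraparound), both functions return the same value for any input.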
>
> Now, when the leaf %vecext is reached, the vectorizeTree() call sets
> VectorizedValue to the 0th operand of the extractelement instruction.
>
> case Instruction::ExtractElement: {
>   if (CanReuseExtract(E->Scalars)) {
>     Value *V = VL0->getOperand(0);
>     E->VectorizedValue = V;
>     return V;
>   }
>   return Gather(E->Scalars, VecTy);
> }
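
The body of CanReuseExtract() is not shown in this thread. As a rough sketch of the kind of check its use here implies, the toy model below accepts a bundle of extracts only when it reads lanes 0..N-1, in order, from a single N-wide vector, in which case the source vector can stand in for the whole bundle. The Extract struct and canReuseSource() are hypothetical names introduced for this illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy description of one extractelement: which vector it reads and at
// which constant lane. Hypothetical type, not an LLVM class.
struct Extract {
  const void *SourceVector;
  std::size_t Lane;
};

// True when the bundle reads lanes 0..N-1, in order, from one N-wide
// vector, so the source vector itself can replace the bundle.
bool canReuseSource(const std::vector<Extract> &Bundle,
                    std::size_t SourceWidth) {
  if (Bundle.empty() || Bundle.size() != SourceWidth)
    return false;
  for (std::size_t I = 0; I < Bundle.size(); ++I)
    if (Bundle[I].SourceVector != Bundle[0].SourceVector ||
        Bundle[I].Lane != I)
      return false;
  return true;
}
```

Note that nothing in such a check requires the source vector to be an instruction, which is consistent with %a (a function argument) reaching emitReduction() in the first test case.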
>
> Now in emitReduction(), the VectorizedValue is dyn_cast to Instruction.
> In the above IR, %a is a function argument, not an instruction, so the cast
> returns null and the crash occurs when that null value is dereferenced.
>
> Instruction *ValToReduce = dyn_cast<Instruction>(VectorizedValue);
>
> Note: the above test case won't crash with the current svn version, since the
> code for parsing the tree for the above IR is yet to be included in svn. The
> initial patch was submitted at http://reviews.llvm.org/D6818.
> I am working on refining it; however, my parsing patch does not disturb the
> code flow described above at all.
> You can try to reproduce the problem by applying that patch to your local code.
>
> When the vector 'a' is in global scope, a 'load' instruction is
> generated in the basic block of the function:
>
> Test case 2:
>
> uint32x4_t a;
> int foo() {
> return a[0] + a[1] + a[2] + a[3];
> }
>
> IR for above test case :
>
> @a = common global <4 x i32> zeroinitializer, align 16
>
> define i32 @hadd() #0 {
> entry:
> %0 = load <4 x i32>* @a, align 16, !tbaa !1
> %vecext = extractelement <4 x i32> %0, i32 0
> %vecext1 = extractelement <4 x i32> %0, i32 1
> %add = add i32 %vecext, %vecext1
> %vecext2 = extractelement <4 x i32> %0, i32 2
> %add3 = add i32 %add, %vecext2
> %vecext4 = extractelement <4 x i32> %0, i32 3
> %add5 = add i32 %add3, %vecext4
> ret i32 %add5
> }
>
> Here, since the 0th operand of the leaf %vecext is a load instruction, the
> dyn_cast to Instruction succeeds and the reduction is
> emitted properly.
>
> How can we solve this problem? Which type should a function
> argument be cast to?
>
> Regards,
> Suyog
>
>
>
> ------- Original Message -------
> Sender : Shahid, Asghar-ahmad<Asghar-ahmad.Shahid at amd.com>
> Date : Jan 07, 2015 20:05 (GMT+09:00)
> Title : RE: [LLVMdev] Crash in SLP for vector data type as function argument.
>
> Hi Suyog,
>
> IMO emitReduction() takes a vectorized value, which is the leaves of the
> matched pattern/tree.
> So what you are thinking of as the root is actually the leaf of the tree.
> The root should actually be the value that is fed to the "return"
> statement.
>
> It would be of great help if you could share the sample test.
>
> Regards,
> Shahid
>
> > -----Original Message-----
> > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> > On Behalf Of Suyog Kamal Sarda
> > Sent: Monday, January 05, 2015 5:40 PM
> > To: nrotem at apple.com; aschwaighofer at apple.com; mzolotukhin at apple.com;
> > james.molloy at arm.com
> > Cc: llvmdev at cs.uiuc.edu
> > Subject: [LLVMdev] Crash in SLP for vector data type as function argument.
> >
> > Hi all,
> >
> > Came across a crash in SLP vectorization while testing the following code
> > for AArch64:
> >
> > int foo(uint32x4_t a) {
> > return a[0] + a[1] + a[2] + a[3];
> > }
> >
> > The LLVM IR for above code will be:
> >
> > define i32 @hadd(<4 x i32> %a) {
> > entry:
> > %vecext = extractelement <4 x i32> %a, i32 0
> > %vecext1 = extractelement <4 x i32> %a, i32 1
> > %add = add i32 %vecext, %vecext1
> > %vecext2 = extractelement <4 x i32> %a, i32 2
> > %add3 = add i32 %add, %vecext2
> > %vecext4 = extractelement <4 x i32> %a, i32 3
> > %add5 = add i32 %add3, %vecext4
> > ret i32 %add5
> > }
> >
> > I try to recognize this pattern and vectorize it using the
> > existing code for horizontal reductions (I just recognize the pattern
> > and fill in the data; the rest is done by the already existing code).
> > I do the pattern matching very badly though, but that's a different story.
> >
> >
> > Please note that whatever follows uses the existing code; I haven't
> > modified any of it.
> >
> > Now, once the pattern is recognized, we call "tryToReduce()", where we
> > try to vectorize the tree by calling "vectorizeTree()", which returns the
> > root of the vectorized tree. Then we emit the reduction by calling
> > "emitReduction()", which takes the root of the vector tree as its argument.
> > Inside "emitReduction()", we cast the root of the tree to an instruction.
> >
> > Now, for the above case, while setting the root of the vectorized tree,
> > an extractelement instruction is encountered, and its 0th operand is set
> > as the root of the tree, which in the above case is "%a". However, this is
> > not an instruction, so when we cast it to an instruction in
> > "emitReduction()", the cast returns nullptr, which causes a crash later
> > when it is dereferenced.
> >
> > Take a second case where the vector data type is in global scope.
> >
> > uint32x4_t a;
> > int foo() {
> > return a[0] + a[1] + a[2] + a[3];
> > }
> >
> > The IR for above code is:
> >
> > @a = common global <4 x i32> zeroinitializer, align 16
> >
> > define i32 @hadd() #0 {
> > entry:
> > %0 = load <4 x i32>* @a, align 16, !tbaa !1
> > %vecext = extractelement <4 x i32> %0, i32 0
> > %vecext1 = extractelement <4 x i32> %0, i32 1
> > %add = add i32 %vecext, %vecext1
> > %vecext2 = extractelement <4 x i32> %0, i32 2
> > %add3 = add i32 %add, %vecext2
> > %vecext4 = extractelement <4 x i32> %0, i32 3
> > %add5 = add i32 %add3, %vecext4
> > ret i32 %add5
> > }
> >
> > Now in the above case, the 0th operand of the extractelement, %0, is a load
> > instruction, and hence the cast to an instruction does not fail and
> > the rest runs smoothly.
> >
> > Can someone please suggest how to resolve this? Is there something I
> > am missing, or is it a basic problem with the IR itself?
> >
> > Regards,
> > Suyog
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev