[LLVMdev] Crash in SLP for vector data type as function argument.

Shahid, Asghar-ahmad Asghar-ahmad.Shahid at amd.com
Wed Jan 7 08:55:21 PST 2015


Hi Suyog,

Since CanReuseExtract(E->Scalars) already checks whether operand zero of the
"extractelement" instructions can be reused, the code below in emitReduction
may help resolve this issue.
emitReduction(...) {
  ...
  Instruction *ValToReduce = dyn_cast<Instruction>(VectorizedValue);
================================================
  // VectorizedValue may not be an Instruction (e.g. a function argument).
  // If it is still of vector type, use it directly.
  Value *TmpVec;
  bool isVecTy = false;
  if (!ValToReduce) {
    isVecTy = VectorizedValue->getType()->isVectorTy();
  }
  if (!ValToReduce && isVecTy) {
    TmpVec = VectorizedValue;
  } else {
    TmpVec = ValToReduce;
  }
=================================================
  ...
}
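
For illustration only, the same selection logic can be sketched outside LLVM.
The block below is a stand-alone C++ mock, not LLVM code: Value, Instruction,
Argument and pickValueToReduce are hypothetical stand-ins, and dynamic_cast is
used in place of LLVM's dyn_cast. It relies on the fact that a function
argument in LLVM is an Argument, which derives from Value but not from
Instruction.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::Value, llvm::Instruction and llvm::Argument.
// They only model the inheritance relationship; none of this is LLVM's API.
struct Value {
  bool IsVector;
  explicit Value(bool IsVector) : IsVector(IsVector) {}
  virtual ~Value() = default;
  bool isVectorTy() const { return IsVector; }
};

struct Instruction : Value {
  using Value::Value;
};

struct Argument : Value {
  using Value::Value;
};

// Mirrors the selection logic suggested above: prefer the Instruction, and
// fall back to the raw value when it is a non-instruction of vector type.
// Returns nullptr when neither condition holds.
Value *pickValueToReduce(Value *VectorizedValue) {
  // dynamic_cast plays the role of dyn_cast here (assumption of this mock).
  if (auto *I = dynamic_cast<Instruction *>(VectorizedValue))
    return I;
  if (VectorizedValue && VectorizedValue->isVectorTy())
    return VectorizedValue;
  return nullptr;
}
```

With this mock, a vector-typed argument is returned unchanged, while a value
that is neither an instruction nor a vector still yields a null pointer, so a
caller would need to keep a null check for that case.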

Regards,
Shahid

> -----Original Message-----
> From: Suyog Kamal Sarda [mailto:suyog.sarda at samsung.com]
> Sent: Wednesday, January 07, 2015 5:05 PM
> To: Shahid, Asghar-ahmad; nrotem at apple.com; aschwaighofer at apple.com;
> mzolotukhin at apple.com; james.molloy at arm.com
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: RE: [LLVMdev] Crash in SLP for vector data type as function
> argument.
> 
> Hi Shahid,
> 
> Thanks for the reply.
> 
> Actually, yes, emitReduction() takes the VectorizedValue, which is the leaf
> of the tree.
> I got confused by the name of the argument passed when calling
> emitReduction().
> 
> Value *ReducedSubTree = emitReduction(VectorizedRoot, Builder);
> 
> Anyways, that should hardly matter.
> 
> I had mentioned the test case :
> 
> int foo(uint32x4_t a) {
>   return a[0] + a[1] + a[2] + a[3];
> }
> 
> LLVM IR :
> 
> define i32 @hadd(<4 x i32> %a) {
>  entry:
>    %vecext = extractelement <4 x i32> %a, i32 0
>    %vecext1 = extractelement <4 x i32> %a, i32 1
>    %add = add i32 %vecext, %vecext1
>    %vecext2 = extractelement <4 x i32> %a, i32 2
>    %add3 = add i32 %add, %vecext2
>    %vecext4 = extractelement <4 x i32> %a, i32 3
>    %add5 = add i32 %add3, %vecext4
>    ret i32 %add5
>  }
> 
> Now, when the leaf %vecext is reached, the vectorizeTree() call sets the
> VectorizedValue to the 0th operand of the extractelement instruction.
> 
> case Instruction::ExtractElement: {
>   if (CanReuseExtract(E->Scalars)) {
>     Value *V = VL0->getOperand(0);
>     E->VectorizedValue = V;
>     return V;
>   }
>   return Gather(E->Scalars, VecTy);
> }
> 
> Now in emitReduction(), the VectorizedValue is dyn_cast to Instruction.
> In the above IR, %a is a function argument, not an instruction, so the
> dyn_cast returns null and the crash occurs when that null result is used.
> 
> Instruction *ValToReduce = dyn_cast<Instruction>(VectorizedValue);
> 
> Note: The above test case won't crash with the current svn version, since
> the code for parsing the tree for the above IR is yet to be included in svn.
> The initial patch was submitted in http://reviews.llvm.org/D6818.
> I am working on refining it; however, the above code flow is not disturbed
> at all by my parsing patch.
> You can try to reproduce the problem by applying the above patch locally.
> 
> When the vector data type 'a' is in global scope, a 'load' instruction is
> generated in basic block of the function:
> 
> test case 2:
> 
>  uint32x4_t a;
>  int foo() {
>   return a[0] + a[1] + a[2] + a[3];
>  }
> 
> IR for above test case :
> 
> @a = common global <4 x i32> zeroinitializer, align 16
> 
>  define i32 @hadd() #0 {
>  entry:
>    %0 = load <4 x i32>* @a, align 16, !tbaa !1
>    %vecext = extractelement <4 x i32> %0, i32 0
>    %vecext1 = extractelement <4 x i32> %0, i32 1
>    %add = add i32 %vecext, %vecext1
>    %vecext2 = extractelement <4 x i32> %0, i32 2
>    %add3 = add i32 %add, %vecext2
>    %vecext4 = extractelement <4 x i32> %0, i32 3
>    %add5 = add i32 %add3, %vecext4
>    ret i32 %add5
>  }
> 
> Here, since the 0th operand of the leaf %vecext is a load instruction, the
> dyn_cast to Instruction succeeds and the reduction is emitted properly.
> 
> How can we solve this problem? What type should a function argument be
> cast to?
> 
> Regards,
> Suyog
> 
> 
> 
> ------- Original Message -------
> Sender : Shahid, Asghar-ahmad<Asghar-ahmad.Shahid at amd.com>
> Date : Jan 07, 2015 20:05 (GMT+09:00)
> Title : RE: [LLVMdev] Crash in SLP for vector data type as function argument.
> 
> Hi Suyog,
> 
> IMO emitReduction() takes a vectorized value which is the leaf of the
> matched pattern/tree.
> So what you are thinking of as the root is actually the leaf of the tree.
> The root should actually be the value which is being fed to the "return"
> statement.
> 
> It would be of great help if you could share the sample test.
> 
> Regards,
> Shahid
> 
> > -----Original Message-----
> > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> > On Behalf Of Suyog Kamal Sarda
> > Sent: Monday, January 05, 2015 5:40 PM
> > To: nrotem at apple.com; aschwaighofer at apple.com; mzolotukhin at apple.com;
> > james.molloy at arm.com
> > Cc: llvmdev at cs.uiuc.edu
> > Subject: [LLVMdev] Crash in SLP for vector data type as function argument.
> >
> > Hi all,
> >
> > I came across a crash in SLP vectorization while testing the following
> > code for AArch64:
> >
> > int foo(uint32x4_t a) {
> >  return a[0] + a[1] + a[2] + a[3];
> > }
> >
> > The LLVM IR for above code will be:
> >
> > define i32 @hadd(<4 x i32> %a) {
> > entry:
> >   %vecext = extractelement <4 x i32> %a, i32 0
> >   %vecext1 = extractelement <4 x i32> %a, i32 1
> >   %add = add i32 %vecext, %vecext1
> >   %vecext2 = extractelement <4 x i32> %a, i32 2
> >   %add3 = add i32 %add, %vecext2
> >   %vecext4 = extractelement <4 x i32> %a, i32 3
> >   %add5 = add i32 %add3, %vecext4
> >   ret i32 %add5
> > }
> >
> > I try to recognize this pattern and vectorize it using the existing code
> > for horizontal reductions (I just recognize the pattern and fill in the
> > data; the rest is done by the already existing code.
> > I do pattern matching very badly though, but that's a different story).
> >
> >
> > Please note that whatever follows is with existing code, I haven't
> > modified any bit of it.
> >
> > Now, once the pattern is recognized, we call "tryToReduce()", where we
> > try to vectorize the tree with the call "vectorizeTree()", which returns
> > the root of the vectorized tree. Then we emit the reduction using the call
> > "emitReduction()", which takes the root of the vector tree as argument.
> > Inside "emitReduction()", we cast the root of the tree into an instruction.
> >
> > Now, for above case, while setting the root of the vectorized tree,
> > extractelement instruction is encountered, and its 0th operand is set
> > as the root of the tree, which in above case is "%a". However, this is
> > not an instruction and hence, when we cast it into an instruction in
> > "emitReduction()" code, it returns nullptr which causes a crash ahead
> > when referencing it.
> >
> > Take a second case where the vector data type is in global scope.
> >
> > uint32x4_t a;
> > int foo() {
> >  return a[0] + a[1] + a[2] + a[3];
> > }
> >
> > The IR for above code is:
> >
> > @a = common global <4 x i32> zeroinitializer, align 16
> >
> > define i32 @hadd() #0 {
> > entry:
> >   %0 = load <4 x i32>* @a, align 16, !tbaa !1
> >   %vecext = extractelement <4 x i32> %0, i32 0
> >   %vecext1 = extractelement <4 x i32> %0, i32 1
> >   %add = add i32 %vecext, %vecext1
> >   %vecext2 = extractelement <4 x i32> %0, i32 2
> >   %add3 = add i32 %add, %vecext2
> >   %vecext4 = extractelement <4 x i32> %0, i32 3
> >   %add5 = add i32 %add3, %vecext4
> >   ret i32 %add5
> > }
> >
> > Now in above case, 0th operand of extractelement %0 is a load
> > instruction, and hence it doesn't crash while casting into an
> > instruction and runs smoothly further.
> >
> > Can someone please suggest how to resolve this? Is there something I
> > am missing or is it a basic problem with IR itself ?
> >
> > Regards,
> > Suyog
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



