[LLVMdev] Altivec vs the type legalizer

Tue Nov 10 10:33:45 PST 2009

On Nov 10, 2009, at 1:20 AMPST, Duncan Sands wrote:

> Hi Dale, I think Bob is right: the type legalizer shouldn't be  
> turning v16i8
> into v16i32, what should happen is that the return type of the  
> BUILD_VECTOR
> continues to be v16i8, but the type of the operands changes to i32,  
> so you
> end up with a BUILD_VECTOR that takes 16 lots of i32, and produces a  
> v16i8.

It does that.

> The target then has all the info it needs to produce the best code,  
> but needs
> to be careful not the use the operand type (i32) when it really  
> wants the vector
> element type (i8).

I don't think it's target dependent.  This is also broken on Neon; the  
breakage is introduced when lowering a BUILD_VECTOR to a load from  
ConstantPool, and the call that builds the ConstantPool does not  
currently pass enough information to DTRT, it just passes a vector of  
i32's.  Try the following with  -march=arm -mattr=+neon .  (It is  
possible that there's no way to get the FE to generate this on Neon,  
however.)


; ModuleID = 'small.c'
target datalayout = "E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32- 
i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64- 
f128:64:128-n32"

@baz = common global <16 x i8> zeroinitializer    ; <<16 x i8>*>  
[#uses=1]

define void @foo(<16 x i8> %x) nounwind ssp {
entry:
   %x_addr = alloca <16 x i8>                      ; <<16 x i8>*>  
[#uses=2]
   %temp = alloca <16 x i8>                        ; <<16 x i8>*>  
[#uses=2]
   %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
   store <16 x i8> %x, <16 x i8>* %x_addr
   store <16 x i8> <i8 22, i8 21, i8 20, i8 3, i8 25, i8 24, i8 23, i8  
3, i8 28, i8 27, i8 26, i8 3, i8 31, i8 30, i8 29, i8 3>, <16 x i8>*  
%temp, align 16
   %0 = load <16 x i8>* %x_addr, align 16          ; <<16 x i8>>  
[#uses=1]
   %1 = load <16 x i8>* %temp, align 16            ; <<16 x i8>>  
[#uses=1]
   %tmp = add <16 x i8> %0, %1                     ; <<16 x i8>>  
[#uses=1]
   store <16 x i8> %tmp, <16 x i8>* @baz, align 16
   br label %return

return:                                           ; preds = %entry
   ret void
}

To make things more concrete here is the patch I was trying out:

Index: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
===================================================================

--- lib/CodeGen/SelectionDAG/LegalizeDAG.cpp    (revision 85265)
+++ lib/CodeGen/SelectionDAG/LegalizeDAG.cpp    (working copy)
@@ -1810,7 +1810,16 @@
          CV.push_back(const_cast<ConstantFP *>(V- 
 >getConstantFPValue()));
        } else if (ConstantSDNode *V =
                   dyn_cast<ConstantSDNode>(Node->getOperand(i))) {
-        CV.push_back(const_cast<ConstantInt *>(V- 
 >getConstantIntValue()));
+        if (OpVT==EltVT)
+          CV.push_back(const_cast<ConstantInt *>(V- 
 >getConstantIntValue()));
+        else {
+          // If OpVT and EltVT don't match, EltVT is not legal and the
+          // element values have been promoted/truncated earlier.   
Undo this;
+          // we don't want a v16i8 to become a v16i32 for example.
+          const ConstantInt *CI = V->getConstantIntValue();
+           
CV.push_back(ConstantInt::get(EltVT.getTypeForEVT(*DAG.getContext()),
+                                        CI->getZExtValue()));
+        }
        } else {
          assert(Node->getOperand(i).getOpcode() == ISD::UNDEF);
          const Type *OpNTy = OpVT.getTypeForEVT(*DAG.getContext());

> While Bob describes this as being new rules, IIRC the old type  
> legalizer used
> to handle BUILD_VECTOR with an illegal element type needing  
> promotion the same
> way: it just promoted the operands, resulting in a mismatch between  
> the operand
> type and the vector element type (unlike in the bad old days, I  
> believe this is
> now documented as being allowed, in the description of the  
> BUILD_VECTOR node).

Right.

> However in the case of PPC I think the PPC code grabbed the  
> BUILD_VECTOR and
> transformed it before it hit this logic in the old type legalizer.   
> Now that
> type legalization is being done first, probably this got reversed:  
> first the
> type legalizer logic transforms the BUILD_VECTOR, and only later the  
> ppc code
> gets to do its stuff.

I don't think so; by the time it reaches ExpandBUILD_VECTOR it's been  
through, and rejected, a bunch of special cases in the PPCISel code.