[LLVMdev] Altivec vs the type legalizer
Dale Johannesen
dalej at apple.com
Tue Nov 10 10:33:45 PST 2009
On Nov 10, 2009, at 1:20 AMPST, Duncan Sands wrote:
> Hi Dale, I think Bob is right: the type legalizer shouldn't be
> turning v16i8
> into v16i32, what should happen is that the return type of the
> BUILD_VECTOR
> continues to be v16i8, but the type of the operands changes to i32,
> so you
> end up with a BUILD_VECTOR that takes 16 lots of i32, and produces a
> v16i8.
It does that.
> The target then has all the info it needs to produce the best code,
> but needs
> to be careful not the use the operand type (i32) when it really
> wants the vector
> element type (i8).
I don't think it's target dependent. This is also broken on Neon; the
breakage is introduced when lowering a BUILD_VECTOR to a load from
ConstantPool, and the call that builds the ConstantPool does not
currently pass enough information to DTRT, it just passes a vector of
i32's. Try the following with -march=arm -mattr=+neon . (It is
possible that there's no way to get the FE to generate this on Neon,
however.)
; ModuleID = 'small.c'
target datalayout = "E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-
i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-
f128:64:128-n32"
@baz = common global <16 x i8> zeroinitializer ; <<16 x i8>*>
[#uses=1]
define void @foo(<16 x i8> %x) nounwind ssp {
entry:
%x_addr = alloca <16 x i8> ; <<16 x i8>*>
[#uses=2]
%temp = alloca <16 x i8> ; <<16 x i8>*>
[#uses=2]
%"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]
store <16 x i8> %x, <16 x i8>* %x_addr
store <16 x i8> <i8 22, i8 21, i8 20, i8 3, i8 25, i8 24, i8 23, i8
3, i8 28, i8 27, i8 26, i8 3, i8 31, i8 30, i8 29, i8 3>, <16 x i8>*
%temp, align 16
%0 = load <16 x i8>* %x_addr, align 16 ; <<16 x i8>>
[#uses=1]
%1 = load <16 x i8>* %temp, align 16 ; <<16 x i8>>
[#uses=1]
%tmp = add <16 x i8> %0, %1 ; <<16 x i8>>
[#uses=1]
store <16 x i8> %tmp, <16 x i8>* @baz, align 16
br label %return
return: ; preds = %entry
ret void
}
To make things more concrete here is the patch I was trying out:
Index: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (revision 85265)
+++ lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (working copy)
@@ -1810,7 +1810,16 @@
CV.push_back(const_cast<ConstantFP *>(V-
>getConstantFPValue()));
} else if (ConstantSDNode *V =
dyn_cast<ConstantSDNode>(Node->getOperand(i))) {
- CV.push_back(const_cast<ConstantInt *>(V-
>getConstantIntValue()));
+ if (OpVT==EltVT)
+ CV.push_back(const_cast<ConstantInt *>(V-
>getConstantIntValue()));
+ else {
+ // If OpVT and EltVT don't match, EltVT is not legal and the
+ // element values have been promoted/truncated earlier.
Undo this;
+ // we don't want a v16i8 to become a v16i32 for example.
+ const ConstantInt *CI = V->getConstantIntValue();
+
CV.push_back(ConstantInt::get(EltVT.getTypeForEVT(*DAG.getContext()),
+ CI->getZExtValue()));
+ }
} else {
assert(Node->getOperand(i).getOpcode() == ISD::UNDEF);
const Type *OpNTy = OpVT.getTypeForEVT(*DAG.getContext());
> While Bob describes this as being new rules, IIRC the old type
> legalizer used
> to handle BUILD_VECTOR with an illegal element type needing
> promotion the same
> way: it just promoted the operands, resulting in a mismatch between
> the operand
> type and the vector element type (unlike in the bad old days, I
> believe this is
> now documented as being allowed, in the description of the
> BUILD_VECTOR node).
Right.
> However in the case of PPC I think the PPC code grabbed the
> BUILD_VECTOR and
> transformed it before it hit this logic in the old type legalizer.
> Now that
> type legalization is being done first, probably this got reversed:
> first the
> type legalizer logic transforms the BUILD_VECTOR, and only later the
> ppc code
> gets to do its stuff.
I don't think so; by the time it reaches ExpandBUILD_VECTOR it's been
through, and rejected, a bunch of special cases in the PPCISel code.
More information about the llvm-dev
mailing list