[LLVMdev] Altivec vs the type legalizer
Bob Wilson
bob.wilson at apple.com
Mon Nov 9 21:49:07 PST 2009
On Nov 9, 2009, at 6:11 PM, Dale Johannesen wrote:
> PPC Altivec supports vector type v16i8 (and others) where the element
> type is not legal (in llvm's implementation). When we have a
> BUILD_VECTOR of these types with constant elements, LegalizeTypes
> first promotes the element types to i32, then builds a constant pool
> entry of type v16i32. This is wrong. I can fix it by truncating the
> elements back to i8 in ExpandBUILD_VECTOR. Does this seem like the
> right approach? I ask because we'll be relying on ConstantVector::get
> and getConstantPool to work even with elements of a type that's
> illegal for the target; currently, they do. (And I see no other way
> to fix it except to break the vector into scalars, which produces
> horrendous code.)
>
>
> typedef vector unsigned char vuint8_t;
> vuint8_t baz;
> void foo(vuint8_t x) {
> vuint8_t temp = (vuint8_t)(22,21,20, 3, 25,24,23, 3, 28,27,26, 3,
> 31,30,29, 3);
> baz = vec_add(x, temp);
> }
Earlier this year we ran into a similar problem for NEON and ended up modifying the type rules for BUILD_VECTOR so that the operand types do not need to mach the element types. It is possible that what you are seeing is fall-out from that change.
With the "new" BUILD_VECTOR rules, you can have a v16i8 BUILD_VECTOR with 16 operands of type i32, and the operands are implicitly truncated to i8. Before we made that change, the type legalizer would (for the v16i8 case, e.g.) translate to a legal BUILD_VECTOR by using a bunch of shift and mask operations to pack 4 elements into each i32 value. When the target (like NEON) can directly support the smaller element types, it is a big mess to try to undo the shifting and masking to figure out what the original i8 operands were before type legalization.
ExpandBUILD_VECTOR seems like the right place to handle this, but I would not do it by operating on illegal types. We should add code there to combine the individual elements into values of the smallest legal integer type (e.g., for v16i8, pack 4 operands into each i32 value) and then bitcast the result. I think this used to happen upstream in type legalization, and now some more work is needed later on.
More information about the llvm-dev
mailing list