[LLVMdev] Vector instructions
Chris Lattner
sabre at nondot.org
Fri Jun 27 12:13:28 PDT 2008
On Jun 27, 2008, at 8:02 AM, Stefanus Du Toit wrote:
>>>> <result> = shufflevector <a x <ty>> <v1>, <b x <ty>> <v2>,
>>>>            <d x i32> <mask> ; yields <d x <ty>>
>>>
>>> With the requirement that the entries in the (still constant) mask
>>> are within the range of [0, a + b - 1].
>
>> The alternative is to have the frontend synthesize the needed
>> operations with extracts, inserts, and possibly shuffles if needed.
>> LLVM is actually fairly well prepared to optimize code like this.
>> I recommend giving this a try, and reporting any problems you
>> encounter.
>
> That certainly appears to be the only option at the moment, and we'll
> have a look to see how that works out. However, note that a
> sufficiently generalized shufflevector would remove the need for
> insertelement and extractelement to exist completely.
You should look into how this works with clang. Clang allows you to
do things like this, for example:
typedef __attribute__(( ext_vector_type(2) )) float float2;
typedef __attribute__(( ext_vector_type(4) )) float float4;

float2 vec2, vec2_2;
float4 vec4, vec4_2;
float f;

void test2() {
    vec2 = vec4.xy;     // shorten
    f = vec2.x;         // extract elt
    vec4 = vec4.yyyy;   // splat
    vec4.zw = vec2;     // insert
}
and so on. It also offers operators to extract all the even or odd
elements of a vector, and arbitrary two-input-vector shuffles with
__builtin_shufflevector.
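To make the point about a generalized shuffle concrete, here is a rough sketch in plain C of the semantics proposed in the quoted text. It is only a model: shuffle_general is a hypothetical helper, not an LLVM or clang API, and it uses arrays of float rather than real vector types. Each mask entry indexes into the concatenation of the two inputs, so it must lie in [0, a + b - 1].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an LLVM or clang API): models a generalized
   shufflevector. v1 has a elements, v2 has b elements, and mask[i]
   indexes into the concatenation v1 ++ v2 to produce result[i].
   result must not alias v1 or v2. */
static void shuffle_general(const float *v1, size_t a,
                            const float *v2, size_t b,
                            const unsigned *mask, size_t d,
                            float *result)
{
    for (size_t i = 0; i < d; ++i) {
        unsigned idx = mask[i];
        assert(idx < a + b);   /* mask entries must be in [0, a + b - 1] */
        result[i] = (idx < a) ? v1[idx] : v2[idx - a];
    }
}
```

Under this model the clang operations above become masks: vec4.xy is the mask {0, 1}, vec4.yyyy is {1, 1, 1, 1}, and vec4.zw = vec2 is a shuffle of vec4 and vec2 with the mask {0, 1, 4, 5}. An element extract corresponds to a one-entry mask, although the result is then a one-element vector rather than a scalar.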
>>> 2. vector select
>>> 3. vector trunc, sext, zext, fptrunc, fpext
>>> 4. vector shl, lshr, ashr
>> [...]
>>
>> We agree that these would be useful. There are intentions to add them
>> to LLVM; others can say more.
>
> OK. I'd love to hear more, especially if someone is planning to do
> this in the short term.
Most of the extensions you suggest are great ideas, but we need more
than ideas: we need someone to help implement the ideas ;-).
>> It turns out that having them return vectors of i1 would be somewhat
>> complicated. For example, a <4 x i1> on an SSE2 target could expand
>> either to 1 <4 x i32> or 2 <2 x i64>s, and the optimal thing would
>> be to make the decision based on the comparison that produced them,
>> but LLVM isn't yet equipped for that.
>
> Can you expand on this a bit? I'm guessing you're referring to
> specific mechanics in LLVM's code generation framework making this
> difficult to do?
I'm not sure that it really is a matter of what is easy or hard to do
in LLVM. This model more closely matches the model implemented by
common SIMD systems like Altivec, SSE, CellSPU, Alpha, etc. The
design was picked to model the systems that we know well; we obviously
can't plan to handle systems that we don't know about.
>> vicmp and vfcmp are very much aimed at solving practical problems on
>> popular architectures without needing significant new infrastructure.
>> They're relatively new, and as you say, they'll be more useful when
>> combined with vector shifts and friends and we start teaching LLVM
>> to recognize popular idioms with them.
>
> Can you give me examples of how they are used today at all? I'm having
> a really hard time figuring out a good use for them (that doesn't
> involve effectively scalarizing them immediately) without these other
> instructions.
They can be used with target-specific intrinsics. For example, SSE
provides a broad range of intrinsics to support instructions that LLVM
IR can't express well. See llvm/include/llvm/IntrinsicsX86.td for
more details.
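As a rough illustration of a use that does not scalarize, the sketch below models in portable C how a full-width comparison mask can drive a blend using nothing but bitwise operations. It assumes the convention that a true lane is all ones and a false lane is all zeros, which is how SSE packed compares behave. cmp_gt_mask and blend are hypothetical helpers written lane by lane for clarity, not LLVM or SSE APIs; on real hardware each loop would be a single vector instruction or a short and/andnot/or sequence.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not LLVM or SSE APIs), written lane by lane. */

/* Model of a vector compare: each mask lane is all ones where
   a[i] > b[i], and all zeros otherwise. */
static void cmp_gt_mask(const int32_t *a, const int32_t *b,
                        uint32_t *mask, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        mask[i] = (a[i] > b[i]) ? UINT32_MAX : 0u;
}

/* Lane-wise select built only from and/or/not: picks a[i] where the
   mask lane is all ones and b[i] where it is all zeros. */
static void blend(const uint32_t *mask, const int32_t *a, const int32_t *b,
                  int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        uint32_t bits = ((uint32_t)a[i] & mask[i]) |
                        ((uint32_t)b[i] & ~mask[i]);
        out[i] = (int32_t)bits;  /* reinterpret the bits as a signed lane */
    }
}
```

Chaining the two gives a lane-wise max: compute the mask with cmp_gt_mask(a, b, mask, n), then call blend(mask, a, b, out, n).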
If you're interested in helping shape the direction of LLVM vector
support, my advice is that "patches speak louder than words" :). I'd
love to see improved vector support in LLVM, but unless someone is
willing to step forward and implement it, it is all just idle talk.
-Chris