[LLVMdev] Vector select/compare support in LLVM

Wed Mar 9 11:58:45 PST 2011

"Rotem, Nadav" <nadav.rotem at intel.com> writes:

> I can think of two ways to represent masks in x86: sparse and
> packed. In the sparse method, the masks are kept in <4 x 32bit>
> registers, which are mapped to xmm registers. This is the ‘native’ way
> of using masks.  

This argues for the sparse representation, I think.
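
As a rough illustration of the sparse form, here is a scalar C sketch.
It is not LLVM code and uses no intrinsics; the helper names
cmp_lt_sparse and select_sparse are made up for this example.  Each
mask lane is assumed to be all-ones or all-zeros, so a select is just
and/andnot/or on the raw bits and the mask never leaves the vector
domain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4

/* Hypothetical helper: lane-wise a < b.  Each mask lane becomes
   all-ones (true) or all-zeros (false), i.e. the "sparse" form. */
static void cmp_lt_sparse(const float a[LANES], const float b[LANES],
                          uint32_t mask[LANES]) {
    for (int i = 0; i < LANES; ++i)
        mask[i] = (a[i] < b[i]) ? 0xFFFFFFFFu : 0u;
}

/* Hypothetical helper: out = mask ? t : f, computed per lane as
   (mask & t) | (~mask & f) on the raw 32-bit patterns. */
static void select_sparse(const uint32_t mask[LANES],
                          const float t[LANES], const float f[LANES],
                          float out[LANES]) {
    assert(sizeof(float) == sizeof(uint32_t));
    for (int i = 0; i < LANES; ++i) {
        uint32_t tb, fb, rb;
        memcpy(&tb, &t[i], sizeof tb);   /* view float bits as integer */
        memcpy(&fb, &f[i], sizeof fb);
        rb = (mask[i] & tb) | (~mask[i] & fb);
        memcpy(&out[i], &rb, sizeof rb);
    }
}
```

Calling cmp_lt_sparse(a, b, m) followed by select_sparse(m, a, b, out)
computes a lane-wise minimum without any per-lane branching.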

> _Sparse_ After my discussion with Duncan last week, I started working
> on the promotion of type <4 x i1> to <4 x i32>, and I ran into a
> problem.  It looks like the codegen term ‘promote’ is overloaded.

Heavily.  :-/

>  For scalars, the ‘promote’ operation converts scalars to larger
> bit-width scalars.  For vectors, the ‘promote’ operation widens the
> vector to the next power of two.  This is reasonable for types such as
> ‘<3 x float>’.  Maybe we need to add another legalization operation which
> will mean widening the vectors?

You mean widening the element type, correct?  Yes, that's definitely a
useful concept.
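
To spell out the two meanings, here is a small scalar C sketch.  The
function names are hypothetical, and sign-extending i1 to i32 is an
assumption chosen to match the all-ones masks that x86 compares
produce.  Widening changes the number of lanes; element promotion
keeps the lane count and changes the element type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Vector widening: <3 x float> -> <4 x float>.  The lane count grows
   to the next power of two; the extra lane is padding (zero here). */
static void widen_v3f32(const float in[3], float out[4]) {
    assert(in && out);
    for (int i = 0; i < 3; ++i)
        out[i] = in[i];
    out[3] = 0.0f;
}

/* Element promotion: <4 x i1> -> <4 x i32>.  The lane count stays at
   four; each element is sign-extended to 0 or -1 (all-ones). */
static void promote_v4i1(const bool in[4], int32_t out[4]) {
    assert(in && out);
    for (int i = 0; i < 4; ++i)
        out[i] = in[i] ? -1 : 0;
}
```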

>  In any case, I estimated that implementing this per-element promotion
> would require major changes and decided that this is not the way to
> go.

What major changes?  I think per-element promotion will give much
better code in the end.  The pack/unpack operations could be very
expensive.
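
For a sense of what pack/unpack involves, here is a scalar C sketch
with hypothetical helper names.  Packing gathers the top bit of each
lane into a small integer, roughly what movmskps does; unpacking has
to rebuild a full all-ones or all-zeros lane from each bit before the
mask can be used in a vector and/or again.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Hypothetical helper: sparse -> packed.  Collects the sign bit of
   each 32-bit lane into the low LANES bits of an integer. */
static unsigned pack_mask(const uint32_t sparse[LANES]) {
    unsigned bits = 0;
    for (int i = 0; i < LANES; ++i)
        bits |= (sparse[i] >> 31) << i;
    return bits;
}

/* Hypothetical helper: packed -> sparse.  Expands each bit back to a
   full all-ones or all-zeros lane. */
static void unpack_mask(unsigned bits, uint32_t sparse[LANES]) {
    assert(bits < (1u << LANES));
    for (int i = 0; i < LANES; ++i)
        sparse[i] = ((bits >> i) & 1u) ? 0xFFFFFFFFu : 0u;
}
```

Every compare that feeds a select would pay for this round trip if
masks were kept in packed form.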

There is another huge cost in using GPRs to hold masks: fewer GPRs are
left to hold addresses, and GPRs are a precious resource.  We should
avoid doing anything that uses more of that resource unnecessarily.

                             -Dave