[LLVMdev] Enabling Vector-select
nadav.rotem at intel.com
Sun Oct 16 08:25:03 PDT 2011
Thanks for trying the patches! Regarding LRB, it has a special <16 x i1> mask register, so <16 x i1> types would map naturally. As a general rule, when vectorizing, the vectorization factor should match the vector width of the machine. But to answer your question: the type legalizer would first try to widen each element of <4 x i1>, giving <4 x i128> (to fill the 512-bit register size). Since i128 is not a legal scalar type, that attempt would fail, and the legalizer would fall back to widening the vector using the 'vector widening' code.
BTW, floating point types are vector-widened, just like before, and not element-promoted.
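To make the decision described above concrete, here is a minimal toy model in Python. It is only a sketch and not the LLVM type legalizer: the function name `legalize`, its parameters, and the tables of legal types are all hypothetical, and the real code handles many more cases. It assumes the legalizer first tries to grow each element until the vector fills the register, accepts that only if the grown scalar type is legal, and otherwise adds elements; floating point vectors skip the first step.

```python
def legalize(num_elts, elt_bits, reg_bits, legal_scalar_bits, legal_vectors,
             is_float=False, promote_elements=True):
    """Toy model: return (strategy, num_elts, elt_bits) for an illegal vector."""
    if promote_elements and not is_float:
        # Element promotion: keep the element count, grow each element
        # so that the whole vector fills the register.
        promoted_bits = reg_bits // num_elts
        if promoted_bits > elt_bits and promoted_bits in legal_scalar_bits:
            return ("promote", num_elts, promoted_bits)
    # Vector widening: keep the element type, pick the smallest legal
    # vector that has more elements.
    wider = sorted(n for (n, bits) in legal_vectors
                   if bits == elt_bits and n > num_elts)
    if not wider:
        raise ValueError("no legal wider vector in this toy model")
    return ("widen", wider[0], elt_bits)


# Hypothetical tables of legal types, as (element count, element bits).
SSE_SCALARS = {8, 16, 32, 64}
SSE_VECTORS = {(16, 8), (8, 16), (4, 32), (2, 64)}
# A made-up 512-bit target with a native <16 x i1> mask type.
MASK_SCALARS = {8, 16, 32, 64}
MASK_VECTORS = {(16, 32), (8, 64), (16, 1)}

print(legalize(4, 8, 128, SSE_SCALARS, SSE_VECTORS))    # v4i8 with promotion
print(legalize(4, 1, 512, MASK_SCALARS, MASK_VECTORS))  # <4 x i1>, 512-bit target
```

In this model, <4 x i1> on the 512-bit target cannot be promoted because i128 is not in the legal scalar set, so it falls through to widening, which matches the behaviour described above.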
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Ralf Karrenberg
Sent: Sunday, October 16, 2011 16:54
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Enabling Vector-select
Great work, thanks a lot!
I did not have the time to migrate our OpenCL driver to the latest trunk
yet, but I followed your commits and tried out some small tests which
worked as expected :).
The last thing missing for us now is AVX support in the JIT, but that is
a different issue.
However, there is one thing I do not fully understand: what if somebody
actually wants a vector of 4 boolean values (i1) that should not be
legalized to v4i32? For example, a code generator for LRBni would want
to use the architecture's predicate registers for masks, in which case
<16 x i1> should probably not be legalized to <16 x i32>, right?
However, I reckon that native support of architectures with predicated
execution is probably a bigger problem, anyway.
On 10/16/11 1:09 PM, Rotem, Nadav wrote:
> Hello everyone,
> I wanted to let everybody know that I am going to enable the support for vector-select by default later today.
> Currently the LLVM code generator only supports 'select' instructions with a scalar boolean condition. Vectorizing compilers, such as the Intel OpenCL Vectorizer and the GCC vectorizer, often use vector-select instructions to implement masks. This change makes code generation for these patterns possible.
> In order to enable vector-select we needed to make some changes to the LLVM type-legalizer.
> The '-promote-elements' flag changes the way illegal vectors are legalized. Currently, the default legalization algorithm widens the number of elements in a vector. So, the vector v4i8 would be converted to v16i8. With the '-promote-elements' flag, the legalizer would first try to widen each element. So, the vector v4i8 would be converted to v4i32. Overall this is a good idea because the instruction set is usually more complete for the 'common' element type. This change is required in order to legalize mask types such as '<4 x i1>' into the types which are used by the SSE and NEON instruction sets.
> The X86 backend already has excellent codegen support: it lowers vector-select instructions to SSE4 and AVX blends. Other targets emulate blends using a sequence of ANDs and XORs.
> Later today I will fix a few tests (which expect a slightly different output) and enable the '-promote-elements' flag by default.
>  http://llvm.org/docs/LangRef.html#i_select
>  https://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-blend.ll?revision=139992
> Intel Israel (74) Limited
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
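The quoted announcement mentions that targets without a blend instruction emulate it with ANDs and XORs. The exact instruction sequence is not given in the thread, so the following Python sketch uses one well-known identity as an assumption: with an all-ones or all-zeros mask per lane, `b ^ ((a ^ b) & mask)` yields `a` where the mask is set and `b` elsewhere. The helper names `lane_masks` and `blend` are hypothetical.

```python
def lane_masks(conds, bits=32):
    """Turn a vector of booleans into all-ones / all-zeros lane masks."""
    all_ones = (1 << bits) - 1
    return [all_ones if c else 0 for c in conds]


def blend(masks, a, b, bits=32):
    """Per-lane select built only from AND and XOR (assumed sequence)."""
    all_ones = (1 << bits) - 1
    # (x ^ y) & m keeps the difference only in selected lanes;
    # XOR-ing it back into y turns y into x exactly there.
    return [(y ^ ((x ^ y) & m)) & all_ones for m, x, y in zip(masks, a, b)]


masks = lane_masks([True, False, True, False])
print(blend(masks, [1, 2, 3, 4], [10, 20, 30, 40]))
```

This also illustrates why the mask has to be promoted to the full element width: the identity only works when each lane of the mask is entirely ones or entirely zeros.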