[LLVMdev] Predication on SIMD architectures and LLVM
David Chisnall
David.Chisnall at cl.cam.ac.uk
Tue Oct 23 09:43:34 PDT 2012
I am talking about the LLVM select instruction, not a vector select:
http://llvm.org/docs/LangRef.html#i_select
In any non-trapping case, an arithmetic operation (or sequence of operations) followed by a select is semantically equivalent to the predicated version. This is exactly how predicated instructions on ARM are handled. For example, the following IR:
    %cmp = icmp sgt i32 %c, %b
    %add = add nsw i32 %b, 1
    %add1 = add nsw i32 %c, 2
    %retval.0 = select i1 %cmp, i32 %add, i32 %add1
Becomes this ARM assembly:
    add r2, r1, #2
    cmp r1, r0
    addgt r2, r0, #1
    mov r0, r2
An equally valid form would be:
    cmp r1, r0
    addle r2, r1, #2
    addgt r2, r0, #1
    mov r0, r2
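For reference, a rough C sketch of the two source-level shapes that correspond to the IR above. This is an illustration only, not the source the IR was actually generated from; the function names are made up, and it assumes inputs small enough that neither addition overflows (which is what the nsw flags assert in the IR).

```c
#include <assert.h>

/* Branch form: only one of the two additions executes. */
static int pick_branch(int b, int c) {
    if (c > b)
        return b + 1;
    return c + 2;
}

/* Select form: both additions execute, then one result is chosen.
   This mirrors the IR: two adds followed by a select.
   Assumption: b + 1 and c + 2 do not overflow, since signed
   overflow is undefined in C even for the unselected value. */
static int pick_select(int b, int c) {
    int add  = b + 1;             /* %add      */
    int add1 = c + 2;             /* %add1     */
    return (c > b) ? add : add1;  /* %retval.0 */
}

/* Hypothetical helper: asserts that the two forms agree for one input pair. */
static void check_forms_agree(int b, int c) {
    assert(pick_branch(b, c) == pick_select(b, c));
}
```

Because the additions cannot trap, the two functions are interchangeable, and a back end is free to lower either one to the predicated sequences shown above.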
Separating the select, which embodies the predication, from the operations allows more choice in terms of the final representation. Unless a load or store is volatile, the compiler is free to elide it if its result is not used, and is most definitely free to fold it into a predicated load or store. The same is obviously true of any side-effect-free operations, such as divides and square roots: folding them into predicated instructions is no less valid than conditionally executing them in branches or removing them entirely via dead code elimination.
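To make the divide case concrete, here is a rough C sketch. It is not taken from this thread: the function names are invented, and it assumes a target where an integer divide by zero traps. Under that assumption the select form needs the divisor to be made harmless before the unconditional divide. Substituting 1 is one common workaround, shown here only as an illustration; on a target with a predicated divide the substitution should not be needed.

```c
#include <assert.h>

/* Branch form: the divide only executes when the divisor is nonzero. */
static int div_branch(int n, int d, int fallback) {
    if (d != 0)
        return n / d;
    return fallback;
}

/* Select form: the divide always executes, so the divisor is first
   replaced by a safe value (1) whenever the predicate is false.
   Assumption: n / d does not overflow (i.e. not INT_MIN / -1). */
static int div_select(int n, int d, int fallback) {
    int ok     = (d != 0);        /* the predicate                  */
    int safe_d = ok ? d : 1;      /* select a non-trapping divisor  */
    int q      = n / safe_d;      /* unconditional divide           */
    return ok ? q : fallback;     /* select the final result        */
}
```

Both functions return the same value for every input that satisfies the stated assumption; the difference is only in which operations execute when the predicate is false.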
Just because the generated machine code must contain predicated instructions most definitely does not mean that the LLVM IR must contain them, or even that we would gain anything in terms of expressive power by permitting them.
David
On 23 Oct 2012, at 17:25, <dag at cray.com> wrote:
> David Chisnall <David.Chisnall at cl.cam.ac.uk> writes:
>
>> Perhaps I am missing something, but isn't a predicated instruction
>> effectively a single-instruction version of an arithmetic operation
>> followed by a select?
>
> No, it is not. Among other things, predication is used to avoid traps.
> A vector select is an entirely different operation.
>
>> As we can already represent this in the IR, and already match other
>> predicated instructions (e.g. on ARM) to this pattern, what is gained
>> by adding predication directly to the IR?
>
> Predicated loads, stores, divides, sqrts, etc. are essential for
> correctly vectorizing loops with conditionals due to safety concerns.
> If the loop body has no dangerous operations, then yes, a vector select
> can be used without problems but it is often slower than predication.
> Usually the hardware can optimize instructions with certain values of
> predicates.
>
> -David