[LLVMdev] passing vector of booleans to functions
Roland Leißa
leissa at cs.uni-saarland.de
Mon Feb 25 18:48:26 PST 2013
Hi all,
I'm currently trying to figure out the best way to pass a vector of
booleans to other functions. Take this small example:
define <4 x float> @vcmp_add(<4 x float> %a, <4 x float> %b) {
entry:
  %cmp = fcmp olt <4 x float> %a, %b
  %add = fadd <4 x float> %a, %b
  %sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}
I will get (on x86 with SSE4.1, which provides blendvps):
    movaps   %xmm0, %xmm2
    cmpltps  %xmm1, %xmm0
    addps    %xmm2, %xmm1
    blendvps %xmm1, %xmm2
    movaps   %xmm2, %xmm0
    ret
Great :)
But now, let us try to pass a mask to a function.
define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
  %add = fadd <4 x float> %a, %b
  %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}
I will get:
    addps    %xmm1, %xmm2
    pslld    $31, %xmm0
    blendvps %xmm2, %xmm1
    movaps   %xmm1, %xmm0
    ret
While this is correct and works, I'm unhappy with the pslld. Apparently,
LLVM uses a <4 x i32> to hold the <4 x i1>, with the mask bit stored in
the LSB of each element. But blendvps reads the mask bit from the MSB of
each element, hence the shift.
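To spell out the bit layout, here is a scalar C model of a single lane. It is only a sketch of the behaviour described above, not what LLVM emits: the helper names blendv_lane and shl31 are made up for illustration, and the model assumes that blendvps selects on bit 31 of each mask lane and that a true i1 arrives as a 32-bit lane with only bit 0 guaranteed to be set.

```c
#include <stdint.h>

/* Hypothetical scalar model of one blendvps lane: pick `if_set` when
   bit 31 (the MSB) of the mask lane is set, otherwise `if_clear`. */
static float blendv_lane(uint32_t mask, float if_set, float if_clear) {
    return (mask >> 31) ? if_set : if_clear;
}

/* Hypothetical model of `pslld $31` on one lane: moves bit 0 into bit 31
   and clears every other bit. */
static uint32_t shl31(uint32_t lane) {
    return lane << 31;
}
```

In this model, feeding a lane that holds 1 straight into blendv_lane picks the "clear" operand, while feeding shl31 of that lane picks the "set" operand, which is why the shift shows up in the generated code.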
OK, let's try to do better. This time, I will directly use <4 x i32>:
define <4 x float> @masked_add_32(<4 x i32> %mask, <4 x float> %a, <4 x float> %b) {
entry:
  %add = fadd <4 x float> %a, %b
  %trunc = trunc <4 x i32> %mask to <4 x i1>
  %sel = select <4 x i1> %trunc, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}
But damn, I have to truncate the mask in order to use it in the select,
so in the end LLVM produces the same code as above. What code do I have
to write in order to get rid of the shift?
If only there were a way to tell LLVM that each element of %mask
is guaranteed to be either 0xFFFFFFFF or 0x0...
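The guarantee being asked for can be made concrete with a small C sketch. The helper names are hypothetical, and the only assumption is the one stated above: the select instruction looks at bit 31 of a lane, whereas truncation to i1 keeps bit 0. For a lane that is all ones or all zeros those two bits agree, so the shift would be redundant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the lane is one of the two canonical
   mask values, all ones or all zeros. */
static bool is_canonical_mask(uint32_t lane) {
    return lane == 0xFFFFFFFFu || lane == 0u;
}

/* The bit an MSB-based select (as blendvps is assumed to be) looks at. */
static bool msb(uint32_t lane) { return (lane >> 31) != 0; }

/* The bit that truncating the lane to i1 keeps. */
static bool lsb(uint32_t lane) { return (lane & 1u) != 0; }

/* The shift changes nothing whenever MSB and LSB already agree. */
static bool shift_is_redundant(uint32_t lane) {
    return msb(lane) == lsb(lane);
}
```

For an arbitrary lane such as 1 the two bits differ, which is presumably why the compiler cannot drop the shift without knowing the lanes are canonical.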
Thanks,
Roland