[LLVMdev] Failure to optimize vector select
Matt Arsenault
arsenm2 at gmail.com
Mon Aug 19 18:04:35 PDT 2013
Hi,
I've found a case that I would expect to optimize easily, but it doesn't. Here is a simple implementation of vector select:
float4 simple_select(float4 a, float4 b, int4 c)
{
    float4 result;
    result.x = c.x ? a.x : b.x;
    result.y = c.y ? a.y : b.y;
    result.z = c.z ? a.z : b.z;
    result.w = c.w ? a.w : b.w;
    return result;
}
I would expect this to be optimized to
%bool = icmp ne <4 x i32> %c, zeroinitializer
%result = select <4 x i1> %bool, <4 x float> %a, <4 x float> %b
ret <4 x float> %result
However, it actually ends up as four separate extractelement/icmp/select sequences, one per lane.
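To make the intended equivalence concrete, here is a minimal sketch in plain C. It assumes struct stand-ins for OpenCL's float4 and int4, and the helper names nonzero_mask, mask_select, lanes_equal and check_equivalent are hypothetical, made up for this illustration. It only shows that the lane-by-lane ternaries compute the same result as one whole-vector test of c against zero followed by one select on the resulting mask; it says nothing about what any LLVM pass actually does.

```c
#include <assert.h>

/* Hypothetical stand-ins for OpenCL's float4 and int4 (assumption). */
typedef struct { float x, y, z, w; } float4;
typedef struct { int x, y, z, w; } int4;

/* Lane-by-lane form, as written in the post. */
static float4 simple_select(float4 a, float4 b, int4 c)
{
    float4 result;
    result.x = c.x ? a.x : b.x;
    result.y = c.y ? a.y : b.y;
    result.z = c.z ? a.z : b.z;
    result.w = c.w ? a.w : b.w;
    return result;
}

/* Step 1 of the whole-vector form: one compare over all lanes.
   A lane is 1 when c is nonzero, matching the ternaries above. */
static int4 nonzero_mask(int4 c)
{
    int4 m = { c.x != 0, c.y != 0, c.z != 0, c.w != 0 };
    return m;
}

/* Step 2 of the whole-vector form: one select driven by the mask. */
static float4 mask_select(float4 a, float4 b, int4 m)
{
    float4 r = { m.x ? a.x : b.x, m.y ? a.y : b.y,
                 m.z ? a.z : b.z, m.w ? a.w : b.w };
    return r;
}

/* Exact per-lane comparison; fine here because values are only copied. */
static int lanes_equal(float4 p, float4 q)
{
    return p.x == q.x && p.y == q.y && p.z == q.z && p.w == q.w;
}

/* Asserts that both forms agree for one set of inputs. */
static void check_equivalent(float4 a, float4 b, int4 c)
{
    assert(lanes_equal(simple_select(a, b, c),
                       mask_select(a, b, nonzero_mask(c))));
}
```

Calling check_equivalent with any a, b and c (including negative or large values in c) should pass, since both forms pick a's lane exactly when the matching lane of c is nonzero.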
Where would be the best place to fix this? Should InstCombine take care of this, or the vectorizer?
Thanks