[LLVMdev] Failure to optimize vector select

Matt Arsenault arsenm2 at gmail.com
Mon Aug 19 18:04:35 PDT 2013


Hi,

I've found a case that I would expect to optimize easily, but it doesn't. Here is a simple implementation of vector select:

float4 simple_select(float4 a, float4 b, int4 c)
{
    float4 result;

    result.x = c.x ? a.x : b.x;
    result.y = c.y ? a.y : b.y;
    result.z = c.z ? a.z : b.z;
    result.w = c.w ? a.w : b.w;

    return result;
}

I would expect this to be optimized to

%bool = icmp ne <4 x i32> %c, zeroinitializer
%result = select <4 x i1> %bool, <4 x float> %a, <4 x float> %b
ret <4 x float> %result

However, it actually ends up as four separate extractelement/icmp/select sequences.
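
To make the intended equivalence concrete, here is a minimal sketch in plain C that compares the per-lane form with a "compare once, then select" form. The struct definitions of float4 and int4 are hypothetical stand-ins for the OpenCL-style vector types, and mask_select and check_equivalence are names made up for this illustration. Since C treats any nonzero condition as true, the mask corresponds to a "not equal to zero" compare on each lane.

```c
#include <assert.h>

/* Hypothetical stand-ins for the OpenCL-style float4/int4 vector types. */
typedef struct { float x, y, z, w; } float4;
typedef struct { int x, y, z, w; } int4;

/* Per-lane form, as written in the original function. */
float4 simple_select(float4 a, float4 b, int4 c)
{
    float4 result;

    result.x = c.x ? a.x : b.x;
    result.y = c.y ? a.y : b.y;
    result.z = c.z ? a.z : b.z;
    result.w = c.w ? a.w : b.w;

    return result;
}

/* Two-step form: one compare producing a lane mask, then one select. */
float4 mask_select(float4 a, float4 b, int4 c)
{
    /* A lane is true when its condition is nonzero. */
    int m[4] = { c.x != 0, c.y != 0, c.z != 0, c.w != 0 };
    float4 result;

    result.x = m[0] ? a.x : b.x;
    result.y = m[1] ? a.y : b.y;
    result.z = m[2] ? a.z : b.z;
    result.w = m[3] ? a.w : b.w;

    return result;
}

/* Try all 16 zero/nonzero lane patterns; returns the number checked. */
int check_equivalence(void)
{
    float4 a = { 1.0f, 2.0f, 3.0f, 4.0f };
    float4 b = { 5.0f, 6.0f, 7.0f, 8.0f };
    int checked = 0;

    for (int bits = 0; bits < 16; ++bits) {
        /* -1 is used as the nonzero value: any nonzero lane counts as true. */
        int4 c = { (bits & 1) ? -1 : 0, (bits & 2) ? -1 : 0,
                   (bits & 4) ? -1 : 0, (bits & 8) ? -1 : 0 };
        float4 s = simple_select(a, b, c);
        float4 v = mask_select(a, b, c);

        assert(s.x == v.x && s.y == v.y && s.z == v.z && s.w == v.w);
        ++checked;
    }

    return checked;
}
```

This only checks that the two source-level forms agree on their results; it says nothing about which pass should perform the transformation.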

Where would be the best place to fix this? Should InstCombine take care of this, or the vectorizer?


Thanks


