[LLVMdev] Canonicalizing vector masking.

Nadav Rotem nrotem at apple.com
Wed Oct 8 10:45:44 PDT 2014


I think that the pattern below should be canonicalized into a vector 'select' instruction with a constant mask. I think that we already have code for canonicalizing select-like shuffles into selects.
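
For illustration, a minimal sketch of that select form for the first row, assuming the same <4 x float> %input as in the quoted IR below. The function wrapper and the name @row0_select are hypothetical, added only to make the snippet self-contained:

```llvm
; Hypothetical wrapper so the snippet stands on its own.
define <4 x float> @row0_select(<4 x float> %input) {
  ; Lane 0 is taken from %input; lanes 1-3 are taken from the zero vector.
  %row0 = select <4 x i1> <i1 true, i1 false, i1 false, i1 false>,
                 <4 x float> %input, <4 x float> zeroinitializer
  ret <4 x float> %row0
}
```

The other three rows would differ only in which lane of the constant mask is true.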

> On Oct 6, 2014, at 12:36 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
> On 26 September 2014 19:22, Sean Silva <chisophugis at gmail.com> wrote:
>> Hi, I received an internal test case from a game team (it wasn't about this
>> in particular), and I was wondering if there was maybe an opportunity to
>> canonicalize a particular code pattern:
>> 
>>  %inputi = bitcast <4 x float> %input to <4 x i32>
>> 
>>  %row0i = and <4 x i32> %inputi, <i32 -1, i32 0, i32 0, i32 0>
>>  %row0 = bitcast <4 x i32> %row0i to <4 x float>
>> 
>>  %row1i = and <4 x i32> %inputi, <i32 0, i32 -1, i32 0, i32 0>
>>  %row1 = bitcast <4 x i32> %row1i to <4 x float>
>> 
>>  %row2i = and <4 x i32> %inputi, <i32 0, i32 0, i32 -1, i32 0>
>>  %row2 = bitcast <4 x i32> %row2i to <4 x float>
>> 
>>  %row3i = and <4 x i32> %inputi, <i32 0, i32 0, i32 0, i32 -1>
>>  %row3 = bitcast <4 x i32> %row3i to <4 x float>
>> 
>> This arises from code which expands a vector of scale factors into the
>> diagonal of a 4x4 diagonal matrix. This code pattern is coming from
>> intrinsics which are explicitly doing the masking like this.
>> 
>> My question is: should we canonicalize this to:
>> 
>>  %row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
>>                        <4 x i32> <i32 0, i32 4, i32 4, i32 4>
>>  %row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
>>                        <4 x i32> <i32 4, i32 1, i32 4, i32 4>
>>  %row2 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
>>                        <4 x i32> <i32 4, i32 4, i32 2, i32 4>
>>  %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
>>                        <4 x i32> <i32 4, i32 4, i32 4, i32 3>
>> 

I think that there is a bug in the shuffle pattern. It should be <i32 4, i32 5, i32 6, i32 3>. 
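
To make the difference concrete, here is a sketch that reads the corrected mask as the %row3 case (an assumption; the message does not name the row). As far as the result value goes, both masks appear to give the same vector here, since every lane of the second operand is zero. What changes is that the corrected mask keeps each lane in its own position, which is the shape of a select-like shuffle. The function names are hypothetical:

```llvm
; Mask as posted: lanes 0-2 all read element 0 of the zero vector (index 4).
define <4 x float> @row3_as_posted(<4 x float> %input) {
  %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
                        <4 x i32> <i32 4, i32 4, i32 4, i32 3>
  ret <4 x float> %row3
}

; Lane-preserving mask: lane i reads either %input[i] (index i)
; or element i of the zero vector (index 4 + i).
define <4 x float> @row3_lane_preserving(<4 x float> %input) {
  %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer,
                        <4 x i32> <i32 4, i32 5, i32 6, i32 3>
  ret <4 x float> %row3
}
```

Under the same reading, the remaining rows would use <i32 0, i32 5, i32 6, i32 7>, <i32 4, i32 1, i32 6, i32 7> and <i32 4, i32 5, i32 2, i32 7>.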

>> which seems to better express the intent, or a sequence of insertelement and
>> extractelement (which is what we get for the attached code), or leave it as
>> is? (or any better ideas?)
>> 
>> Forgive my naivete if there's something obvious I'm missing since I haven't
>> done much w.r.t. vectors in LLVM.
> 
> shufflevector does look more canonical. In the past I think we avoided
> creating shufflevector for fear of producing bad code in CodeGen, but
> I think Chandler just fixed that :-)

Excellent!

> 
> Cheers,
> Rafael