[LLVMdev] Canonicalizing vector masking.

Sean Silva chisophugis at gmail.com
Thu Oct 9 17:45:17 PDT 2014


On Wed, Oct 8, 2014 at 10:45 AM, Nadav Rotem <nrotem at apple.com> wrote:

> I think that the pattern below should be canonicalized into a vector
> 'select' instruction with a constant mask.  I think that we already have
> code for canonicalizing select-like shuffles into selects.
>
> On Oct 6, 2014, at 12:36 PM, Rafael Espíndola <rafael.espindola at gmail.com>
> wrote:
>
> On 26 September 2014 19:22, Sean Silva <chisophugis at gmail.com> wrote:
>
> Hi, I received an internal test case from a game team (it wasn't about this
> in particular), and I was wondering if there was maybe an opportunity to
> canonicalize a particular code pattern:
>
>  %inputi = bitcast <4 x float> %input to <4 x i32>
>
>  %row0i = and <4 x i32> %inputi, <i32 -1, i32 0, i32 0, i32 0>
>  %row0 = bitcast <4 x i32> %row0i to <4 x float>
>
>  %row1i = and <4 x i32> %inputi, <i32 0, i32 -1, i32 0, i32 0>
>  %row1 = bitcast <4 x i32> %row1i to <4 x float>
>
>  %row2i = and <4 x i32> %inputi, <i32 0, i32 0, i32 -1, i32 0>
>  %row2 = bitcast <4 x i32> %row2i to <4 x float>
>
>  %row3i = and <4 x i32> %inputi, <i32 0, i32 0, i32 0, i32 -1>
>  %row3 = bitcast <4 x i32> %row3i to <4 x float>
>
> This arises from code which expands a vector of scale factors into the
> diagonal of a 4x4 diagonal matrix. This code pattern is coming from
> intrinsics which are explicitly doing the masking like this.
>
> My question is: should we canonicalize this to:
>
>  %row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 4>
>  %row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 1, i32 4, i32 4>
>  %row2 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 2, i32 4>
>  %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 4, i32 3>
>
>
> I think that there is a bug in the shuffle pattern. It should be <i32 4,
> i32 5, i32 6, i32 3>.
>

Aren't 4, 5, and 6 all just elements of the zeroinitializer? I used
4, 4, 4, which should be semantically the same.

FWIW, it is trying to take

input = {x,y,z,w}

and output

row0 = {x,0,0,0}
row1 = {0,y,0,0}
row2 = {0,0,z,0}
row3 = {0,0,0,w}
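
To make the index argument concrete, here is a small Python model of the two
forms. It is only a sketch, not LLVM code: the helper names (shufflevector,
and_mask, ZERO) are made up for illustration, and it assumes the documented
shufflevector rule that mask indices 0..3 pick lanes of the first operand and
4..7 pick lanes of the second.

```python
import struct

ZERO = [0.0, 0.0, 0.0, 0.0]  # stands in for <4 x float> zeroinitializer


def shufflevector(a, b, mask):
    """Model of shufflevector: index i selects lane i of the concatenation a ++ b."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def and_mask(v, mask):
    """Model of bitcast to <4 x i32>, 'and' with an i32 mask, bitcast back."""
    out = []
    for x, m in zip(v, mask):
        bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float -> i32 bits
        bits &= m & 0xFFFFFFFF                               # i32 -1 is all ones
        out.append(struct.unpack("<f", struct.pack("<I", bits))[0])
    return out


def diagonal_rows(v):
    """Row r keeps lane r of v; every other lane reads index 4 (a zero lane)."""
    n = len(v)
    return [shufflevector(v, ZERO, [i if i == r else n for i in range(n)])
            for r in range(n)]
```

With input = [1.0, 2.0, 3.0, 4.0], the masks [4, 5, 6, 3] and [4, 4, 4, 3]
give the same row in this model, since lanes 4 through 7 are all zero, and
both match and_mask(input, [0, 0, 0, -1]).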

-- Sean Silva


>
> which seems to better express the intent, or a sequence of insertelement and
> extractelement (which is what we get for the attached code), or leave it as
> is? (or any better ideas?)
>
> Forgive my naivete if there's something obvious I'm missing since I haven't
> done much w.r.t. vectors in LLVM.
>
>
> shufflevector does look more canonical. In the past I think we avoided
> creating shufflevector for fear of producing bad code in CodeGen, but
> I think Chandler just fixed that :-)
>
>
> Excellent!
>
>
> Cheers,
> Rafael