[llvm-dev] [LLVMdev] Question on BlendSplat Code - LLVM Commit 72753f87f2b80d66cfd7ca7c7b6c0db6737d4b24

Tue Aug 4 23:31:26 PDT 2015

Hi Tyler,

First, as a procedural note, we always refer to commits by their svn revision number. What is the svn revision number corresponding to 72753f87f2b80d66cfd7ca7c7b6c0db6737d4b24 on the git mirror?

Second, as I read your message, you sound skeptical about the utility of the transformation, even for x86. That seems unjustified, however, because running your test case through llc -mtriple=x86_64 -mcpu=corei7-avx generates vpunpckhbw and vpmovzxbw (which seem to be the two corresponding shuffle instructions).

That having been said, Chandler, could you please explain the general strategy that x86 uses here? We obviously might wish to emulate it in the PowerPC backend.
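
To make the concern concrete, here is a minimal Python sketch of how the blend-splat rewrite can disturb an interleave mask. It is a simplified model and not the SelectionDAG code: the helper names are hypothetical, undef lanes in the splat are ignored, and it assumes that the second shuffle operand is a splat and that lanes are numbered big-endian style, so that <0, N, 1, N+1, ...> is the vmrghb-like pattern.

```python
def interleave_mask(num_elts, high=True):
    """Mask interleaving one half of LHS with the same half of RHS.

    Assumption: big-endian lane numbering, so high=True models a
    vmrghb-style merge and high=False a vmrglb-style merge.
    """
    start = 0 if high else num_elts // 2
    mask = []
    for i in range(start, start + num_elts // 2):
        mask += [i, i + num_elts]  # lane i of LHS, then lane i of RHS
    return mask


def blend_splat(mask, num_elts):
    """Simplified model of the rewrite, assuming RHS is a splat.

    Every result lane that reads from RHS is redirected to read RHS at
    the result lane's own index, which turns the shuffle into a blend.
    Lanes reading LHS (or undef, written as -1) are left unchanged.
    """
    return [i + num_elts if m >= num_elts else m for i, m in enumerate(mask)]


def is_interleave(mask, num_elts):
    """True if the mask is exactly a merge-high or merge-low pattern."""
    return mask in (interleave_mask(num_elts, True),
                    interleave_mask(num_elts, False))


if __name__ == "__main__":
    original = interleave_mask(4)        # 4 lanes keep the output short
    rewritten = blend_splat(original, 4)
    print(original, is_interleave(original, 4))
    print(rewritten, is_interleave(rewritten, 4))
```

With four lanes, the merge-high mask [0, 4, 1, 5] becomes [0, 5, 1, 7] in this model. Both masks select the same values, because every lane of a splat is identical, but the second no longer has the literal merge shape, so a backend that recognizes merges by mask pattern alone would miss it.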

 -Hal

----- Original Message -----
> From: "Tyler Kenney" <tjkenney at us.ibm.com>
> To: chandlerc at gmail.com
> Cc: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>, llvmdev at cs.uiuc.edu
> Sent: Thursday, July 30, 2015 9:29:07 AM
> Subject: [LLVMdev] Question on BlendSplat Code - LLVM Commit 72753f87f2b80d66cfd7ca7c7b6c0db6737d4b24
> 
> Hey Chandler,
> 
> 
> I'm working on a modification of the Power LLVM backend and I have
> some questions about the 'BlendSplat' code in
> SelectionDAG::getVectorShuffle(). Basically, I'm wondering if you
> can give a little more detail about the goal of this function? It
> seems like your code is increasing the chances of the mask matching
> the subsequent checks for an identity shuffle or all LHS/RHS, which
> is clearly beneficial. Are you also claiming the altered mask is
> easier to match, even if it's not caught by those special cases?
> 
> 
> I attached a tarball with .cl & .ll source for one case where the
> altered mask seems much more difficult to match; the shufflevector
> instruction in the IR is a fairly straightforward interleave of two
> variables, but your blend code eliminates this pattern when building
> the DAG. As I said, I'm targeting Power here, so I want the
> shufflevector instructions to match vmrghb & vmrglb. I'm assuming
> x86 has similar instructions? Is the altered mask in the .ps file
> really easier to match on x86? I attached the power assembly
> generated for this function with & without the blendsplat code and I
> think it's clear that, at least in the case of Power, the altered
> mask is not preferable. Agreed? I'd like to understand the intent of
> your code better so I can either (a) figure out how to properly
> avoid modification of the mask in this case or (b) invert this
> modification in the power backend so we can match this to vmrg*
> instructions and avoid the use of vperm.
> 
> 
> Thanks,
> Tyler
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory