[llvm] r243519 - [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask

Wed Jul 29 08:59:10 PDT 2015

Merged in r243528.

Thanks,
Hans

On Wed, Jul 29, 2015 at 7:39 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Hans,
>
> Please pull this into the release branch.
>
>  -Hal
>
> ----- Original Message -----
>> From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
>> To: llvm-commits at cs.uiuc.edu
>> Sent: Wednesday, July 29, 2015 9:31:58 AM
>> Subject: [llvm] r243519 - [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
>>
>> Author: wschmidt
>> Date: Wed Jul 29 09:31:57 2015
>> New Revision: 243519
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=243519&view=rev
>> Log:
>> [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
>>
>> Given certain shuffle-vector masks, LLVM emits splat instructions
>> which splat the wrong bytes from the source register.  The issue is
>> that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
>> does not ensure that the splat pattern found is requesting bytes that
>> are aligned on an EltSize boundary.  This patch detects this situation
>> as not a valid splat mask, resulting in a permute being generated
>> instead of a splat.
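
For readers without the source at hand, the check described above can be sketched outside of LLVM roughly as follows. This is a simplified model under stated assumptions, not the code in PPCISelLowering.cpp: the function name isAlignedSplatMask is hypothetical, the mask is a plain array of 16 byte indices rather than a ShuffleVectorSDNode, and undef mask entries are not modeled.

```cpp
#include <array>
#include <cassert>

// Hypothetical standalone model of the splat-mask test (assumption: this is
// an illustration only, not the LLVM function). Each mask entry is a byte
// index into the 16-byte source vector.
bool isAlignedSplatMask(const std::array<int, 16> &Mask, unsigned EltSize) {
  assert(EltSize == 1 || EltSize == 2 || EltSize == 4);

  int Base = Mask[0];
  // Reject negative indices and indices that refer to the second vector.
  if (Base < 0 || Base >= 16)
    return false;

  // The alignment requirement: the first byte must start an element,
  // otherwise the "element" straddles two real elements.
  if (static_cast<unsigned>(Base) % EltSize != 0)
    return false;

  // The first EltSize bytes must be consecutive.
  for (unsigned i = 1; i != EltSize; ++i)
    if (Mask[i] != Base + static_cast<int>(i))
      return false;

  // Every later group of EltSize bytes must repeat the first group.
  for (unsigned i = EltSize; i != 16; ++i)
    if (Mask[i] != Mask[i % EltSize])
      return false;

  return true;
}
```

With the mask from the test case (bytes 2, 3, 4, 5 repeated four times) and EltSize == 4, the first index is 2, which is not a multiple of 4, so the sketch rejects the mask. A mask repeating bytes 4, 5, 6, 7 starts on a word boundary and is accepted.
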
>>
>> Patch and test case by Tyler Kenney, cleaned up a bit by me.
>>
>> This is a simple bug fix that would be good to incorporate into 3.7.
>>
>> Added:
>>     llvm/trunk/test/CodeGen/PowerPC/pr24216.ll
>> Modified:
>>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>
>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=243519&r1=243518&r2=243519&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Wed Jul 29 09:31:57 2015
>> @@ -1430,6 +1430,11 @@ bool PPC::isSplatShuffleMask(ShuffleVect
>>    assert(N->getValueType(0) == MVT::v16i8 &&
>>           (EltSize == 1 || EltSize == 2 || EltSize == 4));
>>
>> +  // The consecutive indices need to specify an element, not part of two
>> +  // different elements.  So abandon ship early if this isn't the case.
>> +  if (N->getMaskElt(0) % EltSize != 0)
>> +    return false;
>> +
>>    // This is a splat operation if each element of the permute is the same, and
>>    // if the value doesn't reference the second vector.
>>    unsigned ElementBase = N->getMaskElt(0);
>>
>> Added: llvm/trunk/test/CodeGen/PowerPC/pr24216.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/pr24216.ll?rev=243519&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/PowerPC/pr24216.ll (added)
>> +++ llvm/trunk/test/CodeGen/PowerPC/pr24216.ll Wed Jul 29 09:31:57 2015
>> @@ -0,0 +1,14 @@
>> +; RUN: llc -mcpu=pwr8 -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
>> +
>> +; Test case adapted from PR24216.
>> +
>> +define void @foo(<16 x i8>* nocapture readonly %in, <16 x i8>* nocapture %out) {
>> +entry:
>> +  %0 = load <16 x i8>, <16 x i8>* %in, align 16
>> +  %1 = shufflevector <16 x i8> %0, <16 x i8> undef, <16 x i32> <i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5>
>> +  store <16 x i8> %1, <16 x i8>* %out, align 16
>> +  ret void
>> +}
>> +
>> +; CHECK: vperm
>> +; CHECK-NOT: vspltw
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory