[llvm] r243519 - [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
Hans Wennborg
hans at chromium.org
Wed Jul 29 08:59:10 PDT 2015
Merged in r243528.
Thanks,
Hans
On Wed, Jul 29, 2015 at 7:39 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Hans,
>
> Please pull this into the release branch.
>
> -Hal
>
> ----- Original Message -----
>> From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
>> To: llvm-commits at cs.uiuc.edu
>> Sent: Wednesday, July 29, 2015 9:31:58 AM
>> Subject: [llvm] r243519 - [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
>>
>> Author: wschmidt
>> Date: Wed Jul 29 09:31:57 2015
>> New Revision: 243519
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=243519&view=rev
>> Log:
>> [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
>>
>> Given certain shuffle-vector masks, LLVM emits splat instructions
>> which splat the wrong bytes from the source register. The issue is
>> that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
>> does not ensure that the splat pattern found is requesting bytes that
>> are aligned on an EltSize boundary. This patch detects this situation
>> as not a valid splat mask, resulting in a permute being generated
>> instead of a splat.
>>
>> Patch and test case by Tyler Kenney, cleaned up a bit by me.
>>
>> This is a simple bug fix that would be good to incorporate into 3.7.
>>
>> Added:
>> llvm/trunk/test/CodeGen/PowerPC/pr24216.ll
>> Modified:
>> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>
>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=243519&r1=243518&r2=243519&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Wed Jul 29 09:31:57 2015
>> @@ -1430,6 +1430,11 @@ bool PPC::isSplatShuffleMask(ShuffleVect
>> assert(N->getValueType(0) == MVT::v16i8 &&
>> (EltSize == 1 || EltSize == 2 || EltSize == 4));
>>
>> + // The consecutive indices need to specify an element, not part of two
>> + // different elements. So abandon ship early if this isn't the case.
>> + if (N->getMaskElt(0) % EltSize != 0)
>> + return false;
>> +
>> // This is a splat operation if each element of the permute is the same, and
>> // if the value doesn't reference the second vector.
>> unsigned ElementBase = N->getMaskElt(0);
>>
>> Added: llvm/trunk/test/CodeGen/PowerPC/pr24216.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/pr24216.ll?rev=243519&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/PowerPC/pr24216.ll (added)
>> +++ llvm/trunk/test/CodeGen/PowerPC/pr24216.ll Wed Jul 29 09:31:57 2015
>> @@ -0,0 +1,14 @@
>> +; RUN: llc -mcpu=pwr8 -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
>> +
>> +; Test case adapted from PR24216.
>> +
>> +define void @foo(<16 x i8>* nocapture readonly %in, <16 x i8>* nocapture %out) {
>> +entry:
>> + %0 = load <16 x i8>, <16 x i8>* %in, align 16
>> + %1 = shufflevector <16 x i8> %0, <16 x i8> undef, <16 x i32> <i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5, i32 2, i32 3, i32 4, i32 5>
>> + store <16 x i8> %1, <16 x i8>* %out, align 16
>> + ret void
>> +}
>> +
>> +; CHECK: vperm
>> +; CHECK-NOT: vspltw
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
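
For anyone reading the quoted patch without the surrounding source, here is a
rough standalone sketch of the idea behind the new check. It is not the code in
PPCISelLowering.cpp: the function name isAlignedSplatMask and the use of a plain
std::array for the mask are assumptions made for illustration, and undef mask
elements are not modeled. It only shows why a mask starting at byte 2 with
EltSize 4 must not be treated as a word splat.

```cpp
#include <array>
#include <cassert>

// Hypothetical standalone model of the splat check; NOT the LLVM implementation.
// Mask holds 16 byte indices (0..15) into the first source vector.
// EltSize is the splat element width in bytes: 1, 2, or 4.
bool isAlignedSplatMask(const std::array<int, 16> &Mask, unsigned EltSize) {
  assert(EltSize == 1 || EltSize == 2 || EltSize == 4);
  const int Size = static_cast<int>(EltSize);

  // The idea added by r243519: the first index must start an element,
  // otherwise the group of bytes straddles two different elements.
  if (Mask[0] < 0 || Mask[0] % Size != 0)
    return false;

  // The first EltSize indices must be consecutive bytes of that element.
  for (int i = 1; i < Size; ++i)
    if (Mask[i] != Mask[0] + i)
      return false;

  // Every later group must repeat the first group exactly.
  for (int i = Size; i < 16; ++i)
    if (Mask[i] != Mask[i % Size])
      return false;

  // The element must lie entirely within the first vector.
  return Mask[0] + Size <= 16;
}
```

With the mask from pr24216.ll (2, 3, 4, 5 repeated four times) and EltSize 4,
this sketch returns false because 2 % 4 != 0, which corresponds to the test
expecting vperm rather than vspltw. The same pattern starting at byte 4 is
accepted.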