[llvm] r229311 - [X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2

Chandler Carruth chandlerc at google.com
Tue Feb 17 23:17:32 PST 2015


On Tue, Feb 17, 2015 at 7:55 PM, Chandler Carruth <chandlerc at google.com>
wrote:

> On Sun, Feb 15, 2015 at 5:19 AM, Simon Pilgrim <llvm-dev at redking.me.uk>
> wrote:
>
>> Author: rksimon
>> Date: Sun Feb 15 07:19:52 2015
>> New Revision: 229311
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=229311&view=rev
>> Log:
>> [X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2
>>
>> This patch refactors the existing lowerVectorShuffleAsByteShift function
>> to add support for 256-bit vectors on AVX2 targets.
>>
>
> Unfortunately, this patch introduces a nasty miscompile. See below.
>
>
>>
>> It also fixes a tablegen issue that prevented the lowering of
>> vpslldq/vpsrldq vec256 instructions.
>>
>
> And because of this, it's a bit hard to disentangle. It would be better to
> keep these separate in the future (and yea, i know, i should have said this
> in the review...)
>
> +  int NumElts = VT.getVectorNumElements();
>> +  int NumLanes = VT.getSizeInBits() / 128;
>> +  int NumLaneElts = NumElts / NumLanes;
>> +  int Scale = 16 / NumLaneElts;
>> +  MVT ShiftVT = MVT::getVectorVT(MVT::i64, 2 * NumLanes);
>> +
>> +  // PSLLDQ : (little-endian) left byte shift
>> +  // [ zz,  0,  1,  2,  3,  4,  5,  6]
>> +  // [ zz, zz, -1, -1,  2,  3,  4, -1]
>> +  // [ zz, zz, zz, zz, zz, zz, -1,  1]
>> +  // PSRLDQ : (little-endian) right byte shift
>> +  // [  5, 6,  7, zz, zz, zz, zz, zz]
>> +  // [ -1, 5,  6,  7, zz, zz, zz, zz]
>> +  // [  1, 2, -1, -1, -1, -1, zz, zz]
>> +  auto MatchByteShift = [&](int Shift) -> SDValue {
>> +    bool MatchLeft = true, MatchRight = true;
>> +    for (int l = 0; l < NumElts; l += NumLaneElts) {
>> +      for (int i = 0; i < Shift; ++i)
>> +        MatchLeft &= Zeroable[l + i];
>> +      for (int i = NumLaneElts - Shift; i < NumLaneElts; ++i)
>> +        MatchRight &= Zeroable[l + i];
>>      }
>> +    if (!(MatchLeft || MatchRight))
>> +      return SDValue();
>>
>
>
>> +    bool MatchV1 = true, MatchV2 = true;
>> +    for (int l = 0; l < NumElts; l += NumLaneElts) {
>> +      unsigned Pos = MatchLeft ? Shift + l : l;
>> +      unsigned Low = MatchLeft ? l : Shift + l;
>>
>
> What happens if both MatchLeft and MatchRight are true? You don't assert
> that this can't happen, and in fact, it can. We only test one of them here.
>
>
>> +      unsigned Len = NumLaneElts - Shift;
>> +      MatchV1 &= isSequentialOrUndefInRange(Mask, Pos, Len, Low);
>> +      MatchV2 &= isSequentialOrUndefInRange(Mask, Pos, Len, Low +
>> NumElts);
>>      }
>> -  }
>> +    if (!(MatchV1 || MatchV2))
>> +      return SDValue();
>> +
>> +    int ByteShift = Shift * Scale;
>> +    unsigned Op = MatchRight ? X86ISD::VSRLDQ : X86ISD::VSHLDQ;
>>
>
> And then here, we use MatchRight. Boom.
>
> Here is a test case:
>
> define internal fastcc <8 x i16> @test(<8 x i16> %s.0.0, <8 x i16> %s.0.1)
> noinline nounwind {
> entry:
>   %s.1.0 = shufflevector <8 x i16> %s.0.0, <8 x i16> %s.0.1, <8 x i32>
> <i32 15, i32 undef, i32 3, i32 undef, i32 12, i32 12, i32 5, i32 undef>
>
>   ret <8 x i16> %s.1.0
> }
>
> This gets miscompiled at top-of-tree.
>
> I tried reverting this patch, and it's a mess. Doesn't revert cleanly. =[
> When I manually tried to fix the revert, the above test case got
> miscompiled *differently* by shifting every byte out. =[
>
> I'm probably just going to rip this function out until you or I have time
> to look at re-adding it and ensuring it doesn't have any holes. I don't see
> any other clean way of "reverting to green" here because the patches don't
> really back out cleanly.
>

Ripping this out was super ugly -- lots of stuff relies on this now.

I took a stab at rewriting it in r229642. This passes my test case and has
so far survived my fuzz testing. I'm not finding any other miscompiles
either, so things are looking really good now.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150217/0765e845/attachment.html>


More information about the llvm-commits mailing list