[PATCH] D13364: [x86] PR24562: fix incorrect folding of X86ISD::PSHUFB nodes that have a mask of all indices with the most significant bit set.

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 4 06:38:54 PDT 2015


andreadb added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:21966
@@ -21966,3 +21965,3 @@
   if (Mask.size() == 1) {
-    DCI.CombineTo(Root.getNode(), DAG.getBitcast(RootVT, Input),
-                  /*AddTo*/ true);
+    // We may end up with an accumulated mask of size 1 as a result of
+    // widening of shuffle operands (see function canWidenShuffleElements).
----------------
RKSimon wrote:
> Can we assert here that the single Mask index is either SentinelZero or a valid element index? Undef should never get here correct?
Hi Simon,
We can't assert that a Mask index is either SentinelZero or a valid element index.

Consider the following example:

```
define <2 x i64> @TestUndef() {
entry:
  %0 = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <16 x i8> <i8 0, i8 1, i8 undef, i8 undef, i8 4, i8 5, i8 undef, i8 undef, i8 8, i8 9, i8 undef, i8 undef, i8 2, i8 3, i8 undef, i8 undef>)
  %1 = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %0, <16 x i8> <i8 undef, i8 undef, i8 2, i8 3, i8 undef, i8 undef, i8 6, i8 7, i8 undef, i8 undef, i8 2, i8 3, i8 undef, i8 undef, i8 14, i8 15>)
  %2 = bitcast <16 x i8> %1 to <2 x i64>
  ret <2 x i64> %2
}

declare <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)

```

Function PerformShuffleCombine is able to recursively look through chains of shuffles.
In this example, it would firstly decode the PSHUFB mask for %1 (the second pshufb in the sequence) using the functionalities in X86ShuffleDecode.cpp. For each decoded index in the mask, it would check if the index points to an element of another shuffle. If so, then it would look through that shuffle and propagate the "combined" shuffle mask.

In our example, mask Index 2 of %1 points to the element at position 2 of %0.
PerformShuffleCombine would decode %0' shuffle mask and eventually realize that element 2 of %0 is 'Undef'. As a result, it would replace the 2nd shuffle mask index of %1 with SM_SentinelUndef.

In this example, all the shuffle indices of %1 are eventually replaced with SM_SentinelUndef. Later on, function 'canWidenShuffleElements' shrinks the shuffle mask to a mask with a single SM_SentinelUndef. That mask would be invalid according to the assertion you suggests.

So, to answer to your question, Undef can reach this code.


http://reviews.llvm.org/D13364





More information about the llvm-commits mailing list