[PATCH] D14588: [X86][SSE] Transform truncation from v8i32/v16i32 to v8i8/v16i8 into bitand and X86ISD::PACKUS operations during DAG combine.

Cong Hou via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 30 12:23:46 PST 2015


congh added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26229
@@ +26228,3 @@
+  // On AVX2, the behavior of X86ISD::PACKUS is different from that on SSE2 and
+  // we could not benefit from this method.
+  // AVX512 provides vpmovdb.
----------------
RKSimon wrote:
> I understand that packus(ymm) won't do what we want but - won't AVX2 still benefit for cases where packus(xmm) is used? Why not just early out if (!hasSSE2 || VT.getSizeInBits() > 128) ?
I have tested the instructions generated on AVX2 by using your suggested method and found we can save one instruction for v16i32 to v16i8 truncation. However, I found in this case we could still use ymm to get better results: what we need to do is doing packus on ymm but later shuffle the result. This can save us more than 40% instructions . I think this can be a follow-up patch.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26230
@@ +26229,3 @@
+  // we could not benefit from this method.
+  // AVX512 provides vpmovdb.
+  if (!Subtarget->hasSSE2() || Subtarget->hasAVX2())
----------------
RKSimon wrote:
> Please can you add AVX512 as a test target to prove that its using vpmovdb etc.
OK. I have updated the test file.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26269
@@ +26268,3 @@
+      SubVec[i / 2] = DAG.getNode(ISD::BITCAST, DL, MVT::v4i32, SubVec[i / 2]);
+    }
+    SubVec.resize(RegNum / 2);
----------------
RKSimon wrote:
> Please don't use domain switches, they can cause massive stalls on pipes. Why not just use DAG.getVectorShuffle()?
Thanks for pointing out this problem! Unfortunately, there is no similar instruction as shufps for integers. Therefore, I could not find any better solution for vXi64 to vXi16 truncation and hence have removed this case from this patch.


http://reviews.llvm.org/D14588





More information about the llvm-commits mailing list