[PATCH] D76138: [X86][SSE] Prefer trunc(movd(x)) to pextrb(x,0)

Fri Mar 13 09:07:03 PDT 2020

RKSimon created this revision.
RKSimon added reviewers: craig.topper, spatel, lebedev.ri.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.

If we're extracting the 0'th index of a v16i8 vector we're better off using MOVD than PEXTRB, unless we're storing the value or we require the implicit zero extension of PEXTRB.

The biggest perf diff is on SLM targets where MOVD (uops=1, lat=3 tp=1) is notably faster than PEXTRB (uops=2, lat=5, uops=4).

This matches what we already do for PEXTRW.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D76138

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
===================================================================

--- llvm/lib/Target/X86/X86ISelLowering.cpp
+++ llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -17830,6 +17830,14 @@
     return SDValue();
 
   if (VT.getSizeInBits() == 8) {
+    // If IdxVal is 0, it's cheaper to do a move instead of a pextrb, unless
+    // we're going to zero extend the register or fold the store.
+    if (llvm::isNullConstant(Idx) && !MayFoldIntoZeroExtend(Op) &&
+        !MayFoldIntoStore(Op))
+      return DAG.getNode(ISD::TRUNCATE, dl, MVT::i8,
+                         DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i32,
+                                     DAG.getBitcast(MVT::v4i32, Vec), Idx));
+
     SDValue Extract = DAG.getNode(X86ISD::PEXTRB, dl, MVT::i32, Vec, Idx);
     return DAG.getNode(ISD::TRUNCATE, dl, VT, Extract);
   }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D76138.250225.patch
Type: text/x-patch
Size: 881 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200313/f0ab3720/attachment.bin>