[PATCH] D156350: [X86] Allow pre-SSE41 targets to extract multiple v16i8 elements coming from the same DWORD/WORD super-element

Phoebe Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 31 05:18:24 PDT 2023


pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.



================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:20713
     int WordIdx = IdxVal / 2;
-    SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
-                              DAG.getBitcast(MVT::v8i16, Vec),
-                              DAG.getIntPtrConstant(WordIdx, dl));
-    int ShiftVal = (IdxVal % 2) * 8;
-    if (ShiftVal != 0)
-      Res = DAG.getNode(ISD::SRL, dl, MVT::i16, Res,
-                        DAG.getConstant(ShiftVal, dl, MVT::i8));
-    return DAG.getNode(ISD::TRUNCATE, dl, VT, Res);
+    if (DemandedElts == (DemandedElts & (3 << (WordIdx * 2)))) {
+      SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
----------------
RKSimon wrote:
> pengfei wrote:
> > It's not clear to me here, the old code should have more chance to generate SRL than the new code due to the restriction. Which one it better? I didn't find a case to reflect the difference.
> I'm not sure I understand? The original code was always limited to a single extract, and for odd-indices (greater than 3) PEXTRW+SRL would be used. The change just means that we allow a pair of extracts (odd + even) that share a PEXTRW - so we still allow the SRL case. 
I see the point, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156350/new/

https://reviews.llvm.org/D156350



More information about the llvm-commits mailing list