[PATCH] D156350: [X86] Allow pre-SSE41 targets to extract multiple v16i8 elements coming from the same DWORD/WORD super-element
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 31 05:18:24 PDT 2023
pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.
LGTM.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:20713
int WordIdx = IdxVal / 2;
- SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
- DAG.getBitcast(MVT::v8i16, Vec),
- DAG.getIntPtrConstant(WordIdx, dl));
- int ShiftVal = (IdxVal % 2) * 8;
- if (ShiftVal != 0)
- Res = DAG.getNode(ISD::SRL, dl, MVT::i16, Res,
- DAG.getConstant(ShiftVal, dl, MVT::i8));
- return DAG.getNode(ISD::TRUNCATE, dl, VT, Res);
+ if (DemandedElts == (DemandedElts & (3 << (WordIdx * 2)))) {
+ SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
----------------
RKSimon wrote:
> pengfei wrote:
> > It's not clear to me here, the old code should have more chance to generate SRL than the new code due to the restriction. Which one it better? I didn't find a case to reflect the difference.
> I'm not sure I understand? The original code was always limited to a single extract, and for odd-indices (greater than 3) PEXTRW+SRL would be used. The change just means that we allow a pair of extracts (odd + even) that share a PEXTRW - so we still allow the SRL case.
I see the point, thanks!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156350/new/
https://reviews.llvm.org/D156350
More information about the llvm-commits
mailing list