[PATCH] D156350: [X86] Allow pre-SSE41 targets to extract multiple v16i8 elements coming from the same DWORD/WORD super-element
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 31 02:41:26 PDT 2023
pengfei added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:20695
+ // TODO: Add QWORD MOVQ extraction?
+ if (VT.getSizeInBits() == 8) {
+ APInt DemandedElts = getExtractedDemandedElts(Vec.getNode());
----------------
Why do we use `getSizeInBits` here rather than checking for `i8`?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:20713
int WordIdx = IdxVal / 2;
- SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
- DAG.getBitcast(MVT::v8i16, Vec),
- DAG.getIntPtrConstant(WordIdx, dl));
- int ShiftVal = (IdxVal % 2) * 8;
- if (ShiftVal != 0)
- Res = DAG.getNode(ISD::SRL, dl, MVT::i16, Res,
- DAG.getConstant(ShiftVal, dl, MVT::i8));
- return DAG.getNode(ISD::TRUNCATE, dl, VT, Res);
+ if (DemandedElts == (DemandedElts & (3 << (WordIdx * 2)))) {
+ SDValue Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i16,
----------------
The trade-off here is not clear to me. Because of the new restriction, the old code should have had more chances to generate an SRL than the new code does. Which one is better? I couldn't find a test case that shows the difference.
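To make the question concrete, here is a minimal scalar sketch of what the two quoted snippets compute. It is only a model, not the patch itself: `extractByteViaWord` and `demandedFitsInWord` are hypothetical names, a plain `uint32_t` stands in for the `APInt` returned by `getExtractedDemandedElts`, and the word read assumes the usual little-endian x86 byte order within each 16-bit lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of the word-based byte extraction in the removed lines:
// read the 16-bit word that contains the byte, shift right by 8 for an odd
// byte index, then truncate to 8 bits.
inline uint8_t extractByteViaWord(const std::array<uint8_t, 16> &Vec,
                                  int IdxVal) {
  assert(IdxVal >= 0 && IdxVal < 16 && "byte index out of range");
  int WordIdx = IdxVal / 2;
  // Assemble the word in little-endian order (assumption: x86 lane layout).
  uint16_t Word = static_cast<uint16_t>(
      Vec[WordIdx * 2] | (static_cast<uint16_t>(Vec[WordIdx * 2 + 1]) << 8));
  int ShiftVal = (IdxVal % 2) * 8;
  if (ShiftVal != 0)
    Word = static_cast<uint16_t>(Word >> ShiftVal); // models the SRL
  return static_cast<uint8_t>(Word);                // models the TRUNCATE
}

// Hypothetical model of the new guard: true only when every demanded byte
// lies inside the word that contains IdxVal (bits 2*WordIdx and 2*WordIdx+1).
inline bool demandedFitsInWord(uint32_t DemandedElts, int IdxVal) {
  int WordIdx = IdxVal / 2;
  uint32_t WordMask = 3u << (WordIdx * 2);
  return DemandedElts == (DemandedElts & WordMask);
}
```

Under this model, the word path is taken only when no demanded byte falls outside the selected word, which is the restriction the question above refers to.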
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156350/new/
https://reviews.llvm.org/D156350