[llvm] [X86] Generate `vpmuludq` instead of `vpmullq` (PR #121456)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 1 23:38:11 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-x86
Author: None (abhishek-kaushik22)
<details>
<summary>Changes</summary>
When lowering the `_mm512_mul_epu32` intrinsic, if the generated value is later used in a vector shuffle, we generate `vpmullq` instead of `vpmuludq` (https://godbolt.org/z/WbaGMqs8e). This happens because `SimplifyDemandedVectorElts` simplifies the arguments first, so the combine to `PMULDQ` fails.
Added an override of `shouldSimplifyDemandedVectorElts` in `X86TargetLowering` that first checks whether the `MUL` can be combined to `PMULDQ`.
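
For readers unfamiliar with the two instructions, the following is a minimal scalar sketch of a single 64-bit lane. It is not code from this PR, and the helper names (`pmuludq_lane`, `pmullq_lane`, `upper_half_is_zero`) are hypothetical. It illustrates why a 64-bit multiply can be replaced by `vpmuludq` only while both operands are known to have zero upper 32 bits, which is presumably the information the combine can no longer recover once the arguments have been simplified.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 64-bit lane of vpmuludq: multiply the low 32 bits
// of each operand (zero-extended), producing a full 64-bit product.
static uint64_t pmuludq_lane(uint64_t a, uint64_t b) {
  return (a & 0xFFFFFFFFULL) * (b & 0xFFFFFFFFULL);
}

// Scalar model of one 64-bit lane of vpmullq: the low 64 bits of the
// 64x64-bit product (unsigned overflow wraps, as in the instruction).
static uint64_t pmullq_lane(uint64_t a, uint64_t b) {
  return a * b;
}

// Hypothetical helper: true when the upper 32 bits of x are zero, i.e. the
// per-operand condition under which the two lane models above agree.
static bool upper_half_is_zero(uint64_t x) {
  return (x >> 32) == 0;
}
```

When `upper_half_is_zero` holds for both operands, the two models return the same value, so the cheaper `vpmuludq` is a valid replacement; when it does not hold, the results can differ and the full `vpmullq` is required.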
---
Full diff: https://github.com/llvm/llvm-project/pull/121456.diff
2 Files Affected:
- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+21)
- (modified) llvm/lib/Target/X86/X86ISelLowering.h (+3)
``````````diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index a0514e93d6598b..e104264bcbf918 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -60832,3 +60832,24 @@ Align X86TargetLowering::getPrefLoopAlignment(MachineLoop *ML) const {
     return Align(1ULL << ExperimentalPrefInnermostLoopAlignment);
   return TargetLowering::getPrefLoopAlignment();
 }
+
+bool X86TargetLowering::shouldSimplifyDemandedVectorElts(
+    SDValue Op, const TargetLoweringOpt &TLO) const {
+  if (Op.getOpcode() == ISD::VECTOR_SHUFFLE) {
+    SDValue V0 = peekThroughBitcasts(Op.getOperand(0));
+    SDValue V1 = peekThroughBitcasts(Op.getOperand(1));
+
+    if (V0.getOpcode() == ISD::MUL || V1.getOpcode() == ISD::MUL) {
+      SDNode *Mul = V0.getOpcode() == ISD::MUL ? V0.getNode() : V1.getNode();
+      SelectionDAG &DAG = TLO.DAG;
+      const X86Subtarget &Subtarget = DAG.getSubtarget<X86Subtarget>();
+      const SDLoc DL(Mul);
+
+      if (SDValue V = combineMulToPMULDQ(Mul, DL, DAG, Subtarget)) {
+        DAG.ReplaceAllUsesWith(Mul, V.getNode());
+        return false;
+      }
+    }
+  }
+  return true;
+}
diff --git a/llvm/lib/Target/X86/X86ISelLowering.h b/llvm/lib/Target/X86/X86ISelLowering.h
index 2b7a8eaf249d83..0a6cd53f557bb2 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.h
+++ b/llvm/lib/Target/X86/X86ISelLowering.h
@@ -1207,6 +1207,9 @@ namespace llvm {
     bool hasBitTest(SDValue X, SDValue Y) const override;
+    bool shouldSimplifyDemandedVectorElts(
+        SDValue Op, const TargetLoweringOpt &TLO) const override;
+
     bool shouldProduceAndByConstByHoistingConstFromShiftsLHSOfAnd(
         SDValue X, ConstantSDNode *XC, ConstantSDNode *CC, SDValue Y,
         unsigned OldShiftOpcode, unsigned NewShiftOpcode,
``````````
</details>
https://github.com/llvm/llvm-project/pull/121456