[llvm] [X86] Attempt to use VPMADD52L/VPMULUDQ instead of VPMULLQ on slow VPMULLQ targets (or when VPMULLQ is unavailable) (PR #171760)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 10 20:56:54 PST 2025
github-actions[bot] wrote:
<!--LLVM CODE FORMAT COMMENT: {clang-format}-->
:warning: The C/C++ code formatter, clang-format, found issues in your code. :warning:
<details>
<summary>
You can test this locally with the following command:
</summary>
``````````bash
git-clang-format --diff origin/main HEAD --extensions cpp -- llvm/lib/Target/X86/X86ISelLowering.cpp --diff_from_common_commit
``````````
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
</details>
<details>
<summary>
View the diff from clang-format here.
</summary>
``````````diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 346af7d60..2ad5ee6f6 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -49938,15 +49938,15 @@ static SDValue combineMul(SDNode *N, SelectionDAG &DAG,
unsigned Count1 = Known1.countMinLeadingZeros();
// Optimization 1: Use VPMULUDQ (32-bit multiply).
- // If the upper 32 bits are zero, we can use the standard PMULUDQ instruction.
- // This is generally the fastest option and widely supported.
+ // If the upper 32 bits are zero, we can use the standard PMULUDQ
+ // instruction. This is generally the fastest option and widely supported.
if (Count0 >= 32 && Count1 >= 32) {
return DAG.getNode(X86ISD::PMULUDQ, DL, VT, Op0, Op1);
}
// Optimization 2: Use VPMADD52L (52-bit multiply-add).
- // On targets with slow VPMULLQ (e.g., Ice Lake),
- //VPMADD52L is significantly faster (lower latency/better throughput).
+ // On targets with slow VPMULLQ (e.g., Ice Lake),
+ // VPMADD52L is significantly faster (lower latency/better throughput).
if (Subtarget.hasAVX512() && Subtarget.hasIFMA()) {
if (Count0 >= 12 && Count1 >= 12) {
SDValue Zero = getZeroVector(VT.getSimpleVT(), Subtarget, DAG, DL);
``````````
</details>
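For context on the two rewrites named in the diff comments, here is a minimal scalar sketch of what one 64-bit lane of each instruction computes. It is based on the documented instruction semantics, not on the patch itself, and the helper names (`mul64_lane`, `pmuludq_lane`, `vpmadd52luq_lane`) are hypothetical and do not appear in LLVM.

```cpp
#include <cstdint>

// Hypothetical scalar models of a single 64-bit lane; illustration only.

// Reference: low 64 bits of a 64x64 multiply (what VPMULLQ produces per lane).
constexpr uint64_t mul64_lane(uint64_t a, uint64_t b) {
  return a * b; // unsigned arithmetic wraps modulo 2^64
}

// PMULUDQ: multiplies the low 32 bits of each operand, giving a 64-bit product.
constexpr uint64_t pmuludq_lane(uint64_t a, uint64_t b) {
  return (a & UINT64_C(0xFFFFFFFF)) * (b & UINT64_C(0xFFFFFFFF));
}

// VPMADD52LUQ: multiplies the low 52 bits of each operand and adds the low
// 52 bits of the 104-bit product to the 64-bit accumulator.
constexpr uint64_t vpmadd52luq_lane(uint64_t acc, uint64_t a, uint64_t b) {
  constexpr uint64_t Mask52 = (UINT64_C(1) << 52) - 1;
  // The low 52 bits of the full product equal the low 52 bits of the
  // wrapped 64-bit product, so no 128-bit type is needed here.
  return acc + (((a & Mask52) * (b & Mask52)) & Mask52);
}

// Both operands have at least 32 leading zeros: PMULUDQ matches the full multiply.
static_assert(pmuludq_lane(0xFFFFFFFF, 0xFFFFFFFF) ==
              mul64_lane(0xFFFFFFFF, 0xFFFFFFFF));

// Operands fit in 52 bits and so does the product (2^40 * 2^11 = 2^51).
static_assert(vpmadd52luq_lane(0, UINT64_C(1) << 40, UINT64_C(1) << 11) ==
              mul64_lane(UINT64_C(1) << 40, UINT64_C(1) << 11));

// Operands fit in 52 bits but the product does not (2^40 * 2^12 = 2^52):
// the low 52 bits are zero, so the model differs from the full multiply.
static_assert(vpmadd52luq_lane(0, UINT64_C(1) << 40, UINT64_C(1) << 12) !=
              mul64_lane(UINT64_C(1) << 40, UINT64_C(1) << 12));
```

In this model, PMULUDQ agrees with the full 64-bit multiply whenever both operands have at least 32 leading zero bits. The VPMADD52L form with a zero accumulator agrees only when the product itself fits in 52 bits. The `Count0 >= 12 && Count1 >= 12` check visible in the excerpt establishes that the operands fit in 52 bits, and the excerpt is truncated before any further condition would appear.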
https://github.com/llvm/llvm-project/pull/171760