[llvm] [NVPTX] Enhance `mul.wide` and `mad.wide` peepholes (PR #150477)
Adrian Kuegel via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 15 05:12:24 PDT 2025
akuegel wrote:
@justinfargnoli
It looks like this part of the change is causing performance regressions for us:
"Implements (add (mul.wide a, b), c) -> (mad.wide a, b, c) in instruction selection."
I confirmed that removing these patterns recovers the performance:
```
defm MAD_WIDE_U32 : MAD_WIDE<"u32", mul_wide_unsigned_oneuse, I64RT, I32RT>;
defm MAD_WIDE_S32 : MAD_WIDE<"s32", mul_wide_signed_oneuse, I64RT, I32RT>;
```
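For readers unfamiliar with the fold, here is a minimal C++ sketch of the value semantics involved, assuming `mul.wide` yields the full 64-bit product of two 32-bit operands and `mad.wide` adds a 64-bit addend to that product with wrap-around. The helper names (`mul_wide_u32`, `mad_wide_u32`, and so on) are hypothetical and exist only for illustration; they are not LLVM or CUDA APIs. The sketch only shows that both forms compute the same value, so it says nothing about why the selected instruction might be slower.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of PTX mul.wide.u32: 32 x 32 -> 64-bit unsigned product.
constexpr uint64_t mul_wide_u32(uint32_t a, uint32_t b) {
  return static_cast<uint64_t>(a) * static_cast<uint64_t>(b);
}

// Hypothetical model of PTX mad.wide.u32: full product plus a 64-bit addend.
// Unsigned arithmetic wraps modulo 2^64.
constexpr uint64_t mad_wide_u32(uint32_t a, uint32_t b, uint64_t c) {
  return static_cast<uint64_t>(a) * static_cast<uint64_t>(b) + c;
}

// Hypothetical model of PTX mul.wide.s32: 32 x 32 -> 64-bit signed product.
// The product of two int32_t values always fits in int64_t.
constexpr int64_t mul_wide_s32(int32_t a, int32_t b) {
  return static_cast<int64_t>(a) * static_cast<int64_t>(b);
}

// Hypothetical model of PTX mad.wide.s32. The add is done in unsigned
// arithmetic so that wrap-around is well defined in C++ (the conversion back
// to int64_t is modular in C++20 and on common compilers before that).
constexpr int64_t mad_wide_s32(int32_t a, int32_t b, int64_t c) {
  return static_cast<int64_t>(static_cast<uint64_t>(mul_wide_s32(a, b)) +
                              static_cast<uint64_t>(c));
}

// Unfused form: (add (mul.wide a, b), c).
constexpr uint64_t add_of_mul_wide_u32(uint32_t a, uint32_t b, uint64_t c) {
  return mul_wide_u32(a, b) + c;
}

// Spot-check that the fused and unfused forms agree on a few inputs.
inline void check_fold_examples() {
  assert(add_of_mul_wide_u32(7u, 6u, 100u) == mad_wide_u32(7u, 6u, 100u));
  assert(add_of_mul_wide_u32(0xFFFFFFFFu, 0xFFFFFFFFu, 1u) ==
         mad_wide_u32(0xFFFFFFFFu, 0xFFFFFFFFu, 1u));
  assert(mad_wide_s32(-2, 3, 10) == mul_wide_s32(-2, 3) + 10);
}
```

Judging only by the `_oneuse` suffix on the pattern fragments, the fold presumably fires only when the `mul.wide` result has a single use; that reading is an assumption based on the names, not something confirmed here.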
I am attaching before.ptx and after.ptx. Maybe looking at the generated SASS will help figure out why it is slower?
[before.ptx.txt](https://github.com/user-attachments/files/21794739/before.ptx.txt)
[after.ptx.txt](https://github.com/user-attachments/files/21794740/after.ptx.txt)
https://github.com/llvm/llvm-project/pull/150477