[llvm] [AMDGPU] Form V_MAD_U64_U32 from mul24 (PR #72393)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 8 04:00:48 PST 2023
================
@@ -676,6 +676,16 @@ multiclass IMAD32_Pats <VOP3_Pseudo inst> {
(ThreeOpFragSDAG<mul, add> i32:$src0, i32:$src1, (i32 imm:$src2)),
(EXTRACT_SUBREG (inst $src0, $src1, (i64 (as_i64imm $src2)), 0 /* clamp */), sub0)
>;
+
+ // Handle cases where amdgpu-codegenprepare-mul24 made a mul24 instead of a normal mul.
+ def : GCNPat <
+ (i64 (add (i64 (AMDGPUmul_u24 i32:$src0, i32:$src1)), i64:$src2)),
+ (inst $src0, $src1, $src2, 0 /* clamp */)
+ >;
+ def : GCNPat <
+ (i64 (add (i64 (zext (i32 (AMDGPUmul_u24 i32:$src0, i32:$src1)))), i64:$src2)),
----------------
Pierre-vh wrote:
Should we do an EXTRACT_SUBREG and fill the upper 32 bits with zero in the result instruction, then?
https://github.com/llvm/llvm-project/pull/72393
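
For readers following the thread, here is a rough sketch of the arithmetic behind the question about the upper 32 bits. It is a plain Python model, not LLVM code. The helper names are hypothetical, and the semantics are an assumption: an i32 AMDGPUmul_u24 is taken to keep only the low 32 bits of the 24-bit by 24-bit product, while V_MAD_U64_U32 is taken to compute the full 64-bit product plus a 64-bit addend. Under those assumptions the zext pattern and the MAD can disagree once the product no longer fits in 32 bits.

```python
# Hypothetical model of the two sides of the zext pattern (not LLVM code).
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_u24_i32(a, b):
    """Assumed i32 AMDGPUmul_u24: low 32 bits of the 24-bit x 24-bit product."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def zext_mul24_add(a, b, c):
    """Source side: i64 add of (zext of the i32 mul_u24 result) and c."""
    return (mul_u24_i32(a, b) + c) & MASK64


def mad_u64_u32(a, b, c):
    """Assumed V_MAD_U64_U32: full 32x32 -> 64-bit product plus 64-bit c."""
    return ((a & MASK32) * (b & MASK32) + c) & MASK64


# Small operands: the product fits in 32 bits, so both sides agree.
small_src = zext_mul24_add(3, 5, 7)
small_mad = mad_u64_u32(3, 5, 7)

# Largest 24-bit operands: the 48-bit product overflows 32 bits, so the
# zext side drops the high bits while the MAD keeps them.
big_src = zext_mul24_add(MASK24, MASK24, 0)  # 0xFE000001
big_mad = mad_u64_u32(MASK24, MASK24, 0)     # 0xFFFFFE000001
```

If this model matches the real semantics, the second case is the one where the result's upper 32 bits would need separate handling.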