[llvm] [AMDGPU] Form V_MAD_U64_U32 from mul24/mulhi24 (PR #72393)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 06:49:59 PST 2023

jayfoad wrote:

> CGP can transform a fine mul+add into a (mul24/mulhi24)+add, so add a pattern for that.

Typo "fine"? Not sure what you meant.

Whether forming mad_u64 is a win would depend on the relative rate of mul_u24 vs mad_u64. On older ASICs, mad_u64 is "quarter rate", so two fast mul_u24 instructions should be faster. I see that gfx90a uses SIDPFullSpeedModel, so there mad_u64 is as fast as mul_u24.

In any case, would it be better to teach CGP not to do the harmful transformation in the first place, rather than work around it in isel?
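
For reference, here is a rough sketch of the arithmetic identity that such an isel pattern would rely on, modelled in Python rather than on real hardware. The helper names are hypothetical, and the instruction semantics are assumptions based on my reading: mul_u24 returns the low 32 bits of the 24x24-bit product, mulhi_u24 returns the remaining high bits, and mad_u64_u32 computes a 32x32-bit product plus a 64-bit addend. The two forms agree only when both multiplicands fit in 24 bits.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_u24(a, b):
    # Assumed semantics: low 32 bits of the product of the low 24 bits.
    return ((a & MASK24) * (b & MASK24)) & MASK32


def mulhi_u24(a, b):
    # Assumed semantics: bits 32..47 of the 48-bit product, zero-extended.
    return ((a & MASK24) * (b & MASK24)) >> 32


def mad_u64_u32(a, b, c):
    # Assumed semantics: 32x32 -> 64-bit multiply, plus a 64-bit addend,
    # truncated to 64 bits.
    return ((a & MASK32) * (b & MASK32) + c) & MASK64


def mul24_pair_plus_add(a, b, c):
    # The (mul24/mulhi24)+add shape: rebuild the 64-bit product from its
    # two halves, then do the 64-bit add.
    product = (mulhi_u24(a, b) << 32) | mul_u24(a, b)
    return (product + c) & MASK64
```

This only shows that the results match for 24-bit inputs. It says nothing about which form is faster, which is the rate question above.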

