[llvm] [AMDGPU] Form V_MAD_U64_U32 from mul24/mulhi24 (PR #72393)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 17 00:05:38 PST 2023
arsenm wrote:
> You might both be right (I'm not super familiar with CGP) but I don't really understand why. On a "full rate doubles" machine (i.e. SIDPFullSpeedModel or SIDPGFX940FullSpeedModel), 24-bit multiplies are no faster than 32-bit, so why would CGP need to do anything at all?
I believe they also use less power. Plus, it's better not to depend on the specific speed model; the more common-looking the ISA, the better.
https://github.com/llvm/llvm-project/pull/72393