[llvm] [AMDGPU] Form V_MAD_U64_U32 from mul24/mulhi24 (PR #72393)

Thu Nov 16 01:49:48 PST 2023

jayfoad wrote:

> About doing this in CGP: I asked @arsenm earlier, and he suggested fixing it in ISel rather than teaching CGP. I tend to agree: CGP doesn't always have the full picture of what the DAG can do, and a fix there may be non-obvious. Having a new pattern is simpler and more stable, IMO.

You might both be right (I'm not very familiar with CGP), but I don't really understand why. On a "full rate doubles" machine (i.e. one using SIDPFullSpeedModel or SIDPGFX940FullSpeedModel), 24-bit multiplies are no faster than 32-bit ones, so why would CGP need to do anything at all?
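
For reference, here is a rough sketch of the arithmetic identity the PR title relies on, written as plain C++ rather than anything from the LLVM tree. The function names (`mul_u24`, `mulhi_u24`, `mad_u64_u32`, `pair_matches_mad`) are hypothetical models of the operations, assuming the 24-bit forms ignore the upper 8 bits of each operand and return the low and high 32 bits of the resulting product:

```cpp
#include <cstdint>

// Illustrative models only; these names are hypothetical, not LLVM APIs.
constexpr uint32_t Mask24 = 0x00FFFFFFu;

// Low 32 bits of the product of the low 24 bits of each operand.
uint32_t mul_u24(uint32_t a, uint32_t b) {
  uint64_t p = uint64_t(a & Mask24) * uint64_t(b & Mask24);
  return uint32_t(p);
}

// High 32 bits of the same product (which is at most 48 bits wide).
uint32_t mulhi_u24(uint32_t a, uint32_t b) {
  uint64_t p = uint64_t(a & Mask24) * uint64_t(b & Mask24);
  return uint32_t(p >> 32);
}

// 32 x 32 -> 64-bit unsigned multiply plus a 64-bit addend.
uint64_t mad_u64_u32(uint32_t a, uint32_t b, uint64_t c) {
  return uint64_t(a) * uint64_t(b) + c;
}

// True if the mul24/mulhi24 pair, glued into a 64-bit value, equals a
// single multiply-add on the masked operands with a zero addend.
bool pair_matches_mad(uint32_t a, uint32_t b) {
  uint64_t fromPair = (uint64_t(mulhi_u24(a, b)) << 32) | mul_u24(a, b);
  return fromPair == mad_u64_u32(a & Mask24, b & Mask24, 0);
}
```

Under these assumptions `pair_matches_mad` holds for any inputs, which is why a single 64-bit multiply-add can stand in for the pair. It says nothing about the scheduling-model question above.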

https://github.com/llvm/llvm-project/pull/72393