[llvm] [AMDGPU] Allow folding to FMAAK with SGPR and immediate operand on GFX10+ (PR #72266)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 14 07:31:35 PST 2023


================
@@ -345,6 +345,6 @@ define amdgpu_ps float @s_fmaak_f32(float inreg %x, float inreg %y) {
 }
 
 ; GFX9: codeLenInByte = 20
-; GFX10: codeLenInByte = 16
-; GFX1100: codeLenInByte = 20
+; GFX10: codeLenInByte = 12
----------------
jayfoad wrote:

Nice illustration of the code size benefit here: GFX10 goes from 16 bytes to 12. The other potential benefit of fmaak over fmac is that fmac ties the addend to the result register, whereas fmaak gives the register allocator the freedom to use different registers for the addend and the result. In practice they often end up in the same register anyway, so you don't always get that benefit.
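
To make the size arithmetic concrete, here is a rough sketch of where the 16 and 12 bytes could come from. The encoding sizes are the standard ones (4 bytes for VOP1/VOP2, 8 bytes for VOP3, 4 more for a trailing 32-bit literal), but the two instruction sequences are an assumption about what the test emits on GFX10, not copied from the test file, and `K` is a hypothetical constant. The `fmac`/`fmaak` helpers only model the operand roles (the addend comes from the destination register for fmac, and from the literal for fmaak) and ignore fused rounding.

```python
# Encoding sizes in bytes for the AMDGPU VALU formats involved.
VOP1_BYTES = 4     # 32-bit encoding (e.g. v_mov_b32)
VOP2_BYTES = 4     # 32-bit encoding (e.g. v_fmaak_f32, before its literal)
VOP3_BYTES = 8     # 64-bit encoding (e.g. v_fmac_f32_e64)
LITERAL_BYTES = 4  # trailing 32-bit literal constant


def code_len(instructions):
    """Sum the encoded size of a list of (mnemonic, size_in_bytes) pairs."""
    return sum(size for _, size in instructions)


# ASSUMED sequence before the patch: materialize K in the result register,
# then accumulate into it with the VOP3 form (both multiplicands are SGPRs).
fmac_sequence = [
    ("v_mov_b32 v0, K", VOP1_BYTES + LITERAL_BYTES),
    ("v_fmac_f32_e64 v0, s0, s1", VOP3_BYTES),
]

# ASSUMED sequence after the patch: copy one SGPR to a VGPR, then use the
# VOP2 form that carries K as a literal.
fmaak_sequence = [
    ("v_mov_b32 v0, s1", VOP1_BYTES),
    ("v_fmaak_f32 v0, s0, v0, K", VOP2_BYTES + LITERAL_BYTES),
]


def fmac(dst, src0, src1):
    """fmac: the addend is the old value of the destination (tied operand)."""
    return src0 * src1 + dst


def fmaak(src0, src1, k):
    """fmaak: the addend is the literal K, so the destination is not an input."""
    return src0 * src1 + k


if __name__ == "__main__":
    print(code_len(fmac_sequence), code_len(fmaak_sequence))
```

Under these assumptions the totals match the `codeLenInByte` values in the diff above; the real sequences may differ in detail.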

https://github.com/llvm/llvm-project/pull/72266

