[PATCH] D127253: [AMDGPU] Use v_mad_u64_u32 for IMAD32

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 9 01:53:16 PDT 2022


foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.

LGTM, thanks. But we should really do GlobalISel too, even if it means writing C++ code.



================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:413
 
-class ThreeOpFrag<SDPatternOperator op1, SDPatternOperator op2> : PatFrag<
+class ThreeOpFragSDAG<SDPatternOperator op1, SDPatternOperator op2> : PatFrag<
   (ops node:$x, node:$y, node:$z),
----------------
Just curious: how does this work? This class has no GISelPredicateCode, but isn't that the same as having a GISelPredicateCode that always returns true?


================
Comment at: llvm/test/CodeGen/AMDGPU/mad_u64_u32.ll:46
+; GFX9-NEXT:    s_mov_b32 s0, 42
+; GFX9-NEXT:    v_mad_u64_u32 v[0:1], s[0:1], v0, v1, s[0:1]
+; GFX9-NEXT:    ; return to shader part epilog
----------------
rampitec wrote:
> rampitec wrote:
> > foad wrote:
> > > As a follow-up it would be nice if we could fold the "42" into the mad as an inline src2.
> > Sure. I will look into why it didn't fold in a follow-up.
> This actually did not fold because of the IMPLICIT_DEF. This is technically not a constant, but a 64-bit partial undef:
> ```
>   %3:sreg_32 = S_MOV_B32 42
>   %5:sreg_32 = IMPLICIT_DEF
>   %4:sreg_64 = REG_SEQUENCE killed %3:sreg_32, %subreg.sub0, killed %5:sreg_32, %subreg.sub1
>   %7:vreg_64, %8:sreg_64 = V_MAD_U64_U32_e64 %0:vgpr_32, %1:vgpr_32, killed %4:sreg_64, 0, implicit $exec
> ```
Thanks! Can you add a test where the constant is too large to go inline, if there isn't one already? I guess this can be a literal on GFX10 but not on GFX9.
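
For readers following the thread, here is a minimal sketch of the arithmetic under discussion. It is not code from the patch. The helper names (`madU64U32`, `imad32`, `fitsInlineIntConstant`) are hypothetical, the carry-out of the real instruction is ignored, and the inline range used is the usual AMDGPU integer inline-constant range of -16..64. It shows why the high half of src2 can be left undefined when only a 32-bit result is needed, and why 42 would qualify as an inline operand while a larger constant would not.

```cpp
#include <cstdint>

// Model of the value computed by v_mad_u64_u32 (carry-out ignored):
// a 32x32 -> 64-bit unsigned multiply plus a 64-bit addend, wrapping mod 2^64.
constexpr uint64_t madU64U32(uint32_t a, uint32_t b, uint64_t c) {
  return static_cast<uint64_t>(a) * b + c;
}

// Hypothetical 32-bit mad built on the 64-bit instruction. Only the low
// 32 bits of the result are kept, so the high half of the addend
// ("undefHigh", standing in for the IMPLICIT_DEF) cannot affect the answer.
constexpr uint32_t imad32(uint32_t a, uint32_t b, uint32_t c,
                          uint32_t undefHigh) {
  return static_cast<uint32_t>(
      madU64U32(a, b, (static_cast<uint64_t>(undefHigh) << 32) | c));
}

// Assumption: integer inline constants cover -16..64 inclusive. Anything
// outside that range needs a literal or a register.
constexpr bool fitsInlineIntConstant(int64_t v) {
  return v >= -16 && v <= 64;
}

static_assert(imad32(3, 5, 42, 0) == imad32(3, 5, 42, 0xDEADBEEFu),
              "high half of src2 does not influence the low 32 bits");
```

Because the low 32 bits of `a * b + c` depend only on the low 32 bits of `c`, the REG_SEQUENCE with an IMPLICIT_DEF high half is harmless for correctness. It only hides the fact that the low half is the constant 42.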


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127253/new/

https://reviews.llvm.org/D127253


