[PATCH] D138044: AMDGPU/GlobalISel: Fix crash after mad/fma_mix fails selection

Petar Avramovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 16 06:48:18 PST 2022


Petar.Avramovic added a comment.

I was checking whether a test like this gets selected in the same way:

  define amdgpu_vs float @test_f16_f32_add_fma_ext_mul(float inreg %x, float %y, float %z, half %u, half %v) {
  .entry:
      %a = fmul half %u, %v
      %b = fpext half %a to float
      %abs_x = call contract float @llvm.fabs.f32(float %x)
      %c = call float @llvm.fmuladd.f32(float %abs_x, float %y, float %b)
      %d = fadd float %c, %z
      ret float %d
  }
  
  declare float @llvm.fmuladd.f32(float, float, float)
  declare float @llvm.fabs.f32(float)

It selects into:

  %0:sreg_32 = COPY $sgpr0
  %1:vgpr_32 = COPY $vgpr0
  %2:vgpr_32 = COPY $vgpr1
  %5:vgpr_32 = COPY $vgpr2
  %6:vgpr_32 = COPY $vgpr3
  %8:vgpr_32 = nofpexcept V_MUL_F16_e64 0, %5, 0, %6, 0, 0, implicit $mode, implicit $exec
  %14:vgpr_32 = COPY %0
  %11:vgpr_32 = V_FMA_MIX_F32 2, %14, 0, %1, 8, %8, 0, 0, 0, implicit $mode, implicit $exec
  %12:vgpr_32 = nofpexcept V_ADD_F32_e64 0, %11, 0, %2, 0, 0, implicit $mode, implicit $exec
  $vgpr0 = COPY %12
  SI_RETURN_TO_EPILOG implicit $vgpr0

The previous version of this patch would not make this copy and would use the SGPR directly, as in the instruction below. The difference is visible only in MIR; SIFoldOperands folds the copy later anyway.

  %11:vgpr_32 = V_FMA_MIX_F32 2, %0, 0, %1, 8, %8, 0, 0, 0, implicit $mode, implicit $exec
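
As a reading aid for the immediates that precede each source operand of V_FMA_MIX_F32 above (2 before the first source, 8 before %8), here is a rough sketch of a decoder. The bit values are an assumption based on my reading of the SISrcMods enum in SIDefines.h, and decode_src_mods is a hypothetical helper, not part of LLVM.

```python
# Assumed source-modifier bits (my reading of SISrcMods in SIDefines.h).
NEG = 1 << 0       # negate the source
ABS = 1 << 1       # take the absolute value of the source
OP_SEL_0 = 1 << 2  # op_sel: for mix instructions, use the high half of the register
OP_SEL_1 = 1 << 3  # op_sel_hi: for mix instructions, treat the source as f16


def decode_src_mods(mods):
    """Return the names of the modifier bits set in a src_modifiers immediate."""
    names = []
    if mods & NEG:
        names.append("neg")
    if mods & ABS:
        names.append("abs")
    if mods & OP_SEL_0:
        names.append("op_sel")
    if mods & OP_SEL_1:
        names.append("op_sel_hi")
    return names


# Modifier immediates of the three sources in the V_FMA_MIX_F32 above.
for label, mods in (("src0", 2), ("src1", 0), ("src2", 8)):
    print(label, decode_src_mods(mods))
```

Under these assumptions, the 2 on the first source corresponds to the fabs on %x, and the 8 on %8 marks the fmul result as an f16 input, which would be how the fpext is absorbed into the mix instruction.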


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138044/new/

https://reviews.llvm.org/D138044


