[PATCH] D69280: [AMDGPU] Allow folding of sgpr to vgpr copy

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 21 14:51:02 PDT 2019


rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/fmul-2-combine-multi-use.ll:79-80
 ; SIVI:  v_mad_f32 {{v[0-9]+}}, |[[X]]|, 2.0, v{{[0-9]+}}
-; GFX10: v_fma_f32 {{v[0-9]+}}, |[[X:s[0-9]+]]|, 2.0, {{s[0-9]+}}
-; GFX10: v_fma_f32 {{v[0-9]+}}, |[[X]]|, 2.0, {{s[0-9]+}}
+; GFX10: v_fma_f32 {{v[0-9]+}}, 2.0, |[[X:s[0-9]+]]|, {{v[0-9]+}}
+; GFX10: v_fma_f32 {{v[0-9]+}}, 2.0, |[[X]]|, {{v[0-9]+}}
 define amdgpu_kernel void @multiple_use_fadd_multi_fmad_f32(float addrspace(1)* %out, float %x, float %y, float %z) #0 {
----------------
rampitec wrote:
> arsenm wrote:
> > This looks like it got worse?
> Yes, this is a regression specific to fma/mac. The register class after folding does not match the xm0/xexec operand definition of the fma source.
> The regression is small, however, and the change eliminates copies in other cases.
I.e., we should refine how we use sgpr register classes instead of inhibiting folding.
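
To make the mismatch concrete, here is a rough toy model in Python. It is not LLVM code: the class names, their members, and both helper functions are illustrative assumptions. It treats a register class as a set of registers and an operand definition as the class it accepts, so a folded sgpr fits only if its class is contained in the operand's class.

```python
# Toy model only: register classes as plain sets of register names.
# Class names and members are illustrative assumptions, not LLVM's definitions.
SGPRS = {f"s{i}" for i in range(4)}

REG_CLASSES = {
    "sreg_32": SGPRS | {"m0", "exec_lo"},   # widest class in this model
    "sreg_32_xm0": SGPRS | {"exec_lo"},     # excludes m0
    "sreg_32_xm0_xexec": set(SGPRS),        # excludes m0 and exec
}


def operand_accepts(operand_class: str, value_class: str) -> bool:
    """A value fits an operand if every register it may get is allowed there."""
    return REG_CLASSES[value_class] <= REG_CLASSES[operand_class]


def narrowed_class(operand_class: str, value_class: str):
    """Return the widest modeled class usable for both sides, or None."""
    common = REG_CLASSES[operand_class] & REG_CLASSES[value_class]
    candidates = [name for name, regs in REG_CLASSES.items() if regs <= common]
    return max(candidates, key=lambda name: len(REG_CLASSES[name]), default=None)
```

In this model a value in the wide class does not fit an operand defined with the xm0/xexec class, which is the kind of mismatch behind the fma/mac regression. Narrowing the value's class, rather than refusing to fold, is one way to read "refine how we use sgpr register classes".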


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69280/new/

https://reviews.llvm.org/D69280