[PATCH] D37389: [AMDGPU] Produce madak and madmk from the two-address pass

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 15:26:05 PDT 2017


rampitec added inline comments.


================
Comment at: test/CodeGen/AMDGPU/madak.ll:40
 ; GCN: s_endpgm
 define amdgpu_kernel void @madak_2_use_f32(float addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {
   %tid = tail call i32 @llvm.amdgcn.workitem.id.x() nounwind readnone
----------------
arsenm wrote:
> Should add a test with 3 uses. We should consider not doing it for > 2 uses if optsize
We should always use madak/madmk instead of v_mad_f32. The instruction size is the same: v_mad_f32 is VOP3, while madak/madmk are VOP2 plus a literal, i.e. a 64-bit VOP3 encoding vs. a 32-bit VOP2 encoding + a 32-bit literal.

At the same time we save the register that would hold the literal, as well as the move into that register. So even for optsize we should prefer these.
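
To make the size argument concrete, here is a rough back-of-the-envelope sketch of the byte counts. It is an illustration only, not code from the patch. It assumes the literal is materialized once with a v_mov_b32 (32-bit VOP1 + 32-bit literal) when v_mad_f32 is used, and that this move becomes dead once every use is folded into madak/madmk. The function names are made up for the example.

```python
# Encoding sizes in bytes, as described in the comment above.
VOP1_BYTES = 4     # assumed encoding of the v_mov_b32 that loads the literal
VOP2_BYTES = 4     # v_madak_f32 / v_madmk_f32 base encoding
VOP3_BYTES = 8     # v_mad_f32
LITERAL_BYTES = 4  # trailing 32-bit literal constant


def size_with_mad(uses: int) -> int:
    """One move of the literal into a VGPR, then `uses` v_mad_f32 instructions."""
    return (VOP1_BYTES + LITERAL_BYTES) + uses * VOP3_BYTES


def size_with_madak(uses: int) -> int:
    """`uses` madak/madmk instructions, each carrying its own literal."""
    return uses * (VOP2_BYTES + LITERAL_BYTES)


def bytes_saved(uses: int) -> int:
    """Code-size difference in favor of madak/madmk (hypothetical helper)."""
    return size_with_mad(uses) - size_with_madak(uses)
```

Under these assumptions each use costs 8 bytes either way, so the saving is the size of the move regardless of the number of uses, plus one VGPR. If the constant has other users that keep the move alive, the two forms come out the same size.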


https://reviews.llvm.org/D37389




