[PATCH] D37389: [AMDGPU] Produce madak and madmk from the two-address pass
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 6 15:26:05 PDT 2017
rampitec added inline comments.
================
Comment at: test/CodeGen/AMDGPU/madak.ll:40
; GCN: s_endpgm
define amdgpu_kernel void @madak_2_use_f32(float addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {
%tid = tail call i32 @llvm.amdgcn.workitem.id.x() nounwind readnone
----------------
arsenm wrote:
> Should add a test with 3 uses. We should consider not doing it for > 2 uses if optsize
We should always use madak/madmk instead of v_mad_f32. The instruction size is the same: mad is VOP3, while madak/madmk are VOP2 plus a literal, i.e. a 64-bit VOP3 vs. a 32-bit VOP2 + 32-bit literal.
At the same time we save the register that would hold the literal, and the move into that register. So even with optsize we should prefer these.
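
To make the size argument concrete, here is a rough sketch of the byte counts. The VOP3, VOP2 and literal sizes are the ones stated above. The 8-byte cost of the materializing move (mov_bytes), the helper names, and the premise that the move goes away once every use is folded are assumptions for illustration only, not taken from the patch.

```python
# Encoding sizes in bytes, from the comment above:
# VOP3 is 64 bits, VOP2 is 32 bits, a trailing literal is 32 bits.
VOP3_BYTES = 8
VOP2_BYTES = 4
LITERAL_BYTES = 4


def mad_size():
    """v_mad_f32 is VOP3; the constant has to live in a register."""
    return VOP3_BYTES


def madak_size():
    """v_madak_f32 / v_madmk_f32 are VOP2 followed by a 32-bit literal."""
    return VOP2_BYTES + LITERAL_BYTES


def total_size(uses, fold_literal, mov_bytes=8):
    """Code size for `uses` mads that share one constant.

    mov_bytes is an assumed cost for the move that materializes the
    constant (a 32-bit move plus a 32-bit literal). It is only paid
    when the literal is not folded into every use.
    """
    if fold_literal:
        return uses * madak_size()
    return mov_bytes + uses * mad_size()


if __name__ == "__main__":
    for uses in (1, 2, 3):
        print(uses, total_size(uses, True), total_size(uses, False))
```

Under these assumptions the folded form is never larger for any number of uses: each use costs the same 8 bytes either way, and the unfolded form also pays for the move.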
https://reviews.llvm.org/D37389