[PATCH] D89618: [AMDGPU] Optimize waitcnt insertion for flat memory operations
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 20 14:24:31 PDT 2020
rampitec added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/waitcnt.mir:103
+ $vgpr3 = FLAT_LOAD_DWORD $vgpr1_vgpr2, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4 from %ir.flat4)
+ $vgpr4 = FLAT_LOAD_DWORD $vgpr1_vgpr2, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (load 4 from %ir.global4)
+ $vgpr0 = V_MOV_B32_e32 $vgpr3, implicit $exec
----------------
Can you keep just the load from flat here? The other load obscures the result.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89618/new/
https://reviews.llvm.org/D89618