[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

Tony Tye via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 25 18:30:23 PST 2024


================
@@ -641,6 +644,9 @@ class SIMemoryLegalizer final : public MachineFunctionPass {
   bool expandAtomicCmpxchgOrRmw(const SIMemOpInfo &MOI,
                                 MachineBasicBlock::iterator &MI);
 
+  bool GFX9InsertWaitcntForPreciseMem(MachineFunction &MF);
----------------
t-tye wrote:

Should these be combined with the expand* functions? Those functions are supposed to do everything necessary to "legalize" the opcodes to meet the memory model, and inserting waitcnts is just another piece of that expansion.

Combining them would also avoid inserting multiple waitcnts for the same memory operation.

Combining them might also make it possible to reuse the existing operation that ensures a memory operation has completed. I believe that operation should already determine what kind of waitcnts need to be inserted. If not, I would consider generalizing it so it can be used by both the atomics expansion and the precise memory expansion.

It also keeps the operations in this class architecture neutral.
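
As a rough illustration of the combined approach, here is a toy model in plain C++. It is only a sketch under stated assumptions: `ToyMemOp`, `ToyLegalizer`, `insertWaitForCompletion`, `expandAll`, and the `wait-for-*` strings are all hypothetical names made up for this example, and none of it is the actual SIMemoryLegalizer code. The point is that a single expansion path decides whether a wait is needed, so an operation that needs one for both ordering and precise memory still gets only one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; none of these are LLVM types.
enum class ToyOpKind { Load, Store };

struct ToyMemOp {
  ToyOpKind Kind;
  bool NeedsOrderingWait; // the memory model already requires a wait here
};

class ToyLegalizer {
public:
  explicit ToyLegalizer(bool PreciseMemory) : PreciseMemory(PreciseMemory) {}

  // Single expansion entry point: emits the operation and at most one wait.
  void expand(const ToyMemOp &Op, std::vector<std::string> &Out) const {
    Out.push_back(Op.Kind == ToyOpKind::Load ? "load" : "store");
    // Both reasons for waiting funnel through the same check, so an
    // operation that needs a wait for ordering AND for precise memory
    // still gets only one.
    if (Op.NeedsOrderingWait || PreciseMemory)
      insertWaitForCompletion(Op, Out);
  }

private:
  // Hypothetical shared helper: picks the wait that matches the operation.
  static void insertWaitForCompletion(const ToyMemOp &Op,
                                      std::vector<std::string> &Out) {
    Out.push_back(Op.Kind == ToyOpKind::Load ? "wait-for-loads"
                                             : "wait-for-stores");
  }

  bool PreciseMemory;
};

// Convenience wrapper used to try the sketch out.
std::vector<std::string> expandAll(const std::vector<ToyMemOp> &Ops,
                                   bool PreciseMemory) {
  ToyLegalizer L(PreciseMemory);
  std::vector<std::string> Out;
  for (const ToyMemOp &Op : Ops)
    L.expand(Op, Out);
  return Out;
}
```

In this model, a load that already needs an ordering wait produces the same single wait whether or not precise memory is enabled, while a store with no ordering requirement gains a wait only when precise memory is on. The architecture-specific choice of wait lives in one helper, so the expansion entry point itself stays neutral.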

https://github.com/llvm/llvm-project/pull/79236

