[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)
Pierre van Houtryve via cfe-commits
cfe-commits at lists.llvm.org
Sun Jan 28 22:41:12 PST 2024
================
@@ -2561,6 +2567,70 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const SIMemOpInfo &MOI,
return Changed;
}
+bool SIMemoryLegalizer::GFX9InsertWaitcntForPreciseMem(MachineFunction &MF) {
+ const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+ const SIInstrInfo *TII = ST.getInstrInfo();
+ IsaVersion IV = getIsaVersion(ST.getCPU());
+
+ bool Changed = false;
+
+ for (auto &MBB : MF) {
+ for (auto MI = MBB.begin(); MI != MBB.end();) {
+ MachineInstr &Inst = *MI;
+ ++MI;
+ if (Inst.mayLoadOrStore() == false)
+ continue;
+
+ // Todo: if next insn is an s_waitcnt
+ AMDGPU::Waitcnt Wait;
+
+ if (!(Inst.getDesc().TSFlags & SIInstrFlags::maybeAtomic)) {
+ if (TII->isSMRD(Inst)) { // scalar
----------------
Pierre-vh wrote:
Can we have a shared helper, e.g. in `SIInstrInfo`, for both? It's a lot of logic to duplicate.
> The counter values in SIInsertWaitcnt are precise, while in this feature the counters are simply set to 0.
That could just be a boolean switch in a shared helper.
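
To illustrate the suggestion, here is a minimal standalone sketch of what a shared helper with a boolean switch might look like. It does not use the real LLVM types: `MemKind`, `WaitCounters`, and `makeWait` are hypothetical names invented for this example, and the mapping of instruction kinds to counters is a simplified GFX9-style assumption (scalar and LDS accesses on `lgkmcnt`, vector memory on `vmcnt`).

```cpp
// Hypothetical stand-ins for illustration only; not the actual LLVM API.
enum class MemKind { Scalar, Vector, LDS };

struct WaitCounters {
  // Sentinel meaning "no wait required on this counter".
  static constexpr unsigned NoWait = ~0u;
  unsigned VmCnt = NoWait;
  unsigned LgkmCnt = NoWait;
};

// Shared helper: both callers pick the counter from the instruction kind.
// The precise caller passes the number of younger operations that may stay
// outstanding; the "wait after every memory instruction" caller sets
// ForceZero so the selected counter is simply drained to 0.
inline WaitCounters makeWait(MemKind Kind, unsigned YoungerOutstanding,
                             bool ForceZero) {
  WaitCounters W;
  const unsigned Value = ForceZero ? 0u : YoungerOutstanding;
  switch (Kind) {
  case MemKind::Scalar:
  case MemKind::LDS:
    W.LgkmCnt = Value; // assumed: scalar and LDS accesses share lgkmcnt
    break;
  case MemKind::Vector:
    W.VmCnt = Value; // assumed: vector memory accesses use vmcnt
    break;
  }
  return W;
}
```

With this shape, the instruction classification would live in one place, and the two passes would differ only in the `ForceZero` argument they pass.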
https://github.com/llvm/llvm-project/pull/79236