[llvm] [AMDGPU] Consider FLAT instructions for VMEM hazard detection (PR #137170)

Wed May 7 02:11:43 PDT 2025

================
@@ -1417,9 +1417,8 @@ static bool shouldRunLdsBranchVmemWARHazardFixup(const MachineFunction &MF,
   bool HasVmem = false;
   for (auto &MBB : MF) {
     for (auto &MI : MBB) {
-      HasLds |= SIInstrInfo::isDS(MI);
-      HasVmem |= (SIInstrInfo::isVMEM(MI) && !SIInstrInfo::isFLAT(MI)) ||
-                 SIInstrInfo::isSegmentSpecificFLAT(MI);
+      HasLds |= SIInstrInfo::isDS(MI) || SIInstrInfo::isLDSDMA(MI);
+      HasVmem |= SIInstrInfo::isVMEM(MI) && !SIInstrInfo::isLDSDMA(MI);
----------------
ro-i wrote:

Hm, I currently don't have a deep understanding either. While working on the other PR, I noticed that the current implementation only considers FLAT instructions targeting Global/Scratch memory, and I thought it might make more sense to exclude only LDS DMA instead.
However, I'm not sure what the ISA actually supports. The FLAT instruction encoding has an LDS bit, described as "0 = normal, 1 = transfer data between LDS and memory instead of VGPRs and memory", which would suggest that no VGPRs are touched. On the other hand, `FLAT_Global_Load_LDS_Pseudo` still uses `VM_CNT` and has `SchedRW = [WriteVMEM, WriteLDS]`.
(Note, though, that this change introduces *more* synchronization in the current tests.)
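
To make the difference between the two versions of the check concrete, here is a minimal standalone sketch. It is a toy model, not LLVM code: `InstFlags`, `Summary`, `classifyOld`, `classifyNew`, `genericFlat` and `globalLoadToLds` are hypothetical names, and each boolean only stands in for the result of the corresponding `SIInstrInfo` predicate. The flag combinations chosen for the two sample instructions are assumptions for illustration, not something verified against the backend.

```cpp
// Toy model of the two predicate versions in the hunk above.
// Each flag stands in for what the matching SIInstrInfo::isXXX(MI)
// call would return for a single instruction.
struct InstFlags {
  bool IsDS = false;
  bool IsVMEM = false;
  bool IsFLAT = false;
  bool IsSegmentSpecificFLAT = false;
  bool IsLDSDMA = false;
};

struct Summary {
  bool HasLds;
  bool HasVmem;
};

// Predicates as written on the removed ('-') lines.
inline Summary classifyOld(const InstFlags &I) {
  return {I.IsDS, (I.IsVMEM && !I.IsFLAT) || I.IsSegmentSpecificFLAT};
}

// Predicates as written on the added ('+') lines.
inline Summary classifyNew(const InstFlags &I) {
  return {I.IsDS || I.IsLDSDMA, I.IsVMEM && !I.IsLDSDMA};
}

// ASSUMPTION: a generic (not segment-specific) FLAT access is reported
// as both VMEM and FLAT.
inline InstFlags genericFlat() {
  InstFlags I;
  I.IsVMEM = true;
  I.IsFLAT = true;
  return I;
}

// ASSUMPTION: a global load with the LDS bit set is reported as VMEM,
// FLAT, segment-specific FLAT and LDS DMA.
inline InstFlags globalLoadToLds() {
  InstFlags I;
  I.IsVMEM = true;
  I.IsFLAT = true;
  I.IsSegmentSpecificFLAT = true;
  I.IsLDSDMA = true;
  return I;
}
```

Under these assumptions, the generic FLAT access does not set `HasVmem` with the old predicates but does with the new ones, while the global load to LDS moves from the `HasVmem` side to the `HasLds` side.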

https://github.com/llvm/llvm-project/pull/137170