[PATCH] D134151: [AMDGPU][GFX11] Mitigate VALU mask write hazard

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 19 11:09:06 PDT 2022


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:707
 
+def FeatureVALUMaskWriteHazard : SubtargetFeature<"valu-mask-write-hazard",
+  "HasVALUMaskWriteHazard",
----------------
You do not really need a feature bit for this, just a subtarget check.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:2719
+
+  const MachineOperand *SDSTOp = TII.getNamedOperand(*MI, AMDGPU::OpName::sdst);
+  if (!SDSTOp || !SDSTOp->isReg())
----------------
You seem to start from the instruction in the middle and have no check for the 3rd instruction. This makes the w/a overrestrictive as far as I understand. This maybe fine, but probably deserves a comment.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:2725
+  if (HazardReg == AMDGPU::EXEC ||
+      HazardReg == AMDGPU::EXEC_LO ||
+      HazardReg == AMDGPU::EXEC_HI ||
----------------
A mask can only be a 64 bit register in wave64. You could check the size and bail rather then listing 32 bit registers.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:2732
+    switch (I.getOpcode()) {
+    case AMDGPU::V_ADDC_U32_e32:
+    case AMDGPU::V_CNDMASK_B16_e32:
----------------
Isn't that less error prone to just check for an implicit vcc use? I assume it shall not be true unless it reads a mask.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:2741
+      return HazardReg == AMDGPU::VCC ||
+             HazardReg == AMDGPU::VCC_LO ||
+             HazardReg == AMDGPU::VCC_HI;
----------------
As far as I understand this can only be a VCC too.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:2804
+  // FIXME: we could select VGPR/SGPR used to avoid unintended dependency delays
+  BuildMI(*MI->getParent(), MI, MI->getDebugLoc(),
+          TII.get(AMDGPU::V_XOR3_B32_e64), AMDGPU::VGPR0)
----------------
Call `runOnInstruction()` on this new instruction.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134151/new/

https://reviews.llvm.org/D134151



More information about the llvm-commits mailing list