[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 20 19:54:47 PDT 2022


critson accepted this revision.
critson added a comment.
This revision is now accepted and ready to land.

LGTM, with a few minor nits.

Also please double check all comments in the code are up to date, as there has been a lot revisions.
Thanks!



================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:298
+// condition based on the current MachineInstr, for instance an target
+// instruction. Breaks prematurely by returning nullptr if DisallowDefBetween is
+// true and one of the registers given in NonModifiableRegs is modified by the
----------------
Comment mentions code that no longer exists `DisallowDefBetween`?


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:653
+  // to reduce pipeline stalls.
+  if (ST.hasGFX10_3Insts()) {
+    DenseMap<MachineInstr *, MachineInstr *> SaveExecVCmpMapping;
----------------
Could this be more future proof?
e.g.
`if (AMDGPU::isGFX10Plus(ST) && !ST.hasVcmpxExecWARHazard())`


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:659
+    for (MachineBasicBlock &MBB : MF) {
+      for (MachineInstr &MI : MBB) {
+        // Try to record existing s_and_saveexec instructions, iff
----------------
It's unfortunate that we have to scan all instructions to find these.
Normally they should be within an instruction or two of the terminators.
However, I can see that the waterfall code violates the principle of EXEC manipulation at the end of block right now, so we don't have a choice.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696



More information about the llvm-commits mailing list