[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 16 05:01:56 PST 2022


tsymalla added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:583-594
+    for (MachineBasicBlock &MBB : MF) {
+      for (MachineInstr &MI : MBB) {
+        // Try to record existing s_and_saveexec instructions, iff
+        // they are reading from a v_cmp dest SGPR write.
+        if (MI.getOpcode() != AndSaveExecOpcode)
+          continue;
+
----------------
nhaehnle wrote:
> The S_AND_SAVEEXEC_Bnn instruction has to be one of the last instructions of the basic block. It actually gets derived from a terminator instruction, though for some reason we apparently never changed the S_xxx_SAVEEXEC instructions to be terminators as well.
> 
> In any case, what you should do here from a compile-time perspective is to iterate basic blocks backwards and limit the loop to a small number of iterations. Maybe 5?
> 
> Also, this loop should break if it finds a different instruction that writes EXEC (shouldn't happen in practice, but...).
Looking at the other changes, this transformation should only be applied when it neither breaks correctness nor introduces overhead (spilling). In the next revision, the search walks backwards through the instructions of a basic block, starting from a given s_and_saveexec instruction, and stops as soon as one of these cases appears. I am going to change that again so that it only walks back a bounded number of instructions when looking for the compare instruction, which should reduce the amount of work done.
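
To make the intended search concrete, here is a minimal sketch of a bounded backward scan. It is a standalone toy model, not the code from this revision: `Instr`, `Kind`, `findCompareBackwards` and the default limit of 5 are all hypothetical names and values chosen for illustration, and it does not use the MachineInstr API. It only covers the stop conditions mentioned above (iteration limit, another EXEC write, a non-compare redefinition of the SGPR); the real pass has to check more than that.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not MachineInstr).
enum class Kind { VCmp, SAndSaveExec, Other };

struct Instr {
  Kind kind;
  int destReg;     // register written, -1 if none
  int srcReg;      // register read, -1 if none
  bool writesExec; // true if the instruction modifies EXEC
};

// Walk backwards from the s_and_saveexec at SaveExecIdx and return the index
// of the v_cmp that defines its source register, or -1 if none is found.
// The walk gives up after MaxLookback instructions, on any other EXEC write,
// or when something other than a v_cmp redefines the register.
int findCompareBackwards(const std::vector<Instr> &Block,
                         std::size_t SaveExecIdx, unsigned MaxLookback = 5) {
  assert(SaveExecIdx < Block.size());
  assert(Block[SaveExecIdx].kind == Kind::SAndSaveExec);

  const int Wanted = Block[SaveExecIdx].srcReg;
  if (Wanted < 0)
    return -1;

  std::size_t I = SaveExecIdx;
  for (unsigned Visited = 0; I > 0 && Visited < MaxLookback; ++Visited) {
    const Instr &Cand = Block[--I];
    if (Cand.writesExec)
      return -1; // Another EXEC write in between: not safe to fold.
    if (Cand.destReg == Wanted)
      return Cand.kind == Kind::VCmp ? static_cast<int>(I) : -1;
  }
  return -1; // Limit reached or start of block.
}
```

Capping the walk keeps the cost per s_and_saveexec constant regardless of block size, at the price of missing compares that sit further away than the limit.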


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696


