[PATCH] D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence.

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Apr 7 06:40:09 PDT 2019


nhaehnle added a comment.

>From the point of view of the design of all these interface, It's too bad we can't fix this in post. From an overall standpoint, it's actually better to get the register classes from the beginning, so sure, let's go with this kind of approach.

After a bit of reflection, I think that in part what's happening here is that the uniform/divergent axis and the register bank axis is getting confused. See @efriedma's question and my comment on `isDivergentRegClass`. I wonder if some parts of this change would not be better expressed in terms of register banks. For example, in the uses of `isDivergentRegClass`, aren't we really looking for another register class in the same bank (SGPR vs. VGPR)? Similarly, maybe `requiresUniformRegister` can be rephrased as returning a required register bank (and nullptr by default)?

Why do we still need to move PHIs to VALU after this change? (Looking at SIFixSGPRCopies) Shouldn't the PHI be selected with the correct register class already? What's an example where this doesn't happen?

How are values handled which are uniform inside a loop but divergent for outside uses due to a divergent exit condition?



================
Comment at: include/llvm/CodeGen/TargetRegisterInfo.h:524-526
+  virtual bool isDivergentRegClass(const TargetRegisterClass *RC) const {
+    return false;
+  }
----------------
This function is problematic because we can't actually tell for a given register class whether the underlying value is divergent or not. Specifically, 64-bit SGPRs can be either uniform or divergent depending on whether it's the lowering of an i1 or an i64.



================
Comment at: lib/CodeGen/SelectionDAG/InstrEmitter.cpp:555
                                  Node->getOperand(0).getSimpleValueType(),
-                                 Node->getDebugLoc());
-
+          Node->isDivergent(), Node->getDebugLoc());
       // Create the destreg if it is missing.
----------------
The formatting looks off here.


================
Comment at: lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp:726
 
+    OS << " t" << PersistentId;
+
----------------
This looks like it belongs into a separate patch.


================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:630
+          unsigned Reg = Instr->getOperand(0).getReg();
+          for (auto &Use : MRI.use_operands(Reg)) {
+            MachineInstr *UseMI = Use.getParent();
----------------
const auto &


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:9997-9998
+  const SIRegisterInfo *TRI = Subtarget->getRegisterInfo();
+  if (RC == &AMDGPU::VReg_1RegClass && !isDivergent)
+    return &AMDGPU::SReg_64RegClass;
+  if (!TRI->isSGPRClass(RC) && !isDivergent)
----------------
Can we not keep uniform i1 in 32-bit registers? Maybe at least mark this as a TODO?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59990/new/

https://reviews.llvm.org/D59990





More information about the llvm-commits mailing list