[PATCH] D50633: [AMDGPU] Add new Mode Register pass

Mon Nov 12 06:58:46 PST 2018

timcorringham added a comment.

Amended SIModeRegister to address some minor points, and added comments to help explain why it appears more complex than necessary.

================
Comment at: lib/Target/AMDGPU/SIModeRegister.cpp:90-92
+  // The Status that represents the mode register settings required on entry to
+  // this block. Calculated in Phase 1.
+  Status Require;
----------------
nhaehnle wrote:
> Technically not required on entry, but required at the FirstInsertionPoint.
Yes, I forgot to change this comment when I changed the code to insert at the FirstInsertionPoint rather than at the start of the block. 

================
Comment at: lib/Target/AMDGPU/SIModeRegister.cpp:237-257
+      } else if (InsertionPoint) {
+        // We need a setreg at the InsertionPoint, but we defer the first in
+        // each block to Phase 3
+        if (RequirePending) {
+          NewInfo->Require = NewInfo->Change;
+          RequirePending = false;
+        } else {
----------------
nhaehnle wrote:
> I feel like this logic is more convoluted than necessary, and possibly even wrong / overly conservative in some cases because of that.
> 
> For example, why are you setting InsertionPoint also in the case where FirstInsertionPoint is set? Why are IPChange and NewInfo->Change never cleared? My intuition is that the logic should just be a delayed-update/insert pattern like this:
> ```
> if (!NewInfo->FirstInsertionPoint) {
>   NewInfo->FirstInsertionPoint = &MI;
> } else if (!IPChange.isCompatible(InstrMode)) {
>   if (InsertionPoint)
>     insertSetReg(IPChange);
>   else
>     NewInfo->Require = IPChange;
>   merge IPChange into NewInfo->Change and reset IPChange
>   InsertionPoint = &MI;
> }
> merge InstrMode into IPChange
> ```
> Then at the end of the basic block:
> ```
> if (!InsertionPoint)
>   NewInfo->Require = IPChange;
> ```
> The RequirePending flag should simply be unnecessary.
There are cases where we don't set FirstInsertionPoint, even if there are other InsertionPoints. This can arise where an explicit setreg appears before the first instruction that uses the mode register. We  preserve the setreg (we don't really expect to see any, but if they appear we assume there is a good reason for it). In that case there is no initial mode value requirement, so no FirstInsertionPoint.  That is also the reason for the RequirePending flag - there are more states than can be deduced by just the InsertionPoint and FirstInsertionPoint pointers.
We don't clear Change as the algorithm assumes it holds the net change to the mode by the block. When we know the predecessor mode(s) in Phase 2 we can then determine the output mode of each block (this can involve revisiting blocks that are successors to any block that changes its output mode).
Phase 3 then determines whether a setreg is required at the FirstInsertionPoiint.

================
Comment at: lib/Target/AMDGPU/SIModeRegister.cpp:272-276
+      if (InsertionPoint) {
+        // We need to insert a setreg at the InsertionPoint
+        insertSetreg(MBB, InsertionPoint, TII, IPChange.delta(NewInfo->Change));
+        InsertionPoint = nullptr;
+      }
----------------
nhaehnle wrote:
> It would be more intuitive to guard this by IPChange being "non-empty". In the InsertionPoint case, insert the SETREG; otherwise, set NewInfo->Require. In both cases, IPChange should be reset.
> 
> The overall logic can then be described as:
> * NewInfo->Change describes the current status of the mode registers as we know it
> * IPChange describes the pending mode changes that need to be applied at InsertionPoint (if non-null) or NewInfo->Require (otherwise)
The status values aren't designed to work quite the way you assume. I have refactored the code slightly and improved the comments - does that help at all?

Repository:
  rL LLVM

https://reviews.llvm.org/D50633