[PATCH] D59626: [AMDGPU] Add MachineDCE pass after RenameIndependentSubregs

Mon Apr 8 10:29:45 PDT 2019

rampitec added a comment.

Bjorn,

did you enable MachineDCE in RA in some other target? So far it is only enabled in the AMDGPU, so I am curious where do you see these problems?

In D59626#1458186 <https://reviews.llvm.org/D59626#1458186>, @bjope wrote:

> I see a couple of problems with the changes to DeadMachineInstructionElim.cpp in this patch:
>
> - I get MachineVerifier problems because the LIS update does not take into consideration that the removal of a dead use can introduce disjoint live ranges (resulting in verifier complaining about "Multiple connected components in live interval".

I am recomputing liveness for all affected uses, how could that be?

> - I get MachineVerifier problems because the LIS update does not update live ranges for physical registers. Consider an instruction like this `dead %42:regs = COPY $a1`, if that is removed and it is the last use of the physical register `$a1` then we need to shrink the live range for `$a1` , otherwise the MachineVerifier will complain about "Live segment doesn't end at a valid instruction".

That is probably right. I avoid deleting dead phys, but not their uses. I think shrinkToUses() will be appropriate here.

> - I do not think the `!MRI->reg_empty(Reg)` check takes debug info into consideration. Possibly introducing a debug invariance problem.

I thought about this, so I did not use reg_nodbg_empty() in this instance. I.e. this LI gets recreated.

> - It does not seem like this pass is using the LIS itself. So why do we even bother keeping it update? It is destroyed, and then it is up to the pass manager to detect if it needs to be recalculated by a later pass, right? Or do we know that it will be used by a later pass, and thereby we want to bother about preserving it (even if it isn't for the greater good of the pass itself). If it is like that, then we probably need to add some code comments to highlight that we only do it because some other pass needs it (for example in case that pass is removed in the future).

The pass does not use it, but should update if it exists. That is true, if I do not claim it is preserved then it will be recalculated, but that is very slow.
So far only AMDGPU should be using it with liveness available. Am I missing some other invocation?

> - Is it really true that we now preserve all possible analyses (both existing and future, both upstream and downstream), motivating the change from setPreservesCFG() to setPreservesAll(). I mean, the pass is pretty "destructive" as it removes instructions. You have only added (afaict incomplete) support to preserve the LIS. Notice that I'm not sure exactly what the difference between "PreservesAll" and "PreservesCFG" is, but I suppose it includes more analysis results besides LIS (and SlotIndices).

It does preserve LIS and SlotIndexes. The rest of the analysis passes (like PDT) are just unaffected by the pass itself. I believe that is more or less the same assumption for all passes which use PreservesAll() that we are considering exiting analysis passes. Otherwise this method is not future proof in general.

> Maybe we can back out the change in DeadMachineInstructionElim.cpp until the above problems have been sorted out?

If there is no better way. If you see the problems do you mind to share an IR?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59626/new/

https://reviews.llvm.org/D59626