[PATCH] D67662: [AMDGPU] SIFoldOperands should not fold register across the EXEC definition

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 23 12:18:15 PDT 2019


rampitec added a comment.

In D67662#1679473 <https://reviews.llvm.org/D67662#1679473>, @alex-t wrote:

> The LoopFinder class is now used to detect whether the register copy source cannot be folded.
> The idea is to use a uniform method to detect both the need for an exec mask merge and the restrictions on COPY movement/folding.
> LoopFinder was refactored out of SILowerI1Copies into AMDGPU/Utils for this purpose.
> The addLoopEntries method was moved from LoopFinder to SILowerI1Copies
> because it changes the MachineFunction and naturally belongs to the pass that derives from MachineFunctionPass.
> LoopFinder is an analysis and should not change the MachineFunction.
>
> Unfortunately, LoopFinder does not store the exits of the detected loop, so we cannot check whether they modify EXEC.
> So we opt for an over-conservative but correct approach.


Building a post-dominator tree (PDT) for every invocation of SIFoldOperands seems too heavy a hammer for such a small problem.
I would rather just call execMayBeModifiedBeforeUse() and bail out.
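
To illustrate the "check and bail" idea, here is a rough, self-contained sketch. It is a toy model and not the LLVM code: ToyInstr, ToyBlock, toyExecMayBeModifiedBetween and toyCanFoldCopy are hypothetical names, and the scan limit of 20 is an assumed value. It only models a def and a use within one block; a def and use in different blocks would simply be treated as "may be modified".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  std::string name;
  bool writesExec; // true if the instruction defines or clobbers EXEC
};

using ToyBlock = std::vector<ToyInstr>;

// Conservative query: may EXEC change strictly between the def and the use?
// Returns true ("may be modified") as soon as a writer of EXEC is seen, or
// when the scan budget is exhausted, so the answer errs on the safe side.
bool toyExecMayBeModifiedBetween(const ToyBlock &block, std::size_t defIdx,
                                 std::size_t useIdx,
                                 std::size_t maxScan = 20) {
  assert(defIdx < useIdx && useIdx < block.size());
  std::size_t scanned = 0;
  for (std::size_t i = defIdx + 1; i < useIdx; ++i) {
    if (++scanned > maxScan)
      return true; // too far to scan cheaply: assume EXEC was modified
    if (block[i].writesExec)
      return true;
  }
  return false;
}

// Fold only when the cheap check proves EXEC is untouched; otherwise bail.
bool toyCanFoldCopy(const ToyBlock &block, std::size_t defIdx,
                    std::size_t useIdx) {
  return !toyExecMayBeModifiedBetween(block, defIdx, useIdx);
}
```

The point of the design is that a bounded linear scan needs no extra analysis to be built per pass invocation; the price is that some legal folds are skipped.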


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67662/new/

https://reviews.llvm.org/D67662




