[PATCH] D67662: [AMDGPU] SIFoldOperands should not fold register across the EXEC definition
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 23 12:18:15 PDT 2019
rampitec added a comment.
In D67662#1679473 <https://reviews.llvm.org/D67662#1679473>, @alex-t wrote:
> The LoopFinder class is used to detect when the register copy source cannot be folded.
> The idea is to use a uniform method to detect both the need for an exec mask merge and the restrictions on COPY movement/folding.
> LoopFinder was moved out of SILowerI1Copies into AMDGPU/Utils for this purpose.
> The addLoopEntries method was moved from LoopFinder to SILowerI1Copies
> because it changes the MachineFunction and naturally belongs to the pass that is derived from MachineFunctionPass.
> LoopFinder is an analysis and should not change the MachineFunction.
>
> Unfortunately, LoopFinder does not store the detected loop's exits, so we cannot check whether they modify EXEC.
> So we opt for an over-conservative but correct approach.
Building a PDT for every invocation of SIFoldOperands seems like too heavy a hammer for such a small problem.
I would rather just call execMayBeModifiedBeforeUse() and bail.
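To make the "check and bail" idea concrete, here is a toy sketch in plain C++. It is not LLVM code and not the actual implementation of execMayBeModifiedBeforeUse(); ToyInstr, toyExecMayBeModifiedBeforeUse, toyCanFoldCopy and the MaxScan limit are all hypothetical names invented for illustration. It assumes the check amounts to a bounded linear scan between the definition and the use within one block, answering conservatively whenever it cannot prove EXEC is untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not llvm::MachineInstr).
struct ToyInstr {
  std::string Opcode;
  bool WritesExec; // true if this instruction redefines the EXEC mask
};

// Hypothetical analogue of a cheap "may EXEC change?" query.
// Returns true if some instruction strictly between DefIdx and UseIdx writes
// EXEC, or if the query cannot be answered cheaply (bad indices, or more than
// MaxScan instructions to look at). Returning true is always the safe answer.
bool toyExecMayBeModifiedBeforeUse(const std::vector<ToyInstr> &Block,
                                   std::size_t DefIdx, std::size_t UseIdx,
                                   std::size_t MaxScan = 20) {
  if (DefIdx >= UseIdx || UseIdx >= Block.size())
    return true; // malformed query: be conservative
  if (UseIdx - DefIdx - 1 > MaxScan)
    return true; // too far to scan cheaply: be conservative
  for (std::size_t I = DefIdx + 1; I < UseIdx; ++I)
    if (Block[I].WritesExec)
      return true;
  return false;
}

// The fold is attempted only when EXEC is provably unchanged; otherwise bail.
bool toyCanFoldCopy(const std::vector<ToyInstr> &Block, std::size_t DefIdx,
                    std::size_t UseIdx) {
  return !toyExecMayBeModifiedBeforeUse(Block, DefIdx, UseIdx);
}
```

In this model the query costs at most MaxScan instruction visits per fold candidate, whereas building a post-dominator tree is a whole-function computation. The price is that some legal folds are rejected.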
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D67662/new/
https://reviews.llvm.org/D67662