[PATCH] D67662: [AMDGPU] SIFoldOperands should not fold register across the EXEC definition

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 23 11:15:37 PDT 2019


alex-t updated this revision to Diff 221368.
alex-t added a comment.
Herald added a subscriber: mgorny.

The LoopFinder class is now used to detect whether the source of a register copy cannot be folded.
The idea is to use one uniform method to detect both the need for an exec mask merge and the restrictions on moving or folding a COPY.
For this purpose, LoopFinder was moved out of SILowerI1Copies into AMDGPU/Utils.
The addLoopEntries method was moved from LoopFinder into SILowerI1Copies
because it changes the MachineFunction and naturally belongs to the pass that derives from MachineFunctionPass.
LoopFinder is an analysis and should not change the MachineFunction.

Unfortunately, LoopFinder does not store the exits of the loop it detects, so we cannot check whether they modify EXEC.
We therefore opt for an over-conservative but correct approach.
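
To illustrate what an over-conservative but correct check can look like, here is a minimal standalone C++ sketch. It is not the code from the patch and does not use LoopFinder or any LLVM API. The names Edges, reachable, isFoldSafe and writesExec are hypothetical, and the CFG is a toy model in which each block only records whether it contains some EXEC definition. The check refuses the fold whenever any block that may lie on a path from the copy's definition to its use writes EXEC, without looking at where inside the block the write happens.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: edges[b] lists the blocks directly connected to block b.
using Edges = std::vector<std::vector<std::size_t>>;

// Blocks reachable from `from` by following at least one edge.
static std::vector<bool> reachable(const Edges &edges, std::size_t from) {
  std::vector<bool> seen(edges.size(), false);
  std::vector<std::size_t> work(edges[from].begin(), edges[from].end());
  while (!work.empty()) {
    std::size_t b = work.back();
    work.pop_back();
    if (seen[b])
      continue;
    seen[b] = true;
    work.insert(work.end(), edges[b].begin(), edges[b].end());
  }
  return seen;
}

// Hypothetical conservative check (an assumption, not the patch's logic):
// folding a value defined in block `def` into a use in block `use` is
// rejected if any block that can lie between them writes EXEC. Working at
// block granularity rejects some safe folds but never accepts an unsafe one
// within this toy model.
bool isFoldSafe(const Edges &succs, const std::vector<bool> &writesExec,
                std::size_t def, std::size_t use) {
  assert(succs.size() == writesExec.size());
  assert(def < succs.size() && use < succs.size());

  // Build predecessor edges so we can walk backwards from the use.
  Edges preds(succs.size());
  for (std::size_t b = 0; b < succs.size(); ++b)
    for (std::size_t s : succs[b])
      preds[s].push_back(b);

  const std::vector<bool> afterDef = reachable(succs, def);
  const std::vector<bool> beforeUse = reachable(preds, use);

  for (std::size_t b = 0; b < succs.size(); ++b) {
    // The def and use blocks themselves always count as "on the path".
    const bool fromDef = (b == def) || afterDef[b];
    const bool toUse = (b == use) || beforeUse[b];
    if (fromDef && toUse && writesExec[b])
      return false;
  }
  return true;
}
```

In this model a loop body that writes EXEC blocks the fold even when the definition and the use sit in the same block of that loop, while an EXEC write in a block that cannot reach the use is ignored.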


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67662/new/

https://reviews.llvm.org/D67662

Files:
  llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
  llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp
  llvm/lib/Target/AMDGPU/Utils/AMDGPULoopHelpers.cpp
  llvm/lib/Target/AMDGPU/Utils/AMDGPULoopHelpers.h
  llvm/lib/Target/AMDGPU/Utils/CMakeLists.txt

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67662.221368.patch
Type: text/x-patch
Size: 19958 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190923/d68c58d8/attachment.bin>

