[llvm] 5db6eac - [X86] Avoid useless DomTree in flags copy lowering (#97628)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 4 07:41:12 PDT 2024
Author: Alexis Engelke
Date: 2024-07-04T16:41:08+02:00
New Revision: 5db6eac244bd42aaefd0caac3f824b2e60060f52
URL: https://github.com/llvm/llvm-project/commit/5db6eac244bd42aaefd0caac3f824b2e60060f52
DIFF: https://github.com/llvm/llvm-project/commit/5db6eac244bd42aaefd0caac3f824b2e60060f52.diff
LOG: [X86] Avoid useless DomTree in flags copy lowering (#97628)
Currently, flags copy lowering does two expensive things:
- It traverses the CFG in RPO, and
- It requires a dominator tree that is not preserved.
Most notably, it is the only machine dominator tree user at -O0.
Many functions have no flag copies to begin with, so add an
early exit if EFLAGS has no COPY def.
The legacy pass manager has no way to dynamically decide whether an
analysis is required. Therefore, if there is a copy, use the dominator
tree from the pass manager if it has one; otherwise, compute it.
These changes should make the pass very cheap for the common case.
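As a rough illustration of the two ideas above (exit early when there is nothing to lower, and only build the dominator tree when no cached one is available), here is a minimal standalone sketch. It does not use LLVM at all: Opcode, Instr, Function, DomTree and lowerFlagCopies are hypothetical stand-ins invented for this example, and the build counter exists only to make the cost visible.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class Opcode { Copy, Other };

struct Instr {
  Opcode op;
};

struct Function {
  // Instructions that define the flags register.
  std::vector<Instr> flagDefs;
};

struct DomTree {
  static int builds; // counts how many trees were constructed
  explicit DomTree(const Function &) { ++builds; }
};
int DomTree::builds = 0;

// Returns true when lowering work would be performed. `cached` models a tree
// handed over by the pass manager and may be null.
bool lowerFlagCopies(const Function &f, const DomTree *cached) {
  // Early exit: no definition of the flags register is a copy.
  if (std::none_of(f.flagDefs.begin(), f.flagDefs.end(),
                   [](const Instr &i) { return i.op == Opcode::Copy; }))
    return false;

  // Use the cached tree if there is one, otherwise build a local one that
  // lives only for the duration of this call.
  std::unique_ptr<DomTree> owned;
  const DomTree *dt = cached;
  if (!dt) {
    owned = std::make_unique<DomTree>(f);
    dt = owned.get();
  }
  assert(dt && "a dominator tree must be available past this point");

  // The actual lowering would use `dt` here; it is omitted in this sketch.
  return true;
}
```

In this sketch, a function without flag copies returns before any tree is built, and a caller that already holds a tree pays no extra construction cost.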
Added:
Modified:
llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
llvm/test/CodeGen/X86/O0-pipeline.ll
llvm/test/CodeGen/X86/opt-pipeline.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
index 394947bc65c89..ab8b3dc3dd6d5 100644
--- a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
+++ b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
@@ -128,7 +128,7 @@ FunctionPass *llvm::createX86FlagsCopyLoweringPass() {
char X86FlagsCopyLoweringPass::ID = 0;
void X86FlagsCopyLoweringPass::getAnalysisUsage(AnalysisUsage &AU) const {
- AU.addRequired<MachineDominatorTreeWrapperPass>();
+ AU.addUsedIfAvailable<MachineDominatorTreeWrapperPass>();
MachineFunctionPass::getAnalysisUsage(AU);
}
@@ -258,13 +258,32 @@ bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
MRI = &MF.getRegInfo();
TII = Subtarget->getInstrInfo();
TRI = Subtarget->getRegisterInfo();
- MDT = &getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree();
PromoteRC = &X86::GR8RegClass;
if (MF.empty())
// Nothing to do for a degenerate empty function...
return false;
+ if (none_of(MRI->def_instructions(X86::EFLAGS), [](const MachineInstr &MI) {
+ return MI.getOpcode() == TargetOpcode::COPY;
+ }))
+ return false;
+
+ // We change the code, so we don't preserve the dominator tree anyway. If we
+ // got a valid MDT from the pass manager, use that, otherwise construct one
+ // now. This is an optimization that avoids unnecessary MDT construction for
+ // functions that have no flag copies.
+
+ auto MDTWrapper = getAnalysisIfAvailable<MachineDominatorTreeWrapperPass>();
+ std::unique_ptr<MachineDominatorTree> OwnedMDT;
+ if (MDTWrapper) {
+ MDT = &MDTWrapper->getDomTree();
+ } else {
+ OwnedMDT = std::make_unique<MachineDominatorTree>();
+ OwnedMDT->getBase().recalculate(MF);
+ MDT = OwnedMDT.get();
+ }
+
// Collect the copies in RPO so that when there are chains where a copy is in
// turn copied again we visit the first one first. This ensures we can find
// viable locations for testing the original EFLAGS that dominate all the
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 40648adeb91cd..ca855cfd1ad44 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -44,7 +44,6 @@
; CHECK-NEXT: Finalize ISel and expand pseudo-instructions
; CHECK-NEXT: Local Stack Slot Allocation
; CHECK-NEXT: X86 speculative load hardening
-; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: X86 EFLAGS copy lowering
; CHECK-NEXT: X86 DynAlloca Expander
; CHECK-NEXT: Fast Tile Register Preconfigure
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index 15c496bfb7f66..9bee9d0de88ae 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -125,7 +125,6 @@
; CHECK-NEXT: X86 Optimize Call Frame
; CHECK-NEXT: X86 Avoid Store Forwarding Block
; CHECK-NEXT: X86 speculative load hardening
-; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: X86 EFLAGS copy lowering
; CHECK-NEXT: X86 DynAlloca Expander
; CHECK-NEXT: MachineDominator Tree Construction