[llvm] [X86] Avoid useless DomTree in flags copy lowering (PR #97628)
Alexis Engelke via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 3 12:50:13 PDT 2024
https://github.com/aengelke created https://github.com/llvm/llvm-project/pull/97628
Currently, flags copy lowering does two expensive things:
- It traverses the CFG in RPO, and
- It requires a dominator tree that is not preserved.
Most notably, it is the only machine dominator tree user at -O0.
Many functions have no flag copies to begin with. Therefore, add an early exit if EFLAGS has no COPY def.
The legacy pass manager has no way to dynamically decide whether an analysis is required. Therefore, if there is a copy, use the dominator tree from the pass manager if one is available; otherwise, compute it.
These changes should make the pass very cheap for the common case.
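The two ideas above (exit early when there is nothing to lower, and build the analysis only when no cached one is available) can be sketched outside of LLVM as follows. This is a minimal standalone illustration, not the code in the patch: `Opcode`, `Function`, `DomTree`, `lowerFlagCopies`, and the `TreesBuilt` counter are hypothetical stand-ins and are not LLVM types or APIs.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class Opcode { Copy, Other };

struct Function {
  // Opcodes of the instructions that define the flags register.
  std::vector<Opcode> FlagDefs;
};

struct DomTree {
  // Placeholder for an expensive analysis computation.
  void recalculate(const Function &) {}
};

// Returns true if the function would be changed. TreesBuilt counts how many
// analyses were constructed on demand, so the cost is observable.
bool lowerFlagCopies(const Function &F, DomTree *Cached, int &TreesBuilt) {
  // Early exit: without any COPY def there is nothing to lower.
  bool HasCopies = false;
  for (Opcode Op : F.FlagDefs) {
    if (Op == Opcode::Copy) {
      HasCopies = true;
      break;
    }
  }
  if (!HasCopies)
    return false;

  // Prefer the analysis handed in by the caller; otherwise build one that
  // lives only for the duration of this call.
  std::unique_ptr<DomTree> Owned;
  DomTree *DT = Cached;
  if (!DT) {
    Owned = std::make_unique<DomTree>();
    Owned->recalculate(F);
    ++TreesBuilt;
    DT = Owned.get();
  }
  assert(DT && "a dominator tree must be available past this point");

  // A real pass would rewrite the copies here using DT.
  return true;
}
```

In this sketch, a function without copies returns before any tree is built, and a caller that already holds a tree passes it in so that no second tree is constructed.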
From b47f5cf9ef5337f9b4599b04366302b7caf9e861 Mon Sep 17 00:00:00 2001
From: Alexis Engelke <engelke at in.tum.de>
Date: Wed, 3 Jul 2024 21:39:22 +0200
Subject: [PATCH] [X86] Avoid useless DomTree in flags copy lowering
Currently, flags copy lowering does two expensive things:
- It traverses the CFG in RPO, and
- It requires a dominator tree that is not preserved.
Most notably, it is the only machine dominator tree user at -O0.
Many functions have no flag copies to begin with. Therefore, add an
early exit if EFLAGS has no COPY def.
The legacy pass manager has no way to dynamically decide whether an
analysis is required. Therefore, if there is a copy, use the dominator
tree from the pass manager if one is available; otherwise, compute it.
These changes should make the pass very cheap for the common case.
---
llvm/lib/Target/X86/X86FlagsCopyLowering.cpp | 31 ++++++++++++++++++--
1 file changed, 28 insertions(+), 3 deletions(-)
diff --git a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
index 394947bc65c89..d9ed1334d7376 100644
--- a/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
+++ b/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
@@ -128,7 +128,7 @@ FunctionPass *llvm::createX86FlagsCopyLoweringPass() {
char X86FlagsCopyLoweringPass::ID = 0;
void X86FlagsCopyLoweringPass::getAnalysisUsage(AnalysisUsage &AU) const {
- AU.addRequired<MachineDominatorTreeWrapperPass>();
+ AU.addUsedIfAvailable<MachineDominatorTreeWrapperPass>();
MachineFunctionPass::getAnalysisUsage(AU);
}
@@ -258,13 +258,38 @@ bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
MRI = &MF.getRegInfo();
TII = Subtarget->getInstrInfo();
TRI = Subtarget->getRegisterInfo();
- MDT = &getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree();
PromoteRC = &X86::GR8RegClass;
if (MF.empty())
// Nothing to do for a degenerate empty function...
return false;
+ bool HasCopies = false;
+ for (const MachineInstr &DefInst : MRI->def_instructions(X86::EFLAGS)) {
+ if (DefInst.getOpcode() == TargetOpcode::COPY) {
+ HasCopies = true;
+ break;
+ }
+ }
+
+ if (!HasCopies)
+ return false;
+
+ // We change the code, so we don't preserve the dominator tree anyway. If we
+ // got a valid MDT from the pass manager, use that, otherwise construct one
+ // now. This is an optimization that avoids unnecessary MDT construction for
+ // functions that have no flag copies.
+
+ auto MDTWrapper = getAnalysisIfAvailable<MachineDominatorTreeWrapperPass>();
+ std::unique_ptr<MachineDominatorTree> OwnedMDT;
+ if (MDTWrapper) {
+ MDT = &MDTWrapper->getDomTree();
+ } else {
+ OwnedMDT = std::make_unique<MachineDominatorTree>();
+ OwnedMDT->getBase().recalculate(MF);
+ MDT = OwnedMDT.get();
+ }
+
// Collect the copies in RPO so that when there are chains where a copy is in
// turn copied again we visit the first one first. This ensures we can find
// viable locations for testing the original EFLAGS that dominate all the
@@ -688,7 +713,7 @@ bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
}
#endif
- return true;
+ return !Copies.empty();
}
/// Collect any conditions that have already been set in registers so that we