[llvm] [X86] X86LowerTileCopy - Find dead register to use to prevent save-reload (PR #83628)
Phoebe Wang via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 20 02:50:45 PDT 2024
================
@@ -73,10 +73,16 @@ FunctionPass *llvm::createX86LowerTileCopyPass() {
bool X86LowerTileCopy::runOnMachineFunction(MachineFunction &MF) {
const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
const X86InstrInfo *TII = ST.getInstrInfo();
+ const TargetRegisterInfo *TRI = ST.getRegisterInfo();
+ BitVector GR64Regs =
+ TRI->getAllocatableSet(MF, TRI->getRegClass(X86::GR64RegClassID));
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
- for (MachineInstr &MI : llvm::make_early_inc_range(MBB)) {
+ LiveRegUnits UsedRegs(*TRI);
+ UsedRegs.addLiveOuts(MBB);
+ for (MachineInstr &MI : llvm::make_early_inc_range(reverse(MBB))) {
+ UsedRegs.stepBackward(MI);
----------------
phoebewang wrote:
I assume the cost here is low; otherwise, it may be less efficient than iterating only after we find a tile copy instruction.
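To make the trade-off concrete, here is a toy model of the two strategies. It is not LLVM code: `RegSet`, `Inst`, `scanEager`, and `scanLazy` are hypothetical names, registers are plain bit indices, and `stepBackward` only imitates the general idea of `LiveRegUnits::stepBackward` (drop defs, then add uses). It is a sketch under those assumptions, meant only to show that both strategies pick the same register while doing different amounts of work.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical tiny register file; bit i stands for register i.
constexpr std::size_t NumRegs = 8;
using RegSet = std::bitset<NumRegs>;

// Hypothetical stand-in for a MachineInstr.
struct Inst {
  RegSet Defs;             // registers written by the instruction
  RegSet Uses;             // registers read by the instruction
  bool IsTileCopy = false; // marks the instructions that need a scratch register
};

// Move the live set from "after I" to "before I": drop defs, then add uses.
void stepBackward(RegSet &Live, const Inst &I) {
  Live &= ~I.Defs;
  Live |= I.Uses;
}

// First allocatable register that is not live, if any.
std::optional<unsigned> pickFree(const RegSet &Live, const RegSet &Allocatable) {
  for (unsigned R = 0; R < NumRegs; ++R)
    if (Allocatable[R] && !Live[R])
      return R;
  return std::nullopt;
}

// Eager: one backward walk that steps on every instruction, as in the diff.
// Cost is one step per instruction even if the block has no tile copy.
std::vector<std::optional<unsigned>>
scanEager(const std::vector<Inst> &Block, const RegSet &LiveOut,
          const RegSet &Allocatable) {
  std::vector<std::optional<unsigned>> Result(Block.size());
  RegSet Live = LiveOut;
  for (std::size_t I = Block.size(); I-- > 0;) {
    stepBackward(Live, Block[I]);
    if (Block[I].IsTileCopy)
      Result[I] = pickFree(Live, Allocatable);
  }
  return Result;
}

// Lazy: compute liveness only once a tile copy is found, by walking back
// from the end of the block to that instruction. Free when there is no
// tile copy, but repeats the walk for each one that is found.
std::vector<std::optional<unsigned>>
scanLazy(const std::vector<Inst> &Block, const RegSet &LiveOut,
         const RegSet &Allocatable) {
  std::vector<std::optional<unsigned>> Result(Block.size());
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (!Block[I].IsTileCopy)
      continue;
    RegSet Live = LiveOut;
    for (std::size_t J = Block.size(); J-- > I;)
      stepBackward(Live, Block[J]);
    Result[I] = pickFree(Live, Allocatable);
  }
  return Result;
}
```

In this model the eager walk is linear in the block size regardless of content, while the lazy walk costs nothing for blocks without a tile copy and one partial walk per tile copy otherwise. Which one wins in the real pass depends on how cheap the per-instruction step is and how often tile copies occur.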
https://github.com/llvm/llvm-project/pull/83628