[llvm] [X86] X86LowerTileCopy: Find dead register to use to prevent save-reload of tile register (PR #83628)

Phoebe Wang via llvm-commits llvm-commits at lists.llvm.org
Sun Apr 21 00:36:43 PDT 2024


================
@@ -72,10 +73,16 @@ FunctionPass *llvm::createX86LowerTileCopyPass() {
 bool X86LowerTileCopy::runOnMachineFunction(MachineFunction &MF) {
   const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
   const X86InstrInfo *TII = ST.getInstrInfo();
+  const TargetRegisterInfo *TRI = ST.getRegisterInfo();
+  BitVector GR64Regs =
+      TRI->getAllocatableSet(MF, TRI->getRegClass(X86::GR64RegClassID));
   bool Changed = false;
 
   for (MachineBasicBlock &MBB : MF) {
-    for (MachineInstr &MI : llvm::make_early_inc_range(MBB)) {
+    LiveRegUnits UsedRegs(*TRI);
+    UsedRegs.addLiveOuts(MBB);
----------------
phoebewang wrote:

I think we can exit the loop early by checking the tile registers, e.g.,

```
  BitVector TILERegs =
      TRI->getAllocatableSet(MF, TRI->getRegClass(X86::TILERegClassID));
  bool Changed = false;

  for (MachineBasicBlock &MBB : MF) {
    LiveRegUnits UsedRegs(*TRI);
    UsedRegs.addLiveOuts(MBB);
    if (TILERegs.anyCommon(UsedRegs.getBitVector()))
      continue;
```

https://github.com/llvm/llvm-project/pull/83628

