[llvm] [TwoAddressInstruction] Propagate undef flags for partial defs (PR #79286)

Carl Ritson via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 24 05:15:41 PST 2024


https://github.com/perlfu created https://github.com/llvm/llvm-project/pull/79286

If part of a register (lowered from REG_SEQUENCE) is undefined, undef flags should be propagated to the uses of those lanes. This is done only when live intervals are available, because they are needed to match uses to defs correctly, and because the primary goal is to allow precise computation of subrange intervals.
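
For readers who want the idea without the LLVM types, here is a minimal standalone sketch of the lane-mask logic. It is not the code in the patch. It assumes a plain 64-bit integer in place of LaneBitmask and an integer id in place of VNInfo, and the names SeqOperand, Use, collectUndefLanes and propagateUndef are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for llvm::LaneBitmask: one bit per lane (assumption).
using LaneMask = std::uint64_t;

// One (source, sub-register index) pair of a REG_SEQUENCE, reduced to the
// two facts this sketch needs.
struct SeqOperand {
  LaneMask Lanes; // lanes written through this sub-register index
  bool IsUndef;   // the source operand carries the undef flag
};

// One use of the destination register.
struct Use {
  LaneMask Lanes;  // lanes read; 0 models a use with no sub-register index
  int ValueNumber; // hypothetical stand-in for the VNInfo live at this use
  bool IsUndef;    // undef flag on the use operand
};

// Collect the lanes that the lowered REG_SEQUENCE leaves undefined.
LaneMask collectUndefLanes(const std::vector<SeqOperand> &Ops) {
  LaneMask Undef = 0;
  for (const SeqOperand &Op : Ops) {
    assert(Op.Lanes != 0 && "a sub-register index must cover at least one lane");
    if (Op.IsUndef)
      Undef |= Op.Lanes;
  }
  return Undef;
}

// Flag sub-register uses that read an undefined lane of the same value.
void propagateUndef(LaneMask UndefLanes, int DefVN, std::vector<Use> &Uses) {
  for (Use &U : Uses) {
    if (U.IsUndef || U.Lanes == 0)
      continue; // already flagged, or not a sub-register use
    if (U.ValueNumber != DefVN)
      continue; // this use is reached by a different definition
    if ((UndefLanes & U.Lanes) != 0)
      U.IsUndef = true;
  }
}
```

The value-number comparison is the reason live intervals are needed: without it, a use reached by some other definition of the same register could be flagged undef by mistake.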

From 19bba42a9cc4e0f473d95795a40a9333d179950a Mon Sep 17 00:00:00 2001
From: Carl Ritson <carl.ritson at amd.com>
Date: Wed, 24 Jan 2024 22:02:09 +0900
Subject: [PATCH] [TwoAddressInstruction] Propagate undef flags for partial
 defs

If part of a register (lowered from REG_SEQUENCE) is undefined then
we should propagate undef flags to uses of those lanes.
This is only performed when live intervals are present as it
requires live intervals to correctly match uses to defs, and the
primary goal is to allow precise computation of subrange intervals.
---
 .../lib/CodeGen/TwoAddressInstructionPass.cpp | 30 +++++++++++++++----
 .../AMDGPU/GlobalISel/extractelement.ll       |  1 +
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
index 74d7904aee33a2d..0b8bef24d1a7bb6 100644
--- a/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+++ b/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
@@ -1929,21 +1929,26 @@ eliminateRegSequence(MachineBasicBlock::iterator &MBBI) {
   Register DstReg = MI.getOperand(0).getReg();
 
   SmallVector<Register, 4> OrigRegs;
+  VNInfo *DefVN = nullptr;
   if (LIS) {
     OrigRegs.push_back(MI.getOperand(0).getReg());
     for (unsigned i = 1, e = MI.getNumOperands(); i < e; i += 2)
       OrigRegs.push_back(MI.getOperand(i).getReg());
+    if (LIS->hasInterval(DstReg)) {
+      DefVN = LIS->getInterval(DstReg).Query(
+        LIS->getInstructionIndex(MI)).valueOut();
+    }
   }
 
+  LaneBitmask UndefLanes = LaneBitmask::getNone();
   bool DefEmitted = false;
-  bool DefIsPartial = false;
   for (unsigned i = 1, e = MI.getNumOperands(); i < e; i += 2) {
     MachineOperand &UseMO = MI.getOperand(i);
     Register SrcReg = UseMO.getReg();
     unsigned SubIdx = MI.getOperand(i+1).getImm();
     // Nothing needs to be inserted for undef operands.
     if (UseMO.isUndef()) {
-      DefIsPartial = true;
+      UndefLanes |= TRI->getSubRegIndexLaneMask(SubIdx);
       continue;
     }
 
@@ -1991,11 +1996,24 @@ eliminateRegSequence(MachineBasicBlock::iterator &MBBI) {
       MI.removeOperand(j);
   } else {
     if (LIS) {
-      // Force interval recomputation if we moved from full definition
-      // of register to partial.
-      if (DefIsPartial && LIS->hasInterval(DstReg) &&
-          MRI->shouldTrackSubRegLiveness(DstReg))
+      // Force live interval recomputation if we moved to a partial definition
+      // of the register.  Undef flags must be propagated to uses of undefined
+      // subregisters for accurate interval computation.
+      if (UndefLanes.any() && DefVN && MRI->shouldTrackSubRegLiveness(DstReg)) {
+        auto &LI = LIS->getInterval(DstReg);
+        for (MachineOperand &UseOp : MRI->use_operands(DstReg)) {
+          unsigned SubReg = UseOp.getSubReg();
+          if (UseOp.isUndef() || !SubReg)
+            continue;
+          auto *VN = LI.getVNInfoAt(LIS->getInstructionIndex(*UseOp.getParent()));
+          if (DefVN != VN)
+            continue;
+          LaneBitmask LaneMask = TRI->getSubRegIndexLaneMask(SubReg);
+          if ((UndefLanes & LaneMask).any())
+            UseOp.setIsUndef(true);
+        }
         LIS->removeInterval(DstReg);
+      }
       LIS->RemoveMachineInstrFromMaps(MI);
     }
 
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
index ac153183be642a2..3f11c122a681463 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,GPRIDX %s
+; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx900 -early-live-intervals -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,GPRIDX %s
 ; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,MOVREL %s
 ; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx1010 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX10PLUS,GFX10 %s
 ; RUN: llc -global-isel -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx1100 -amdgpu-enable-delay-alu=0 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX10PLUS,GFX11 %s


