[llvm] fd33001 - [CodeGen] Prevent overlapping subregs in getCoveringSubRegIndexes

Pierre van Houtryve via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 18 00:50:22 PST 2023


Author: Pierre van Houtryve
Date: 2023-01-18T03:50:17-05:00
New Revision: fd3300123de4fdc77cce94520d2cb58ea4c9122e

URL: https://github.com/llvm/llvm-project/commit/fd3300123de4fdc77cce94520d2cb58ea4c9122e
DIFF: https://github.com/llvm/llvm-project/commit/fd3300123de4fdc77cce94520d2cb58ea4c9122e.diff

LOG: [CodeGen] Prevent overlapping subregs in getCoveringSubRegIndexes

If `getCoveringSubRegIndexes` returns a set of subregister indexes where some subregisters overlap others, it can create unsatisfiable copy bundles that eventually cause VirtRegRewriter to error out due to "cycles in copy bundle".

We can simply prevent this by making the algorithm skip over subregisters indexes that would cause an overlap with already-covered lanes.

Note that in the case of AMDGPU, this problem is caused by the lack of subregisters indexes for 13/14/15-register tuples. We have everything up until 12, then we have 16 and 32 but nothing between 12 and 16.
This means that the best candidate to do the least amount of copies when splitting a 29-register tuple was to copy (e.g.) 0-15 and 14-29, causing an overlap.
With this change, getCoveringSubRegIndexes will now prefer using something like 0-15, 16-28 and 1

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D141576

Added: 
    

Modified: 
    llvm/lib/CodeGen/TargetRegisterInfo.cpp
    llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
    llvm/test/CodeGen/AMDGPU/split-liverange-overlapping-copies.mir

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index 38a3e68912f4c..a41d5999d9610 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -571,10 +571,14 @@ bool TargetRegisterInfo::getCoveringSubRegIndexes(
         break;
       }
 
-      // Try to cover as much of the remaining lanes as possible but
-      // as few of the already covered lanes as possible.
-      int Cover = (SubRegMask & LanesLeft).getNumLanes() -
-                  (SubRegMask & ~LanesLeft).getNumLanes();
+      // Do not cover already-covered lanes to avoid creating cycles
+      // in copy bundles (= bundle contains copies that write to the
+      // registers).
+      if ((SubRegMask & ~LanesLeft).any())
+        continue;
+
+      // Try to cover as many of the remaining lanes as possible.
+      const int Cover = (SubRegMask & LanesLeft).getNumLanes();
       if (Cover > BestCover) {
         BestCover = Cover;
         BestIdx = Idx;

diff  --git a/llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir b/llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
index 31e19475a40ea..41b85ce5c6e02 100644
--- a/llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
+++ b/llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
@@ -36,7 +36,8 @@ body:             |
   ; CHECK-NEXT:   successors: %bb.3(0x80000000)
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   undef %6.sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16:av_1024_align2 = COPY [[COPY]].sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16 {
-  ; CHECK-NEXT:     internal %6.sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30_sub31:av_1024_align2 = COPY [[COPY]].sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30_sub31
+  ; CHECK-NEXT:     internal %6.sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16:av_1024_align2 = COPY [[COPY]].sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16
+  ; CHECK-NEXT:     internal %6.sub29_sub30_sub31:av_1024_align2 = COPY [[COPY]].sub29_sub30_sub31
   ; CHECK-NEXT:   }
   ; CHECK-NEXT:   %6.sub0:av_1024_align2 = IMPLICIT_DEF
   ; CHECK-NEXT:   S_NOP 0, implicit %6.sub0

diff  --git a/llvm/test/CodeGen/AMDGPU/split-liverange-overlapping-copies.mir b/llvm/test/CodeGen/AMDGPU/split-liverange-overlapping-copies.mir
index 4cb2b6f97e051..f4cf0f43e456b 100644
--- a/llvm/test/CodeGen/AMDGPU/split-liverange-overlapping-copies.mir
+++ b/llvm/test/CodeGen/AMDGPU/split-liverange-overlapping-copies.mir
@@ -41,7 +41,8 @@ body:             |
   ; CHECK-NEXT:   successors: %bb.3(0x80000000)
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   undef %6.sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16:av_1024_align2 = COPY [[COPY]].sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16 {
-  ; CHECK-NEXT:     internal %6.sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30_sub31:av_1024_align2 = COPY [[COPY]].sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30_sub31
+  ; CHECK-NEXT:     internal %6.sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16:av_1024_align2 = COPY [[COPY]].sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16
+  ; CHECK-NEXT:     internal %6.sub29_sub30_sub31:av_1024_align2 = COPY [[COPY]].sub29_sub30_sub31
   ; CHECK-NEXT:   }
   ; CHECK-NEXT:   %6.sub0:av_1024_align2 = IMPLICIT_DEF
   ; CHECK-NEXT:   S_NOP 0, implicit %6.sub0
@@ -116,7 +117,8 @@ body:             |
   ; CHECK-NEXT:   successors: %bb.3(0x80000000)
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   undef %6.sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16:av_1024 = COPY [[COPY]].sub1_sub2_sub3_sub4_sub5_sub6_sub7_sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15_sub16 {
-  ; CHECK-NEXT:     internal %6.sub15_sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30:av_1024 = COPY [[COPY]].sub15_sub16_sub17_sub18_sub19_sub20_sub21_sub22_sub23_sub24_sub25_sub26_sub27_sub28_sub29_sub30
+  ; CHECK-NEXT:     internal %6.sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16:av_1024 = COPY [[COPY]].sub17_lo16_sub17_hi16_sub18_lo16_sub18_hi16_sub19_lo16_sub19_hi16_sub20_lo16_sub20_hi16_sub21_lo16_sub21_hi16_sub22_lo16_sub22_hi16_sub23_lo16_sub23_hi16_sub24_lo16_sub24_hi16_sub25_lo16_sub25_hi16_sub26_lo16_sub26_hi16_sub27_lo16_sub27_hi16_sub28_lo16_sub28_hi16
+  ; CHECK-NEXT:     internal %6.sub29_sub30:av_1024 = COPY [[COPY]].sub29_sub30
   ; CHECK-NEXT:   }
   ; CHECK-NEXT:   %6.sub0:av_1024 = IMPLICIT_DEF
   ; CHECK-NEXT:   %6.sub31:av_1024 = IMPLICIT_DEF


        


More information about the llvm-commits mailing list