[PATCH] D156850: [NFC][Coroutines] Use a reverse post-order to guide the computation of cross-suspend information so that it reaches a fixed point faster.

witstorm via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 1 22:33:05 PDT 2023


witstorm95 created this revision.
Herald added subscribers: ChuanqiXu, hiraditya.
Herald added a project: All.
witstorm95 requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Fixes https://github.com/llvm/llvm-project/issues/62348

Propagate cross-suspend-point information along a reverse post-order traversal of the function's basic blocks.
This does not modify the original function; it only selects a better traversal order for the fixed-point iteration.
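
To illustrate why the traversal order matters, here is a minimal standalone sketch; it is not the CoroFrame.cpp code. It assumes a toy CFG stored as successor lists and a simplified "Consumes"-style set that is pushed from each block to its successors until nothing changes. The names Graph, reversePostOrder and passesToFixedPoint are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B (hypothetical type).
using Graph = std::vector<std::vector<int>>;

// Compute a reverse post-order of the blocks reachable from Entry via DFS.
std::vector<int> reversePostOrder(const Graph &Succs, int Entry) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<int> PostOrder;
  std::function<void(int)> DFS = [&](int B) {
    Visited[B] = true;
    for (int S : Succs[B])
      if (!Visited[S])
        DFS(S);
    PostOrder.push_back(B); // B is finished after all of its successors.
  };
  DFS(Entry);
  std::reverse(PostOrder.begin(), PostOrder.end());
  return PostOrder;
}

// Propagate "which blocks can reach me" sets along CFG edges, visiting the
// blocks in the given Order on every pass. Returns the number of passes run,
// including the final pass that detects that nothing changed.
int passesToFixedPoint(const Graph &Succs, const std::vector<int> &Order) {
  std::vector<std::set<int>> Consumes(Succs.size());
  for (std::size_t B = 0; B < Succs.size(); ++B)
    Consumes[B].insert(static_cast<int>(B)); // Every block consumes itself.

  int Passes = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++Passes;
    for (int B : Order) {
      for (int S : Succs[B]) {
        std::size_t Before = Consumes[S].size();
        Consumes[S].insert(Consumes[B].begin(), Consumes[B].end());
        Changed |= Consumes[S].size() != Before;
      }
    }
  }
  return Passes;
}
```

On a straight-line chain whose block numbers run against the control flow (4 -> 3 -> 2 -> 1 -> 0, entry block 4), visiting the blocks by index moves information only one edge further per pass, while the reverse post-order visits every block after its predecessor, so a single propagating pass plus one confirming pass is enough. Loops in the CFG can still require extra passes in either order.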

Before the patch:

  n: 20000
  4.31user 0.11system 0:04.44elapsed 99%CPU (0avgtext+0avgdata 552352maxresident)k
  0inputs+8848outputs (0major+126254minor)pagefaults 0swaps
  
  n: 40000
  11.24user 0.40system 0:11.66elapsed 99%CPU (0avgtext+0avgdata 1788404maxresident)k
  0inputs+17600outputs (0major+431105minor)pagefaults 0swaps
  
  n: 60000
  21.65user 0.96system 0:22.62elapsed 99%CPU (0avgtext+0avgdata 3809836maxresident)k
  0inputs+26352outputs (0major+934749minor)pagefaults 0swaps
  
  n: 80000
  37.05user 1.53system 0:38.58elapsed 99%CPU (0avgtext+0avgdata 6602396maxresident)k
  0inputs+35096outputs (0major+1622584minor)pagefaults 0swaps
  
  n: 100000
  51.87user 2.67system 0:54.54elapsed 99%CPU (0avgtext+0avgdata 10210736maxresident)k
  0inputs+43848outputs (0major+2518945minor)pagefaults 0swaps

After the patch:

  n: 20000
  3.08user 0.12system 0:03.21elapsed 99%CPU (0avgtext+0avgdata 551012maxresident)k
  0inputs+8848outputs (0major+129349minor)pagefaults 0swaps
  
  n: 40000
  5.88user 0.33system 0:06.22elapsed 99%CPU (0avgtext+0avgdata 1789248maxresident)k
  0inputs+17600outputs (0major+435096minor)pagefaults 0swaps
  
  n: 60000
  8.84user 0.77system 0:09.63elapsed 99%CPU (0avgtext+0avgdata 3807800maxresident)k
  0inputs+26352outputs (0major+939119minor)pagefaults 0swaps
  
  n: 80000
  11.64user 1.58system 0:13.23elapsed 99%CPU (0avgtext+0avgdata 6604708maxresident)k
  0inputs+35096outputs (0major+1629566minor)pagefaults 0swaps
  
  n: 100000
  15.21user 2.56system 0:17.79elapsed 99%CPU (0avgtext+0avgdata 10208828maxresident)k
  8inputs+43848outputs (0major+2526611minor)pagefaults 0swaps


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156850

Files:
  llvm/lib/Transforms/Coroutines/CoroFrame.cpp


Index: llvm/lib/Transforms/Coroutines/CoroFrame.cpp
===================================================================
--- llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -112,10 +112,12 @@
   }
 
   /// Compute the BlockData for the current function in one iteration.
-  /// Returns whether the BlockData changes in this iteration.
   /// Initialize - Whether this is the first iteration, we can optimize
   /// the initial case a little bit by manual loop switch.
-  template <bool Initialize = false> bool computeBlockData();
+  /// The parameter "RPOT" is a reverse post order.
+  /// Returns whether the BlockData changes in this iteration.
+  template <bool Initialize = false>
+  bool computeBlockData(ReversePostOrderTraversal<Function *> &RPOT);
 
 public:
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
@@ -223,12 +225,15 @@
 }
 #endif
 
-template <bool Initialize> bool SuspendCrossingInfo::computeBlockData() {
-  const size_t N = Mapping.size();
+template <bool Initialize>
+bool SuspendCrossingInfo::computeBlockData(
+    ReversePostOrderTraversal<Function *> &RPOT) {
   bool Changed = false;
 
-  for (size_t I = 0; I < N; ++I) {
-    auto &B = Block[I];
+  /// Use reverse post order to guide the computation.
+  for (auto BB : RPOT) {
+    auto BBNo = Mapping.blockToIndex(BB);
+    auto &B = Block[BBNo];
 
     // We don't need to count the predecessors when initialization.
     if constexpr (!Initialize)
@@ -261,7 +266,7 @@
     }
 
     if (B.Suspend) {
-      // If block S is a suspend block, it should kill all of the blocks it
+      // If block B is a suspend block, it should kill all of the blocks it
       // consumes.
       B.Kills |= B.Consumes;
     } else if (B.End) {
@@ -273,8 +278,8 @@
     } else {
       // This is reached when B block it not Suspend nor coro.end and it
       // need to make sure that it is not in the kill set.
-      B.KillLoop |= B.Kills[I];
-      B.Kills.reset(I);
+      B.KillLoop |= B.Kills[BBNo];
+      B.Kills.reset(BBNo);
     }
 
     if constexpr (!Initialize) {
@@ -283,9 +288,6 @@
     }
   }
 
-  if constexpr (Initialize)
-    return true;
-
   return Changed;
 }
 
@@ -325,9 +327,11 @@
       markSuspendBlock(Save);
   }
 
-  computeBlockData</*Initialize=*/true>();
-
-  while (computeBlockData())
+  /// Use reverse post order to guide the computation. It will lead to reach
+  /// fixed point faster.
+  ReversePostOrderTraversal<Function *> RPOT(&F);
+  computeBlockData</*Initialize=*/true>(RPOT);
+  while (computeBlockData</*Initialize*/ false>(RPOT))
     ;
 
   LLVM_DEBUG(dump());



