[PATCH] D93556: [CSSPGO][llvm-profgen] Compress recursive cycles in calling context

Fri Jan 15 09:57:34 PST 2021

wlei added inline comments.

================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.h:63
+    compressRecursionContext<T>(
+        Context, CompressionSize == -1 ? Context.size() : CompressionSize);
+  }
----------------
hoy wrote:
> Moving the implementation to ProfileGenerator.cpp so that `RecursionCompression` can be checked instead of using an extra global?
Yes, I tried this but failed because of the unit test which need to link the cpp file and the dependence need to include many other files. Not sure other ways to link the a Tool dir to the unit test, so I chose to put it into header file. what do you think?

================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.h:65
+  }
+  // Remove adjacent repeated context sequences up to a given sequene length.
+  // Note that repeated sequences are identified based on the exact call site,
----------------
hoy wrote:
> typo: sequence
fixed

================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.h:108-110
+          // Populate the non-common-suffix part of the adjacent sequence.
+          std::copy(BeginIter + Right + 1, BeginIter + Left + I + 1,
+                    BeginIter + End);
----------------
wlei wrote:
> hoy wrote:
> > wmi wrote:
> > > hoy wrote:
> > > > wlei wrote:
> > > > > wmi wrote:
> > > > > > wmi wrote:
> > > > > > > wlei wrote:
> > > > > > > > wmi wrote:
> > > > > > > > > wlei wrote:
> > > > > > > > > > wmi wrote:
> > > > > > > > > > > Could you give an example of what the sequence looks like after the population?
> > > > > > > > > > Sure, an example is added in the code comments
> > > > > > > > > Thanks for the example. I can understand where are the redundent comparisons but I havn't understood how to skip the windows by doing the population (std::copy). I don't see the std::copy change anything for the example below. Could you clarify? 
> > > > > > > > > 
> > > > > > > > > Similarly I don't see how the duplicated str is removed from Context sequence when Duplication is found above (if (Right - Left == I) is true). Could you also clarify that?
> > > > > > > > Yeah, our original design is to use a new vector to store the compressed result, then we changed to use in-place algorithm to reduce the memcpy which make readability worse. 
> > > > > > > > rewrote the code comments, added another example, please see this version is clear or not?
> > > > > > > > 
> > > > > > > Thanks, now I understand it better. However, I am surprised seems you doesn't remove the redundent comparison which I thought you tried to remove.
> > > > > > > 
> > > > > > > suppose you have the following sequence:
> > > > > > > a b c d b c d
> > > > > > > 
> > > > > > > Considering I==3.
> > > > > > > In the first iteration. abc and dbc are compared, so Left==0 and Right==2, The algorithm find the first non-common place is Left==0.  Then it updates Right to be Left + I = 3 at the end of the iteration. Note in this iteration, it have compared the common parts so it already knows the 2th/5th char in the string are the same --> 'b', and 3th/6th chars in the string are the same --> 'd'.
> > > > > > > 
> > > > > > > In the next iteration, the algorithm executes the comparison loop again:
> > > > > > >        uint32_t Left = Right;
> > > > > > >        while (Right - Left < I && Context[Left] == Context[Left + I]) {
> > > > > > >           // Find the longest suffix inside the window. When stops, Left points
> > > > > > >           // at the diverging point in the current sequence.
> > > > > > >           Left--;
> > > > > > >        }
> > > > > > > With Right==3, it will compare 3 chars before it can confirm there is duplication found, so it will compare 2th char and 5th char, 4th char and 6th char again which it already know they are the same in the first iteration. 
> > > > > > > 
> > > > > > > Do I understand it correctly?
> > > > > > > 
> > > > > > > 
> > > > > > Sorry: some typos:
> > > > > > 
> > > > > > > 2th/5th char in the string are the same --> 'b', and 3th/6th chars in the string are the same --> 'd'.
> > > > > > 
> > > > > > 'd' ==> 'c'
> > > > > > 
> > > > > > > so it will compare 2th char and 5th char, 4th char and 6th char again which it already know they are the same in the first iteration.
> > > > > > 
> > > > > > 4th char and 6th char again ==> 3th char and 6th char again
> > > > > > 
> > > > > Yes, your'are right. I think we miss this, your case is another optimizing opportunity.
> > > > > 
> > > > > This code's redundancy removing is like below:
> > > > > 
> > > > > 0 1 2 3 4 5 6 7 8
> > > > > a b c a b d a b d
> > > > > I = 3 
> > > > > 
> > > > > For 1st iteration, abc and abd are compared, we know c and d is different. So Left is 2 and Right is 2. Then Right will be Left + I = 5. It means to avoid comparing [bca, bda] and [cab, dab] since we already know  the comparison between `c` and `d` is false.
> > > > > 
> > > > > The 2nd iteration will just compare [abd, abd]
> > > > > 
> > > > > 
> > > > > So let me implement it based on your idea, thanks for your feedback.
> > > > Yes, that's a missing opportunity. Once a common suffix is found, it will become the common prefix for the next compare. Then the next compare will not need to compare the common prefix again. The optimization was not done because it adds complexity and does not change the complexity, still O(n).
> > > > 
> > > > Thanks for pointing this out, Wei and thanks for giving it a shot Lei!
> > > > The optimization was not done because it adds complexity and does not change the complexity, still O(n).
> > > 
> > > Agree. I don't know how much cpu and memory overhead are spent on compression for large case. What is your experience on it? That will help to decide the tradeoff between optimization and complexity.
> > We haven't measured that. But I think this optimization is worse pursuing. We just need a variable to store the length of the last common suffix and stop the comparing when the rest of string to compare is right in that length.
> Yeah, I think that shouldn't need too much work. Let me try it.
Implemented, added a new variable `LeftBoundary` for the check and the Left pointer will end at it not the Right - I.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93556/new/

https://reviews.llvm.org/D93556