[PATCH] D123050: [BOLT] Cache-Aware Tail Duplication

Rafael Auler via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 31 16:12:28 PDT 2022


rafauler added inline comments.


================
Comment at: bolt/lib/Passes/TailDuplication.cpp:393
+  if (IsForwardJump) {
+    JumpDistance = (DstAddr + DstSize) - (SrcAddr);
+  } else {
----------------
spupyrev wrote:
> rafauler wrote:
> > By reading "jump distance" I would expect to see
> >   JumpDistance = DstAddr - (SrcAddr + SrcSize)
> > as an approximation of the jump distance in a forward jump.  E.g.
> >   BlockA:  <src address>
> >   nop
> >   nop
> >   jmp BlockB 
> >   BlockB:  <src address + src size>     also  <dstaddr>
> >   nop
> >   nop
> >   nop
> >   <dstaddr + dst size>
> > 
> > In the example above, because jmp BlockB is a fall-through, it should calculate 0 as the distance. Using the definition I provided above, it is zero, but not in the source code. In the source code of this diff, JumpDistance would be "src size + dst size".  Is this to calculate how much cache space is used by using these two blocks? What's the idea? Can you make more explicit the reasoning in the comments? From reading the comments, I got the impression that a FT should be calculated as zero distance.
> Thanks for reading the code so carefully! I think that `distance` is not the best name here and causes the confusion. 
> 
> The idea is to quantify the impact of the block placement on the instruction cache. In particular, we want to distinguish between the case when `BlockB` is "large" and the case when it is "small" (eg a few bytes). It feels that the latter is more i-cache friendly than the former; thus, we'd want to duplicate `BlockB` only when it is "small". Notice that this effect won't be achieved when we compute the "correct" distance as you describe (which is 0 independently of the size of the block).
> 
> The above is of course just an unproved intuition that perform reasonably well in practice. I am happy to experiment with alternatives and/or extensions, if you have any. Also let me know if you see a better name for `distance`.
I think the current implementation makes sense (to quantify size and not just the jump distance) and I don't have any suggestions on improving that. I would probably try something similar. I was just a bit confused because the comment in line 105 of the header file mention that a fallthough jump will map to a 1.0 score. Maybe update the comment?

For the name, I would probably use "JumpScore", but I don't have any strong opinions on it, so if you want to keep JumpDistance, that's fine too.


================
Comment at: bolt/lib/Passes/TailDuplication.cpp:402
+                          opts::TailDuplicationMaxCacheDistance;
+  return (IsForwardJump ? 1.0 : opts::TailDuplicationCacheBackwardWeight) *
+         Prob * Count;
----------------
spupyrev wrote:
> rafauler wrote:
> > Why are we always returning 1.0 in case of a forward jump? Is this correct? From what I read in the comments of this function, I would expect to see Prob * Count here.
> Your assumption is correct, as well as the implementation. Likely my formatting makes the code harder to read.
> Do you think the following equivalent would be more readable?
> 
> ```
> if (IsForwardJump)
>   return Prob * Count;
> else
>   return opts::TailDuplicationCacheBackwardWeight * Prob * Count;
> ```
Ohhh sorry, now I see it. You can keep the current formatting, that's fine. Whichever you prefer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123050/new/

https://reviews.llvm.org/D123050



More information about the llvm-commits mailing list