[PATCH] D12781: PGO IR-level instrumentation infrastructure

Wed Sep 23 15:28:18 PDT 2015

On Tue, Sep 22, 2015 at 6:09 PM, Sean Silva <chisophugis at gmail.com> wrote:

> silvas added a reviewer: kcc.
> silvas added a comment.
>
> This should not be called "late instrumentation". That is clang-specific
> terminology. Think from the perspective of a random frontend (say, one that
> doesn't have any other PGO mechanism). There should be no mention of "FE
> instrumentation", "late instrumentation" "IR-level instrumentation", etc.
> This is just "edge profiling" or something like that.
>

these terms are mainly for the reviewers to distinguish the two
instrumentation. I'll clean up the code and comments to use "edge
profiling".

>
> A couple other high-level questions/comments:
>
> - is there a reason that this is breaking critical edges itself instead of
> using break-crit-edges pass? Is break-crit-edges dead? SanitizerCoverage
> doesn't seem to use it either.
>

As I mentioned in the email to Kostya: this is to avoid unnecessary split.
We only need to split a small set of the critical edges.

> - why is this a module pass? It operates entirely on a
> function-by-function basis.
>
Good question. When I started the implementation, I thought I would create
global functions and global vars that needed to be a module pass. But now
it seems function pass should work. I'll change this in a later patch.

- can we use GraphTraits for the MST algorithm, so that it is reusable?
> (could probably be a separate patch if so).
>
This can be done in the future in a later patch. Not sure what other pass
would use MST.

> - could we break the PGO data reading into a separate analysis? Also could
> the pass be split into more fine-grained passes? (I think the answer is yes
> to both questions)
>

I would prefer to put instrumentation and reading profile (read the count
out and populate to each BB) in the same pass. This two passes need to see
the matched IR. Separating them will create more changes of mismatches.

But I could move some of the code, like setting function attributes into a
separated pass.

> - We should directly accept a buffer containing the profile feedback data
> instead of a file name. That lets the frontend do the error handling, and
> is more flexible for usage as a library.
>

Good suggestion. I'll work on this.

>
> Adding Kostya, as this has a lot of similarities to sanitizer coverage;
> also just for general knowledge of instrumentation passes.
>
>
> ================
> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:122-126
> @@ +121,7 @@
> +
> +  // Threshold of the hot functions.
> +  const float HotFunctionThreshold = 0.01;
> +
> +  // Threshold of the cold functions.
> +  const float ColdFunctionThreshold = 0.0002;
> +
> ----------------
> Use integers.
>
> ================
> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:301-302
> @@ +300,4 @@
> +
> +// Compute the CRC32 of unsigned integers.
> +unsigned PGOLateInstrumentationFunc::crc32UnsignedBits(unsigned Chksum,
> +                                                       unsigned Value,
> ----------------
> Why crc32?
>
> ================
> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:313-314
> @@ +312,4 @@
> +
> +// Compute Hash value for the CFG: the lower 32 bits are CRC32 of the
> index
> +// value of each BB in the CFG. The higher 32 bits record the number of
> edges.
> +void PGOLateInstrumentationFunc::computeCFGHash() {
> ----------------
> This seems like a very poor use of the bits of the hash. Why not use md5
> like clang's `PGOHash`?
>
>
> http://reviews.llvm.org/D12781
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150923/634db6ac/attachment.html>