[llvm-dev] Reining in profile instrumentation

Mon Dec 19 10:52:41 PST 2016

> On Dec 19, 2016, at 5:58 AM, Martin J. O'Riordan <martin.oriordan at movidius.com> wrote:
> 
> Thanks Vedant, and my apologies for the delay getting back to you - work got "busy".

No problem :).

> I wasn't aware of the '-fprofile-generate' option, so thanks for pointing this out.  I have tried running it and I can see the instrumentation hooks that it generates.  I assume that there is a library I have to implement to support this; can you let me know where the source for this library is?

It's compiler-rt's libclang_rt.profile_$platform.a.

(See compiler-rt/lib/profile.)

> This approach uses the C++ ctor initialisation support, which is generally fine.  However, on our embedded target, programmers often forbid static objects so that they can eliminate the start-up overhead of their initialisation; but that's another issue.

It's possible to use the profiling runtime without static initializers:

http://clang.llvm.org/docs/SourceBasedCodeCoverage.html#using-the-profiling-runtime-without-static-initializers
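
As a rough sketch of what that page describes: the program defines the '__llvm_profile_runtime' symbol itself, which lets the linker skip the runtime object containing the static initializer, and then calls '__llvm_profile_initialize_file' and '__llvm_profile_write_file' by hand. The USE_LLVM_PROFILE_RUNTIME macro and the profile_begin, profile_end and workload names are made up for this example so that it still builds without instrumentation; please check the page above for the exact steps for your compiler-rt version.

```cpp
#ifdef USE_LLVM_PROFILE_RUNTIME  // hypothetical opt-in macro for this sketch
extern "C" {
// Defining this symbol lets the linker skip the runtime object that
// contains the static initializer.
int __llvm_profile_runtime = 0;
// Provided by compiler-rt's profile runtime.
void __llvm_profile_initialize_file(void);
int __llvm_profile_write_file(void);
}
#endif

// Call once, early, before the code you want profiled.
void profile_begin() {
#ifdef USE_LLVM_PROFILE_RUNTIME
  __llvm_profile_initialize_file();
#endif
}

// Call when done. Forwards the runtime's status code, or 0 when the
// sketch is built without the profile runtime.
int profile_end() {
#ifdef USE_LLVM_PROFILE_RUNTIME
  return __llvm_profile_write_file();
#else
  return 0;
#endif
}

// Stand-in for the code being measured.
int workload(int n) {
  int sum = 0;
  for (int i = 1; i <= n; ++i) sum += i;
  return sum;
}
```

The intended use is profile_begin(), then the code of interest, then profile_end(), with the macro defined only in builds that link the profile runtime.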

> The reason we cannot simply exclude source files from instrumentation is that the majority of the real code tends to reside in headers, in massively inlined template classes, and it is the instrumentation of these that creates the real problem.  We also still want to profile our own functions in the source file itself.  For instance, a simple accessor function such as:
> 
>  // From 'header.h'
>  struct X {
>    int k;
>    int getK() const { return k; }
>    ...
>  };
> 
>  // In 'source.cpp'
>  #include "header.h"
>  ...
>  X anX;
>  ...
>  int check = anX.getK();
> 
> Now the tiny accessor function, which is usually trivially eliminated during inlining, is unnecessarily instrumented with the '__cyg_profile_func_enter' and '__cyg_profile_func_exit' calls, as is the calling function.
> 
> Magnify this by the expansion and inlining of many hundreds of such functions, and the overhead becomes very large.  Unfortunately, it also hides the true cost of the component that the programmer actually wants to measure.  This is why I was wondering whether there was a '#pragma' that might allow me to write (contrived '#pragma' syntax):
> 
>  // In 'source.cpp'
>  #pragma push profile instrumentation
>  #pragma disable profile instrumentation
>  #include "header.h"
>  #pragma pop profile instrumentation
>  ...
>  X anX;
>  ...
>  int check = anX.getK();
> 
> or an alternative mechanism.  The GCC compiler has the options '-finstrument-functions-exclude-file-list' and '-finstrument-functions-exclude-function-list' for this purpose, but these are not available in Clang/LLVM.

There isn't such a pragma right now. To implement this, I think we'd need the
frontend to attach a 'no_instrument' attribute to functions that need to be
skipped. Next, we'd need to update the frontend-based and IR-based
instrumentation logic to respect that attribute.
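
For the '-finstrument-functions' hooks specifically, GCC and Clang already accept a per-function 'no_instrument_function' attribute. It is separate from the 'no_instrument' attribute proposed above, and it has to be written on each function, so it doesn't replace a header-wide pragma. A minimal sketch, reusing the struct X from your example; the hook_calls counter and the readK function are made up for illustration:

```cpp
// Hypothetical counter, only here so the effect of the hooks is observable.
static int hook_calls = 0;

// Hooks called by code built with -finstrument-functions. They must not be
// instrumented themselves, or they would recurse.
extern "C" {
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *fn, void *call_site) {
  (void)fn;
  (void)call_site;
  ++hook_calls;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *fn, void *call_site) {
  (void)fn;
  (void)call_site;
  ++hook_calls;
}
}

struct X {
  int k;
  // Opted out: no enter/exit calls are emitted for this accessor.
  __attribute__((no_instrument_function))
  int getK() const { return k; }
};

// Not opted out: still instrumented when -finstrument-functions is passed.
int readK(const X &x) { return x.getK(); }
```

Without '-finstrument-functions' the hooks are never called and the attribute has no effect; with it, hook_calls should only reflect the functions that were not opted out.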

best,
vedant

> I will experiment with the '-mllvm -disable-preinline' option, thanks for telling me about this too.
> 
> All the best,
> 
>    MartinO
> 
> -----Original Message-----
> From: vsk at apple.com [mailto:vsk at apple.com] 
> Sent: 13 December 2016 23:12
> To: Martin J. O'Riordan
> Cc: LLVM Developers
> Subject: Re: [llvm-dev] Reining in profile instrumentation
> 
> 
>> On Dec 13, 2016, at 3:46 AM, Martin J. O'Riordan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> When either ‘-pg’ or ‘-finstrument-functions’ is used, the compiler inserts the appropriate profiling hooks.  This happens prior to inlining, so the hooks remain in place.
> 
> Have you tried compiling with -fprofile-generate? It enables IR-based profiling instrumentation, which has supported pre-inlining since r275588. That should mitigate the issue you're seeing with excessive instrumentation.
> 
> 
>> Normally this is fine, but with C++ and the heavy use of inline functions and templates, there can be a vast number of trivial functions that are normally optimised away; with the instrumentation hooks present, this does not happen, and the code becomes much larger and more expensive to execute.  Because of this, the program being profiled does not even approximately resemble the normal program with no profiling hooks, so the data gathered is of little use.
> 
> The pre-inlining should address this issue. E.g., if A calls B, B calls C, and
> B+C are inlined into A, then the profile you'd get back is {1, 0, 0}. Without
> pre-inlining, you'd get back {1, 1, 1}.
> 
> That said, I don't know what kinds of issues this would cause in practice. I'd really like to hear about how the performance of your optimized application changes when you turn pre-inlining on during the instrumentation step.
> 
> You can experiment with this with -mllvm -disable-preinline.
> 
> 
>> My question is whether there are any mechanisms in LLVM to control what functions get instrumented; for instance ‘#pragma’s that can be added to the code, especially headers, that can be used to disable the instrumentation of large groups of functions.  Or an option to remove the instrumentation during inlining?
> 
> Not that I'm aware of. One option is to not pass -fprofile-blah into translation units you don't want instrumented.
> 
> best,
> vedant
> 
>> 
>> But I really do need a way of preventing the instrumentation of large numbers of functions in a simple way.
>> 
>> Thanks,
>> 
>>            MartinO
>> 