[cfe-dev] Proposal: add instrumentation for PGO and code coverage

Katzfey, Eric ekatzfey at qti.qualcomm.com
Tue Sep 10 11:35:10 PDT 2013



> -----Original Message-----
> From: Diego Novillo [mailto:dnovillo at google.com]
> Sent: Tuesday, September 10, 2013 6:11 AM
> To: Katzfey, Eric
> Cc: Bob Wilson; clang-dev Developers
> Subject: Re: [cfe-dev] Proposal: add instrumentation for PGO and code
> coverage
> 
> On Mon, Sep 9, 2013 at 4:53 PM, Katzfey, Eric
> <ekatzfey at qti.qualcomm.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: cfe-dev-bounces at cs.uiuc.edu
> >> [mailto:cfe-dev-bounces at cs.uiuc.edu]
> >> On Behalf Of Diego Novillo
> >> Sent: Saturday, September 07, 2013 6:55 AM
> >> To: Bob Wilson
> >> Cc: clang-dev Developers
> >> Subject: Re: [cfe-dev] Proposal: add instrumentation for PGO and code
> >> coverage
> >>
> >>
> >> In terms of the metadata representation, what are your thoughts on
> >> the on- disk format to use? Since we want to support multiple
> >> external profile sources, I would like to define a canonical on-disk
> >> representation that every profiling source should convert to.
> >>
> >> This way, the profile loader in the compiler needs to handle that
> >> single file format.
> >>
> > [Eric] Yes, I am thinking in terms of embedded targets where
> > instrumentation cannot work. I would like to be able to take trace
> > data generated by the program running on target and then pull out the
> > relevant profile data on branch counts to feed to PGO. How do I
> > create the profile format such that the branch counts taken from the
> > code addresses of the executable match up to the branches being
> > optimized by the compiler?
> 
> The idea used by AutoFDO (http://gcc.gnu.org/wiki/AutoFDO) is to take the
> sampled data, plus the debug info from the binary and map it to
> source+line numbers.  This gives you an approximate map of block
> frequencies which you then use to estimate branch probabilities.  This is
> what I am currently implementing.
> 
> The process is, by nature, lossy.  But at Google we have found that it
> provides a good balance between ease-of-use and optimization
> opportunities. We avoid the intrusive instrumentation and training steps of
> traditional PGO, while keeping most of its advantages.
[Eric] Okay, that is interesting. I have a tool that analyzes, in real time, trace data streaming from an embedded target processor. After the test run I get a count of how many times each function has been executed. I also get, for each conditional branch in the object code, a count of how many times it branched and how many times it didn't branch. So the trick is to map this into something that the optimizer can use. 
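For what it's worth, a rough sketch of that mapping step might look like the following. This is purely illustrative: the trace counts are made up, and addr2line() here is a stand-in for a real debug-info lookup (e.g. binutils addr2line or a DWARF library), not an actual LLVM or AutoFDO API. The idea, as Diego describes, is to key the per-address branch counts by source file and line so the optimizer can estimate branch probabilities from them.

```python
# Hypothetical sketch: aggregating per-address branch counts from a
# trace onto source locations, in the spirit of AutoFDO. All names and
# numbers here are illustrative.
from collections import defaultdict

# Trace tool output: (branch_address, taken_count, not_taken_count)
trace_counts = [
    (0x4005D0, 90, 10),
    (0x4005F8, 25, 75),
    (0x4005D0, 10, 0),   # same branch observed in a second run
]

# Stand-in for debug info mapping an address to (file, line).
debug_info = {
    0x4005D0: ("foo.c", 12),
    0x4005F8: ("foo.c", 30),
}

def addr2line(addr):
    """Placeholder for a real address-to-source lookup."""
    return debug_info[addr]

# Aggregate counts per source location; branch probability is then
# taken / (taken + not_taken) at each location.
profile = defaultdict(lambda: [0, 0])
for addr, taken, not_taken in trace_counts:
    loc = addr2line(addr)
    profile[loc][0] += taken
    profile[loc][1] += not_taken

for (fname, line), (taken, not_taken) in sorted(profile.items()):
    prob = taken / (taken + not_taken)
    print(f"{fname}:{line} taken={taken} not_taken={not_taken} p={prob:.2f}")
```

As Diego notes, anything along these lines is inherently lossy: counts from distinct object-code branches that map to the same source line get merged, so the result is an approximation, not an exact edge profile.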

> 
> >
> > Also, to the point of only having one data file per executable, it
> > seems like it would be better to associate a data file per source
> > file. That way I can more easily store my profile data along with my
> > source file and it will get used individually when that source file
> > gets pulled into different builds.
> 
> Well, the data file will naturally map back to source files via its debug info. Is
> that what you are looking for?
[Eric] That works; I'm just thinking more about logistics. For example, if I have a module with a small number of files, and it goes into a build with a huge number of files, then the data file I include with my files would be quite large yet contain only a small amount of usable information. Granted, I could use that one data file for all of the files in my module. I suppose it could also be an advantage if I get my profiling info from a small module test application.




More information about the cfe-dev mailing list