[PATCH] Initial instrumentation based PGO implementation

Bob Wilson bob.wilson at apple.com
Mon Dec 16 14:14:23 PST 2013


LGTM.

On Dec 13, 2013, at 1:43 PM, Justin Bogner <mail at justinbogner.com> wrote:

> New patch attached. The counters are now only accessed via RegionCounter
> objects, there's some better documentation of PGOProfileData, PGO
> creates the RegionCounts vector instead of it being allocated inside
> PGOProfileData, and we check for absurdly large data files.
> 
> I've reattached the command line flag and compiler-rt patches to keep
> everything in one place, but they're unchanged.
> 
> <0001-Driver-Accept-fprofile-instr-use-and-fprofile-instr-.patch><0002-CodeGen-Initial-instrumentation-based-PGO-implementa.patch><0001-profile-Rudimentary-suppport-for-PGO-instrumentation.patch>
> Justin Bogner <mail at justinbogner.com> writes:
>> John McCall <rjmccall at apple.com> writes:
>>>> 
>>> <0001-profile-Rudimentary-suppport-for-PGO-instrumentation.patch><0001-Driver-Accept-fprofile-instr-use-and-fprofile-instr-.patch><0002-CodeGen-Initial-instrumentation-based-PGO-implementa.patch>
>>> 
>>> +  unsigned Counter = PGO.getRegionCounter(&S);
>>> +  RegionCounter Cnt(PGO, Counter);
>>> 
>>> There are a number of places where you do this instead of
>>> getPGORegionCounter, maybe so that you can use Counter separately.
>>> Just use Cnt.getCounter() downstream instead.  (You should have a
>>> getCounter() method on that type.)
>> 
>> Good point. I think I can do one better than that and avoid needing
>> access to the counter indices completely. More on that below.
>> 
>>> + unsigned LoopCnt
>>> 
>>> There are a bunch of places you pass counts around as unsigned, and
>>> then a bunch of other times that you pass them as uint64_t.  That’s
>>> problematic; so is the ease of passing other kinds of data
>>> accidentally as unsigned.  The sky will be brighter and the day
>>> merrier if you make an opaque wrapper type for this.  It’ll also make
>>> your rampant use of signal values much more self-documenting.
>>> 
>>> The same goes for counter indexes, where the size problems are less
>>> but the risk of accidentally using a counter where you meant to use a
>>> count is much larger.
>> 
>> The example here is actually a counter index, but there is one place
>> where an unsigned is used for a counter value (The DefaultCount in a
>> switch). The current set up makes it much too easy to mix up counter
>> indices and counts, especially with the similar names. This will be much
>> clearer if we hide the counter indices and work with the RegionCounter
>> type instead of unsigned. I think leaving the counts themselves as
>> uint64_t makes sense given that change.
>> 
>> I'm not really sure what you're talking about with regard to signal
>> values. There are a couple of pointers that can be null, and the "three
>> counters for a loop" thing is a bit odd, but as far as counter indices
>> and counts go, every value is valid. Would you mind explaining?
>> 
>>> +class PGOProfileData {
>>> +private:
>>> +  llvm::OwningPtr<llvm::MemoryBuffer> DataBuffer;
>>> 
>>> +  llvm::StringMap<unsigned> DataOffsets;
>>> 
>>> Please document that these are offsets into the buffer.
>>> 
>>> It’s probably reasonable to assume that offsets fit within an
>>> unsigned, but you should at least protect against huge profile files.
>> 
>> Will do.
>> 
>>> +PGOProfileData::PGOProfileData(CodeGenModule &CGM, std::string Path)
>>> 
>>> Why is this being taken as a std::string, and why is it being taken by value?
>>> 
>>> +  std::vector<uint64_t> *getFunctionCounts(StringRef MangledName);
>>> 
>>> If you’re going to heap-allocate this (for some reason?), please at
>>> least use an owning pointer.
>> 
>> Heap allocating this is silly and unnecessary. I don't know what I was
>> thinking. I'll fix that.
>> 
>>> +  /// Given a region counter for a loop, return the counter for a break
>>> + /// statement within the loop. The result is undefined for
>>> non-loop counters.
>>> +  unsigned getBreakCounter(unsigned BaseCounter) { return BaseCounter + 1; }
>>> 
>>> Interesting.  You want a single counter associated with all the breaks
>>> in a loop?
>> 
>> Yep. Breaks and continues get a single counter each, so that we can
>> differentiate them from other kinds of exits (return, goto). We can then
>> just look at the difference between entries to the loop body and exits,
>> and adjust using these counters to work out the counts in all the places
>> we're interested in. See Emit(While|Do|For|CXXForRange)Stmt for details.
>> 
>>> +  CodeGenPGO PGO;
>>> +
>>> 
>>> Why are we not following the DebugInfo lead and making this something
>>> external that only gets created if PGO is actually enabled and data is
>>> present for this function?  If you’re worried about allocation
>>> overhead, you can at least make it an Optional.
>> 
>> I'm not really familiar with DebugInfo, but making this a nullable
>> pointer would force everything in CodeGen that deals with counters to
>> frequently check whether or not PGO is enabled, wouldn't it? I might be
>> missing something here.
>> 
>> Thanks for the feedback!
>> -- Justin





More information about the cfe-commits mailing list