[PATCH] D34838: Prototype: Reduce llvm-profdata merge memory usage further

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 5 23:20:27 PDT 2017


On Wed, Jul 5, 2017 at 1:57 PM Xinliang David Li <davidxl at google.com> wrote:

> Also, the InstrProfRecordWithKey can inherit from InstProfRecord ...
>
> David
>
> On Wed, Jul 5, 2017 at 1:54 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Wed, Jul 5, 2017 at 1:27 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 7:48 PM David Li via Phabricator <
>>> reviews at reviews.llvm.org> wrote:
>>>
>>>> davidxl added a comment.
>>>>
>>>> In summary, I think this patch should be minimized by
>>>>
>>>
>>> While I appreciate/value minimizing churn - I'm mostly motivated
>>> by/trying to ensure the best form for the code, rather than the smallest
>>> change here.
>>>
>>
>>
>> It is not just about minimizing the churns.  Currently the class
>> InstrProfRecord abstracts everything the client needs to know about the
>> per-function profile which is nice thing we'd like to maintain.
>>
>>
>>>
>>> If they happen to coincide, that's great.
>>>
>>>
>>>> 1. sinking the minimal number of interfaces into Counter subclass
>>>> 2. for any sunk interface, provide a wrapper in the parent class
>>>> InstrProfRecord
>>>> 3. avoid exposing internal members to client side other than
>>>> InstrProfWriters (may be making those method private and making Writer a
>>>> friend ?)
>>>>
>>>
>>> I did consider this (& certainly open to discussing the tradeoff further
>>> to see if it's the right one) - my thinking was more that this change
>>> indicates a new use case/way to pivot the InstrProfRecord data structure
>>> that's not exclusively in the domain of the InstrProfWriter.
>>>
>>> Looking at it that way, it seems to me that it's a good thing to move
>>> more parts of the code over to only caring about the part of the data
>>> structure that's relevant to them - which in many cases is just the
>>> counters (though I'd love a better name that "CounterHolder" (& could make
>>> it a non-nested class, especially if it had a good standalone name) - even
>>> potentially using the InstrProfRecord as the name for this name/hash-less
>>> part of the data and "NamedInstrProfRecord" for the
>>> Name+Hash+InstrProfRecord - though I haven't prototyped that/given it a lot
>>> of thought)
>>>
>>
>> I like the suggestion to promote the CounterHolder class into the first
>> class citizen -- InstrProfRecord with the existing InstrProfRecord be
>> renamed to NamedInstrProfRecord or InstrProfRecordWithKey.   I expect
>> majority of the APIs do not care about the name key anyway. The only
>> affected interfaces are related to profile reader iterators and a few
>> writer changes.
>>
>
Ah, cool - the inheritance you suggested was a handy idea too.

I've updated with this change - hopefully removed any of the lingering
changes.

It could be taken a little further - currently the only place that really
'needs' to expose NamedInstrProfRecord is the InstrProfReader's iteration
API (the query (name+hash -> record) API doesn't need to return a
NamedInstrProfRecord because the user passed in name+hash, so they clearly
already know that). NamedInstrProfRecord could be moved into
InstrProfReader as only relevant there - though currently llvm-profdata.cpp
uses the NamedInstrProfRecord and passes it back to InstrProfWriter - but
it could be decomposed into Name+Hash+InstrProfRecord there & then
InstrProfWriter wouldn't know anything about NamedInstrProfRecord and it'd
really be only part of InstrProfReader.

Happy to do that if you think it's a good idea/worth a shot (either in this
or in a follow-up).

- Dave


>
>> I think the above will be clean.
>>
>> thanks,
>>
>> David
>>
>>
>>>
>>>
>>>> 4. split out unrelated changes
>>>>
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProf.h:253
>>>> +                       ArrayRef<InstrProfValueData> VDs, uint64_t Sum,
>>>> +                       InstrProfValueKind ValueKind, uint32_t
>>>> MaxMDCount);
>>>>
>>>> ----------------
>>>> unrelated format change?
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProf.h:701
>>>> +    /// counts of all target values at this site.
>>>> +    inline uint64_t getValueForSite(InstrProfValueData Dest[],
>>>> +                                    uint32_t ValueKind, uint32_t Site)
>>>> const;
>>>> ----------------
>>>> format change?
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProfData.inc:317
>>>>     */
>>>> -  void deserializeTo(InstrProfRecord &Record,
>>>> -                     InstrProfRecord::ValueMapType *VMap);
>>>> +  void deserializeTo(InstrProfRecord::CounterHolder &Record,
>>>> +                     InstrProfRecord::CounterHolder::ValueMapType
>>>> *VMap);
>>>> ----------------
>>>> Why this change? IMO, we should reduce exposing CounterHolder if
>>>> possible  and maintain/use APIs at the InstrProfRecord level.
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProfData.inc:361
>>>>     */
>>>> -  static uint32_t getSize(const InstrProfRecord &Record);
>>>> +  static uint32_t getSize(const InstrProfRecord::CounterHolder
>>>> &Counters);
>>>>    /*!
>>>> ----------------
>>>> Same here.
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProfData.inc:530
>>>>  INSTR_PROF_VISIBILITY uint32_t
>>>> -getValueProfDataSize(ValueProfRecordClosure *Closure) {
>>>> +getValueProfDataSize(const void *Record, const ValueProfRecordVTable
>>>> *VTable) {
>>>>    uint32_t Kind;
>>>> ----------------
>>>> Can this be split into different patch?
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/ProfileData/InstrProfWriter.h:66
>>>>    /// Write \c Record in text format to \c OS
>>>> -  static void writeRecordInText(const InstrProfRecord &Record,
>>>> +  static void writeRecordInText(StringRef Name, uint64_t Hash,
>>>> +                                const InstrProfRecord::CounterHolder
>>>> &Counters,
>>>> ----------------
>>>> There is no need to change this interface. You are passing name, hash
>>>> etc, so might as well create a InstrProfRecord.
>>>>
>>>
>>> But then the counters must be copied or moved into the InstrProfRecord,
>>> making it destructive (or expensive if it's a copy instead of a move) to
>>> the writing step (not the end of the world for InstrProfWriter - but seems
>>> unnecessarily shuffling things around when this is an implementation detail
>>> of the writer anyway).
>>>
>>>
>>>>
>>>>
>>>> ================
>>>> Comment at: lib/ProfileData/InstrProf.cpp:641
>>>> +static const ValueProfRecordVTable InstrProfRecordCountersVTable = {
>>>> +    getNumValueKindsInstrProfCounters,
>>>> +    getNumValueSitesInstrProfCounters,
>>>> ----------------
>>>> These naming changes are not necessary
>>>>
>>>
>>> Seemed relevant if the name of the data structure being operated on
>>> changes (but I agree, if the design ends up being not to reduce the surface
>>> area of code that only needs to handle the counters, then these names
>>> wouldn't need to change).
>>>
>>>
>>>>
>>>>
>>>> ================
>>>> Comment at: lib/ProfileData/InstrProf.cpp:651
>>>> +uint32_t
>>>> +ValueProfData::getSize(const InstrProfRecord::CounterHolder &Counters)
>>>> {
>>>> +  return getValueProfDataSize(&Counters,
>>>> &InstrProfRecordCountersVTable);
>>>> ----------------
>>>> The interface change here is not necessary.
>>>>
>>>>
>>>> ================
>>>> Comment at: lib/ProfileData/InstrProfWriter.cpp:152
>>>> +              .second; // FIXME: Avoid copying by fixing the addRecord
>>>> API
>>>> +      SummaryBuilder->addRecord(std::move(R));
>>>>
>>>> ----------------
>>>> davidxl wrote:
>>>> > Summary Builder does not need name, hash etc. so perfhaps fixing API
>>>> is the right way.
>>>> On second thought, to reduce exposing implementation details and
>>>> minimizing interface changes, it is cleaner to just do
>>>>
>>>> SummaryBuilder->addRecord(InstrProfRecord(K, ProfileData.first,
>>>> Counters));
>>>>
>>>
>>> But the SummaryBuilder doesn't need/use the name or hash - so it seems
>>> strange to me to pass that. (again, to the design tradeoff/discussion above)
>>>
>>>
>>>>
>>>> Similarly for the SerializeFrom call below and ValueProfData::getSize()
>>>> call above.
>>>>
>>>>
>>>> https://reviews.llvm.org/D34838
>>>>
>>>>
>>>>
>>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170706/c3245258/attachment.html>


More information about the llvm-commits mailing list