[PATCH] CodeGen: Emit warnings when dropping profile data during PGO

Justin Bogner mail at justinbogner.com
Thu Apr 10 15:53:54 PDT 2014


Bob Wilson <bob.wilson at apple.com> writes:
> On Apr 10, 2014, at 11:53 AM, Justin Bogner <mail at justinbogner.com> wrote:
>
>> This adds a warning that triggers when profile data doesn't match for
>> the source that's being compiled with -fprofile-instr-use=. This fires
>> only once per translation unit, as warning on every mismatched function
>> would be quite noisy.
>> 
>> The warning is added to DiagnosticSemaKinds because several other
>> warnings in IRGen seem to live there and there is no
>> DiagnosticCodeGenKinds. If there's a more appropriate place, please let
>> me know.
>
>> diff --git a/include/clang/Basic/DiagnosticGroups.td b/include/clang/Basic/Diag\
>> nosticGroups.td
>> index c21a313..d6e0d64 100644
>> --- a/include/clang/Basic/DiagnosticGroups.td
>> +++ b/include/clang/Basic/DiagnosticGroups.td
>> @@ -665,3 +665,6 @@ def BackendInlineAsm : DiagGroup<"inline-asm">;
>>  def BackendFrameLargerThan : DiagGroup<"frame-larger-than">;
>>  def BackendPlugin : DiagGroup<"backend-plugin">;
>>  def RemarkBackendPlugin : DiagGroup<"remark-backend-plugin">;
>> +
>> +// Instrumentation based profiling warnings.
>> +def InstrProfMismatch : DiagGroup<"instr-prof-mismatch">;
>
> I know we had already discussed that warning name, but I’m now
> wondering whether it should be “profile-instr-mismatch” to match the
> command line options. What do you think?

After some thought, I think the parallel with the -f name is
worthwhile. I've gone with profile-instr-dropped

>> diff --git a/include/clang/Basic/DiagnosticSemaKinds.td b/include/clang/Basic/D\
>> iagnosticSemaKinds.td
>> index 0eefcd7..a213430 100644
>> --- a/include/clang/Basic/DiagnosticSemaKinds.td
>> +++ b/include/clang/Basic/DiagnosticSemaKinds.td
>> @@ -6981,4 +6981,10 @@ def warn_not_a_doxygen_trailing_member_comment : Warning\
>> <
>>    "not a Doxygen trailing comment">, InGroup<Documentation>, DefaultIgnore;
>>  } // end of documentation issue category
>> 
>> +let CategoryName = "Instrumentation Issue" in {
>> +def warn_profile_data_dropped : Warning<
>> +  "profile data mismatch for %0 of the %1 function%s1 in the translation unit"\
>> >,
>> +  InGroup<InstrProfMismatch>;
>> +} // end of instrumentation issue category
>> +
>>  } // end of sema component.
>> 
>
> This still doesn’t say much about what effect the mismatch may have or how to fix it. Maybe something like:
>
> "mismatched profile data ignored for %0 of %1 function%s1; updating
> the profile may improve optimizations”

This should be clearer:

  "profile data may be out of date: of %0 function%s0, %1 %plural{1:has|:have}1
   no data and %2 %plural{1:has|:have}2 mismatched data that will be ignored"

>> diff --git a/lib/CodeGen/CodeGenPGO.cpp b/lib/CodeGen/CodeGenPGO.cpp
>> index 2e562ca..2047302 100644
>> --- a/lib/CodeGen/CodeGenPGO.cpp
>> +++ b/lib/CodeGen/CodeGenPGO.cpp
>> @@ -273,6 +273,12 @@ llvm::GlobalVariable *CodeGenPGO::buildDataVar() {
>>    return Data;
>>  }
>> 
>> +void CodeGenPGO::updateInstrumentationStats(InstrProfStats &Stats) {
>> +  if (HasMismatchedData)
>> +    Stats.MismatchedFunctions++;
>> +  Stats.VisitedFunctions++;
>> +}
>> +
>>  void CodeGenPGO::emitInstrumentationData() {
>>    if (!CGM.getCodeGenOpts().ProfileInstrGenerate)
>>      return;
>> @@ -949,9 +955,12 @@ void CodeGenPGO::loadRegionCounts(PGOProfileData *PGOData)\
>>  {
>>    // hash value based on some characteristics of the input.
>>    RegionCounts.reset(new std::vector<uint64_t>);
>>    uint64_t Hash;
>> -  if (PGOData->getFunctionCounts(getFuncName(), Hash, *RegionCounts) ||
>> -      Hash != FunctionHash || RegionCounts->size() != NumRegionCounters)
>> +  if (PGOData->getFunctionCounts(getFuncName(), Hash, *RegionCounts))
>>      RegionCounts.reset();
>> +  if (Hash != FunctionHash || RegionCounts->size() != NumRegionCounters) {
>> +    HasMismatchedData = true;
>> +    RegionCounts.reset();
>> +  }
>>  }
>> 
>>  void CodeGenPGO::destroyRegionCounters() {
>
> Isn’t there a way to update the stats directly from
> CodeGenPGO::loadRegionCounts? I don’t see why you need to add the new
> updateInstrumentationStats function.

You're right, this is better. It also makes it easier to differentiate
mismatched data from data that's completely missing.

New patch attached.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mismatch-once-per-tu-2.patch
Type: text/x-patch
Size: 6107 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140410/6dc6c767/attachment.bin>


More information about the cfe-commits mailing list