[PATCH] D112042: [llvm-profgen] Skip duplication factor outside of body sample computation

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 19 16:22:22 PDT 2021


wenlei added a comment.

> The total count straight from create_llvm_prof is always garbage.

I didn’t realize that until now.

> The reason for this is that create_llvm_prof does not disassemble the instruction.  It just uses address range to accumulate the counters.

Yes, that’s one source of overcounting in create_llvm_prof, but the issue here is different: it is simply a bug in how the duplication factor is used. Also note that llvm-profgen does disassemble instructions and calculates the total count properly.

> I had some discussions with Wei about this. But we think the function's total count is not really used in the compiler.

Well, the sample loader does look at the total count for inline decisions. See the code links below.

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L1082
->
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/SampleProfileLoaderBaseUtil.cpp#L62

> I found this when I worked on FSAFDO. If I want to get good function total counts, I always do another merge of the profile (profile_merger) -- this way the entry count will be recomputed and it's the sum of the counter inside.

Is profile_merger a separate tool you use internally? I wasn’t aware of it.
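
For illustration only, the recomputation described in the quote (the total becomes the sum of the counters inside the function) can be sketched as below. This is not profile_merger's code; the dictionary layout and the names `recompute_total`, `body` and `stale_total` are hypothetical.

```python
def recompute_total(body_samples):
    """Return a function's total count as the sum of its body sample counts."""
    return sum(body_samples.values())


# Hypothetical body samples for one function, keyed by (line offset, discriminator).
body = {(1, 0): 120, (2, 0): 120, (3, 1): 40}

# Stand-in for an unreliable total reported by the profile generator.
stale_total = 999

# Discard the reported total and rebuild it from the body counters.
fixed_total = recompute_total(body)
print(stale_total, "->", fixed_total)
```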

> Our production build uses another pipeline (it has a disassembler) and does not suffer this issue.

Okay, good to know. I thought create_llvm_prof was used in your production pipeline.

Thanks,
Wenlei

From: Rong Xu <xur at google.com>
Date: Tuesday, October 19, 2021 at 2:03 PM
To: Wenlei He <wenlei at fb.com>
Cc: Wenlei He <reviews+D112042+public+c17afeb67dc591e4 at reviews.llvm.org>, Hongtao Yu <hoy at fb.com>, Lei Wang <wlei at fb.com>, davidxl at google.com <davidxl at google.com>, rajeshwarv at google.com <rajeshwarv at google.com>, lxfind at gmail.com <lxfind at gmail.com>, Modi Mo <modimo at fb.com>
Subject: Re: [PATCH] D112042: [llvm-profgen] Skip duplication factor outside of body sample computation

Sorry. I missed that part.

The total count straight from create_llvm_prof is always garbage.
The reason for this is that create_llvm_prof does not disassemble the instruction.  It just uses address range to accumulate the counters.
 I found this when I worked on FSAFDO. If I want to get good function total counts, I always do another merge of the profile (profile_merger) -- this way the entry count will be recomputed and it's the sum of the counter inside.
I had some discussions with Wei about this. But we think the function's total count is not really used in the compiler.
I did some experiments and did not find a performance difference with fixed function total counts.

Our production build uses another pipeline (it has a disassembler) and does not suffer this issue.

Overcounting is unavoidable here since there is no basic block (all the instructions with different offsets will have a count).
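
A toy model of this overcounting, under the assumption that a sampled range's count is credited to every offset it covers and the total is then the sum over offsets. This is a sketch of the arithmetic only, not create_llvm_prof's actual code; the function names and numbers are made up.

```python
def per_offset_total(range_count, offsets):
    """Credit the range's count to each offset it covers, then sum the offsets."""
    per_offset = {off: range_count for off in offsets}
    return sum(per_offset.values())


def per_block_total(range_count):
    """With basic-block information, the range would be counted once."""
    return range_count


# One hypothetical range, sampled 100 times, covering three instruction offsets.
offsets = [0, 4, 8]
print(per_offset_total(100, offsets), "vs", per_block_total(100))
```

In this model the summed total grows with the number of offsets in the range, while a per-block count would not.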

-Rong


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112042/new/

https://reviews.llvm.org/D112042


