[PATCH] D91878: [CSSPGO] Pseudo probe encoding and emission.

Tue Dec 8 15:23:41 PST 2020

hoy added a comment.

In D91878#2440875 <https://reviews.llvm.org/D91878#2440875>, @wmi wrote:

> Sorry for the delay in review. Some general questions.
>
> I remember Wenlei mentioned in the RFC discussion that the overall binary size increase from pseudo probe is ~12% if CFG encoding is added. Is the size increase for object file from pseudo probe about the same as binary? I am asking because the total size of linker input matters for us. Previously we may use fission to separate the debug information from object file so it won't be accounted as linker input.
>
> According to the description, The address range will be split into multiple ranges by the address of pseudo probe. If some BB somehow doesn't have block type pseudo probe, the addresses in the BB will be mistakenly associated with the pseudo probe located before it in the binary. I don't know whether you saw cases like that. A random thought is we may generate some fake pseudo probes for the BBs which don't have them, as a mark of the BB boundary.

Thanks for reviewing this patch!

Regarding the binary size, an object file will likely have the same size inflation with an executable since the size of the pseudo probe sections are propositional to the text section. If that is an issue to the linker, we can also perform fission to the pseudo probe data. Pseudo probe encoding is currently done on function (comdat) level (see `MCPseudoProbeSection::Emit` in MCPseudoProbe.cpp`), which means a function always gets a separate real absolute probe encoded in the first place followed by relative address delta encoded for subsequent probes. Therefore fission on function level should be fine. Fission on block level won't work, as you pointed out. Would function-level fission work for you?

================
Comment at: llvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.cpp:51
+  SmallVector<InlineSite, 8> InlineStack;
+  // When it's done InlineStack look like ([66, B], [88, A])
+  // which means, Function A inlines function B at line 88, and B inlines C at
----------------
wmi wrote:
> Is the comment up-to-date? Looks like the InlineStack only contains guid and probeid?
Good catch. The numbers should refer to probe id.

================
Comment at: llvm/lib/MC/MCAsmStreamer.cpp:2135-2136
+     << Attr;
+  // Emit inline stack like
+  //  @ GUIDDirectCaller:11 @ GUIDCaller:1 @ GUIDmain:3
+  if (!InlineStack.empty()) {
----------------
wmi wrote:
> It is a reverse of sequence as the context string in the profile. Is it intentional? 
Yes, it's intentional. We feel that it's easier to read when the inline context gets long. The decoding tool (https://reviews.llvm.org/D92334) also prints in the reverse order. What do you think? Would you perfer the profile context order? 

================
Comment at: llvm/lib/MC/MCAsmStreamer.cpp:2137
+  //  @ GUIDDirectCaller:11 @ GUIDCaller:1 @ GUIDmain:3
+  if (!InlineStack.empty()) {
+    for (auto &Site : InlineStack) {
----------------
wmi wrote:
> Is it unnecessary?
Good catch. Will remove it.

================
Comment at: llvm/lib/MC/MCPseudoProbe.cpp:92-95
+  // When it comes here, the input look like:
+  //    Probe: GUID of C, ...
+  //    InlineStack: [66, B], [88, A]
+  // which means, Function A inlines function B at line 88, and B inlines C at
----------------
wmi wrote:
> I think inline stack only contains guids and pseudo probe ids but here the example mentions line numbers in the stack. I understand probe id is extracted from the discriminator in the debug Location but it will be confusing if people don't know about it. 
Right, the comment is out-dated. Will replace line numbers with probe ids.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91878/new/

https://reviews.llvm.org/D91878