<div dir="ltr"><span id="gmail-docs-internal-guid-8610d975-7fff-717f-902c-50653de64496"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">This RFC describes the IR metadata format for the sanitizer-based heap profiler (MemProf) data when it is fed back into a subsequent compile for profile guided heap optimization. Additional background can be found in the following RFCs:</span></p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">RFC: Sanitizer-based Heap Profiler [1]</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">RFC: A binary serialization format for MemProf [2]</span></p></li></ul><div><font color="#000000" face="Arial"><span style="font-size:14.6667px;white-space:pre-wrap"><br></span></font></div><div><span id="gmail-docs-internal-guid-2c33e12a-7fff-fdc2-328a-57b9d72e8a87"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We look forward to your feedback.</span></p><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Authors: </span><a href="mailto:tejohnson@google.com" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">tejohnson@google.com</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">, </span><a href="mailto:snehasishk@google.com" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">snehasishk@google.com</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">, </span><a href="mailto:davidxl@google.com" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">davidxl@google.com</span></a></span></div><h1 dir="ltr" style="line-height:1.38;margin-top:20pt;margin-bottom:6pt"><span style="font-size:20pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Requirements</span></h1><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The profile format will need to be reasonably compact within IR, but also facilitate efficiently identifying callsites within the stack contexts for use in interprocedural (and cross module) optimization for context disambiguation. These optimizations will require transformations to disambiguate the context at a particular allocation, which will require call graph based analysis and optimization.</span></p><h1 dir="ltr" style="line-height:1.38;margin-top:20pt;margin-bottom:6pt"><span style="font-size:20pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Input Profile Format</span></h1><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The profile will be in an index binary format, which is detailed in [2]. In the index format, the profiles for each allocation site will be located with the profile data for the function containing the allocation call site. Each allocation may have multiple profile entries (known as a MIB, or Memory Info Block) uniquely identified by stack context. The entries in the stack context will be symbolized and include file, line and discriminator.</span></p><h1 dir="ltr" style="line-height:1.38;margin-top:20pt;margin-bottom:6pt"><span style="font-size:20pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Metadata Format</span></h1><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Similar to branch weights and value profile data from regular PGO, the PGHO profile will be annotated as metadata onto relevant instructions. A natural instruction to attach the profile metadata is on the allocation callsite, so these allocation calls can be identified and handled by the subsequent heap optimization pass. As an example, this profile data can be used to enable automatic application of hot and cold hints to allocations, for use by a runtime allocator such as tcmalloc, where support for such allocation hints was recently added [3].</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">However, in order to identify ancestor callsites within an allocation’s call stack context that require modification for disambiguating the context at the allocation site, e.g. via cloning, we will also want to attach metadata to these callsites. This is particularly important for contexts that cross module boundaries, so that we can identify them in ThinLTO summaries for cross module coordination of context transformations.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">To identify and correlate entries in a context, we will use a unique identifier for each stack entry. Specifically, we will use the 64-bit value from the stack entry table in the indexed profile format which is formed from the index into the file path table along with the line and discriminator. Another option would be to represent the stack entries using existing debug metadata. However, for stack entries in another module we would need to synthesize additional debug location metadata in the module containing MIB profile data that references that stack context entry.</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Assume the following working example. For simplicity, all are shown as being in the same module, however, these function definitions could theoretically be located in multiple different modules.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">x.cc</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  1 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">main() {</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  2 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">   foo();  // stack entry id: 123</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  3 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">}</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  4 </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  5 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">foo() {</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  6 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">   baz();  // stack entry id: 234</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  7 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">}  </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  8 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">      </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  9 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">baz() {</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 10 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">   if (x)</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 11 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">      bar();  // stack entry id: 345</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 12 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">   else</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 13 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">      bar();  // stack entry id: 456</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 14 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">}     </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 15 </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 16 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">bar() {</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 17 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">   malloc(4);  // stack entry id: 567</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(193,101,28);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> 18 </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">}  </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The call to malloc has 2 possible calling contexts:</span></p><ol style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:decimal;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">main -> foo (x.cc:2) -> baz (x.cc:6) -> bar (x.cc:11) -> malloc (x.cc:17)</span></p></li><li dir="ltr" style="list-style-type:decimal;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">main -> foo (x.cc:2) -> baz (x.cc:6) -> bar (x.cc:13) -> malloc (x.cc:17)</span></p></li></ol><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">where the stack entry id for each callsite, taken from the profile’s stack entry table contents, is shown in the code comments. The corresponding full contexts in terms of stack entry ids, listed from the leaf allocation callsite up to the root are:</span></p><ol style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:decimal;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">567, 345, 234, 123</span></p></li><li dir="ltr" style="list-style-type:decimal;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">567, 456, 234, 123</span></p></li></ol><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Assuming both contexts execute at runtime, the allocation will end up with 2 MIBs in the profile, one for each of the above contexts.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">To represent this in the IR, we propose 2 new metadata attachment types, as described below.</span></p><h2 dir="ltr" style="line-height:1.38;margin-top:18pt;margin-bottom:6pt"><span style="font-size:16pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Callsite metadata</span></h2><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata is used to associate a callsite with its corresponding references in MIB stack contexts. It contains the associated 64-bit stack entry table value for that callsite from the indexed profile, and is initially only on non-allocation callsites. As will be described later, after inlining it can contain multiple entry ids or be propagated onto allocation callsites.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In the above example, for the call to foo(), which had stack entry id 123, the IR callsite would be decorated with a </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata containing that stack entry id:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3foov(), !dbg !12, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !14</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!14 = !{i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Note that this call may be in a different module initially than the referencing MIB metadata. In order to disambiguate the context across modules, some form of LTO would be required. ThinLTO summary support will be added to reflect the cross-module contexts and enable cross module optimization of the contexts.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Also, while for MemProf the ids will be assigned uniquely using information from the MemProf profile, other types of context sensitive profiles could simply reuse the same id after matching with line table information, or at least leverage the same metadata attachment to assign their own unique ids if there is no MemProf profile.</span></p><h2 dir="ltr" style="line-height:1.38;margin-top:18pt;margin-bottom:6pt"><span style="font-size:16pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Memprof metadata</span></h2><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata describes the MIBs for the leaf allocation callsite it is attached to. If there are multiple stack contexts leading to that allocation, it will have a single </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata attachment, with a level of indirection used to list all related MIB, as shown in the later example.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">As with the indexed profile format, we need to be able to add or modify fields of the MIB entries while maintaining backwards compatibility with older bitcode. Therefore, we use a schema format with the MIB profile entry fields described by a “Memprof Schema” module level metadata, for example:</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!llvm.module.flags = !{!1}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!1 = !{i32 1, !”Memprof Schema”,!”Stack”, !”AllocCount”, !”AveSize”, !”MinSize”, !”MaxSize”, !”AveAccessCount”, !”MinAccessCount”, !”MaxAccessCount”, !”AveLifetime”, !”MinLifetime”, !”MaxLifetime”, !”NumMigration”, !”NumLifetimeOverlaps” }</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The first (merge behavior) field is 1 (ModFlagBehavior::Error), meaning that it is an error to merge modules with different values, or in other words, merging modules compiled with different profiles generated with different versions of the indexed profile format.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Assume we are using the schema shown in the above module flag metadata. In the earlier example, where allocation was reached by 2 different profile contexts (and therefore had 2 MIBs), the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata would be structured similar to the following:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !261, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !13</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!13 = !{!14, !16}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!14 = !{!15, i32 2, i32 4, i32 4, i32 4, i64 10, i64 5, i64 15, i32 30, i32 20, i32 40, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!16 = !{!17, i32 1, i32 4, i32 4, i32 4, i64 5, i64 1, i64 10, i32 10, i32 10, i32 10, i32 0, i32 0}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The first memprof metadata entry references another metadata containing the call stack context, which contains entries that as described earlier are unique 64-bit values used to identify stack entries in the indexed profile. These stack entry ids enable correlation with the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata attached to calls, particularly across modules. Note that we don’t need to include a stack entry id for the allocation callsite, since the metadata is already correlated with it by attachment. Therefore, the contexts for the two MIB shown here would contain the list of ids described for the example, minus 567 for the allocation callsite. So “345, 234, 123” and “456, 234, 123”.</span></p><br><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">This call stack metadata (e.g. !15 and !17 in the above example), should identify the stack entries in order from the leaf allocation’s caller to the root caller. There are several options for organizing this metadata. One simple possibility is to list all stack entry ids exhaustively in each call stack. For example:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!15 = !{i64 456, i64 234, i64 123}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 = !{i64 345, i64 234, i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">However, as can seen by the above example, this results in a lot of duplication. Instead, we will organize the call stack metadata as a chain of metadata, one stack entry id per metadata, with a second operand pointing to the next (caller) stack entry in the chain. This allows for deduplication in the above case, which then looks like:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!15 = !{i64 456, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 = !{i64 345, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!18 = !{i64 234, !19}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!19 = !{i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">While in this toy example the chaining results in overall more metadata instructions and operands, in reality the contexts are much longer, and as we will show later, the opportunities to deduplicate stack entries is large in practice.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We may not be able to completely deduplicate all instances of a particular stack entry id. If we additionally had a call stack with entries 567, 234, 678, 123, although 234 is the same id shown in the earlier call stacks, it has a different caller (678) so we cannot share the same metadata node used for 234 in the above metadata. We will instead need to essentially duplicate it, like:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!20 = !{i64 567, !21}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!21 = !{i64 234, !22}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!22 = !{i64 678, !19}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The last entry above can and does point to the same (root stack entry) metadata !19 used for entry 123 by the earlier call stacks.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">A measurement of the effectiveness of this deduplication strategy for 3 large internal applications showed we could reduce the number of stack entry ids by 75-88%. This was measured across the entire profile, so the deduplication effectiveness will be smaller within each IR module, where there are only a subset of call stacks with which to deduplicate. Additionally, the deduplication strategy requires additional Metadata nodes and operands for the caller Metadata pointers. Considering these effects leads to estimates of 12-60% size overhead reductions, which has a smaller lower bound but wider distribution. However, the measurements suggest this strategy will lead to a net reduction in overhead.</span></p><h2 dir="ltr" style="line-height:1.38;margin-top:18pt;margin-bottom:6pt"><span style="font-size:16pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Inlining</span></h2><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">When calls are inlined after annotation, the relevant </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata can be merged, with the ordering of the entries metadata implying the inlined callee->caller relationships, and if relevant added onto the inlined allocation callsite (which will then have both </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> and </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For example, assuming we have the following IR snippets for the original example (for simplicity, all in the same Module):</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local i32 @main() local_unnamed_addr #0 !dbg !80 { …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3foov(), !dbg !121, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !1</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local void @_Z3foov() local_unnamed_addr #0 !dbg !81 { …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3bazv(), !dbg !111, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !2</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local void @_Z3bazv() local_unnamed_addr #0 !dbg !82 { …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; if (x)</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3barv(), !dbg !11, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !3</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; else</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3barv(), !dbg !12, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !4</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local void @_Z3barv() local_unnamed_addr #0 !dbg !258 { …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !261, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !13</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!1 = !{i64 123}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!2 = !{i64 234}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!3 = !{i64 345}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!4 = !{i64 456}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!13 = !{!14, !16}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!14 = !{!15, i32 2, i32 4, i32 4, i32 4, i64 10, i64 5, i64 15, i32 30, i32 20, i32 40, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!15 = !{i64 456, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!16 = !{!17, i32 1, i32 4, i32 4, i32 4, i64 5, i64 1, i64 10, i32 10, i32 10, i32 10, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 = !{i64 345, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!18 = !{i64 234, !19}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!19 = !{i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">If we inlined bar() into both calls to baz(), and then into foo() and from there into main() - i.e. all the way up to the root of the call stack - the resulting allocation calls which are eventually inlined into main() would look like:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local i32 @main() local_unnamed_addr #0 !dbg !8 { … </span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; if (x)</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !26, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !16, !callsite !17</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; else</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !27, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !14, !callsite !15</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">… </span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!14 = !{!15, i32 2, i32 4, i32 4, i32 4, i64 10, i64 5, i64 15, i32 30, i32 20, i32 40, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!15 = !{i64 456, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!16 = !{!17, i32 1, i32 4, i32 4, i32 4, i64 5, i64 1, i64 10, i32 10, i32 10, i32 10, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 = !{i64 345, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!18 = !{i64 234, !19}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!19 = !{i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In the above case since we inlined all the way up to root main, the merged </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata on each inlined allocation callsite is the same as the list in the “Stack” metadata on the associated </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata, so we can reuse them (</span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 </span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">and</span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> !15</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">) on the inlined allocation </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadatas.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Note that the original allocation had multiple memprof metadata corresponding to different stack contexts. When we inline, we should prune those from the inlined allocation call that do not start with the stack subsequence described in the concatenated stack entry profile metadata. In the above example, the inlined callsites have contexts implied by their new </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata that match exactly one of the original MIBs, and we prune the other. If we had not inlined all the way up into the root main, we might have multiple MIB entries whose stack contexts include the inlined stack entry sequence in the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata and they should all be kept on the inlined allocation callsite.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">As another example, if we only inlined bar into baz the inlined allocations in baz would look like:</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local void @_Z3bazv() local_unnamed_addr #0 !dbg !82 { …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; if (x)</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !26, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !16, !callsite !3</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  ; else</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  %call = tail call noalias align 16 dereferenceable_or_null(4) i8* @malloc(i64 4) #4, !dbg !27, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!memprof !14, !callsite !4</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  …</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!3 = !{i64 345}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!4 = !{i64 456}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!13 = !{!14, !16}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!14 = !{!15, i32 2, i32 4, i32 4, i32 4, i64 10, i64 5, i64 15, i32 30, i32 20, i32 40, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!15 = !{i64 456, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!16 = !{!17, i32 1, i32 4, i32 4, i32 4, i64 5, i64 1, i64 10, i32 10, i32 10, i32 10, i32 0, i32 0}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!17 = !{i64 345, !18}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!18 = !{i64 234, !19}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!19 = !{i64 123}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In this case the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata on the inlined allocations is the same as that on the original calls to bar(), i.e. a single stack entry id each, since we haven’t inlined across multiple calls with </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata. But we can still prune the MIB with stack contexts that don’t start with that of the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata on the inlined call.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">And alternatively if we only inlined foo into main, main’s call of bar would now have </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata that has a child pointer to the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata of the now inlined call to foo() (</span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!1</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">):</span></p><br><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">define dso_local i32 @main() local_unnamed_addr #0 !dbg !8 {</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">… </span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">  tail call void @_Z3barv(), !dbg !11, </span><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite !2</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">… </span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!1 = !{i64 123}</span></p><p dir="ltr" style="line-height:1.2;margin-top:0pt;margin-bottom:0pt;padding:0pt 0pt 0pt 13.5pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(255,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!2 = !{i64 234, !1}</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Keeping the </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">!callsite</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> metadata from the inlined callsites and concatenating them in an upward call stack order will be important when we later determine what type of cloning or other context disambiguating optimizations are required.</span></p><br><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">[1] RFC: Sanitizer-based Heap Profiler (</span><a href="https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">)</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">[2] RFC: A binary serialization format for MemProf (</span><a href="https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">)</span></p><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">[3] Implement interfaces for providing access frequency hints to TCMalloc (</span><a href="https://github.com/google/tcmalloc/commit/ab87cf382dc56784f783f3aaa43d6d0465d5f385" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://github.com/google/tcmalloc/commit/ab87cf382dc56784f783f3aaa43d6d0465d5f385</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">)</span></span><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top:2px solid rgb(213,15,37)">Teresa Johnson |</td><td nowrap style="border-top:2px solid rgb(51,105,232)"> Software Engineer |</td><td nowrap style="border-top:2px solid rgb(0,153,57)"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top:2px solid rgb(238,178,17)"><br></td></tr></tbody></table></span></div></div></div></div>