<div dir="ltr">Some more data on code size. <div><br></div><div>I've build CPU2006/483.xalancbmk with <div><div>a) -O2 -fsanitize=address -m64 -gline-tables-only -mllvm -asan-coverage=1</div><div>b) -O2 -fsanitize=address -m64 -gline-tables-only -fprofile-instr-generate</div>
<div><br></div><div>The first is 27Mb and the second is 48Mb. </div><div><br></div><div>The extra size comes from __llvm_prf_* sections. </div><div>You may be able to make these sections more compact, but you will not make them tiny. </div>
<div><br></div><div>The instrumentation code generated by -asan-coverage=1 is less efficient than -fprofile-instr-generate</div><div>in several ways (slower, fatter, provides less data).</div><div>But it does not add any extra sections to the binary and wins in the overall binary size.</div>
<div>Ideally, I'd like to see such options for -fprofile-instr-generate as well. </div><div><br></div><div>--kcc </div></div></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Feb 22, 2014 at 9:13 AM, Kostya Serebryany <span dir="ltr"><<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Our users combine asan and coverage testing and they do it on thousands machines.<div>(An older blog post about using asan: <a href="http://blog.chromium.org/2012/04/fuzzing-for-security.html" target="_blank">http://blog.chromium.org/2012/04/fuzzing-for-security.html</a>)</div>
<div>The binaries need to be shipped to virtual machines, where they will be run.</div><div>The VMs are *very* short of disk and the network bandwidth has a cost too.</div><div>We may be able to ship stripped binaries to those machine but this will complicate the logic immensely.</div>
<div><br></div><div>Besides, zip-ed binaries are stored for several revisions every day and the storage also costs money.</div><div>Just to give you the taste (<a href="https://commondatastorage.googleapis.com/chromium-browser-asan/index.html" target="_blank">https://commondatastorage.googleapis.com/chromium-browser-asan/index.html</a>): <br>
</div>asan-symbolized-linux-release-252010.zip 2014-02-19 14:34:24 406.35MB<br>asan-symbolized-linux-release-252017.zip 2014-02-19 18:22:54 406.41MB<br>asan-symbolized-linux-release-252025.zip 2014-02-19 21:35:49 406.35MB<br>
asan-symbolized-linux-release-252031.zip 2014-02-20 00:44:25 406.35MB<br>asan-symbolized-linux-release-252160.zip 2014-02-20 06:30:16 406.34MB<br>asan-symbolized-linux-release-252185.zip 2014-02-20 09:21:47 408.52MB<br>asan-symbolized-linux-release-252188.zip 2014-02-20 12:20:05 408.52MB<br>
asan-symbolized-linux-release-252194.zip 2014-02-20 15:01:05 408.52MB<br>asan-symbolized-linux-release-252218.zip 2014-02-20 18:00:42 408.54MB<br>asan-symbolized-linux-release-252265.zip 2014-02-20 21:00:03 408.65MB<br>asan-symbolized-linux-release-252272.zip 2014-02-21 00:00:40 408.66MB<br>
<br><div>--kcc </div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Feb 22, 2014 at 8:58 AM, Bob Wilson <span dir="ltr"><<a href="mailto:bob.wilson@apple.com" target="_blank">bob.wilson@apple.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Why is the binary size a concern for coverage testing?<div><div><div><br><div>
<div>On Feb 21, 2014, at 8:43 PM, Kostya Serebryany <<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>> wrote:</div><br><blockquote type="cite"><div dir="ltr">I understand why you don't want to rely on debug info and instead produce your own section. <div>
We did this with our early version of llvm-based tsan and it was simpler to implement. <br><div>But here is a data point to support my suggestion: </div>
<div>chromium binary built with asan, coverage and -gline-tables-only is 1.6Gb.</div><div>The same binary is 1.1Gb when stripped, so, the line tables require 500Mb. </div><div>Separate line info for coverage will essentially double this amount. </div>
<div>The size of binary is a serious concern for our users, please take it into consideration. </div><div><br></div><div>Thanks! </div><div>--kcc <br><div><br></div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Fri, Feb 21, 2014 at 8:28 PM, Bob Wilson <span dir="ltr"><<a href="mailto:bob.wilson@apple.com" target="_blank">bob.wilson@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">We’re not going to use debug info at all. We’re emitting the counters in the clang front-end. We just need to emit separate info to show how to map those counters to source locations. Mapping to PCs and then using debug info to get from the PCs to the source locations just makes things harder and loses information in the process.<div>
<div><br><div><div>On Feb 21, 2014, at 2:57 AM, Kostya Serebryany <<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>> wrote:</div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra">
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div><br>
<br>
</div>We may need some additional info.</blockquote><div>What kind of additional info? </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
I haven't put a ton of thought into<br>
this, but I'm hoping we can either (a) use debug info as is or add some<br>
extra (valid) debug info to support this, or (b) add an extra<br>
debug-info-like section to instrumented binaries with the information we<br>
need.<br></blockquote><div><br></div><div>I'd try this data format (binary equivalent): </div><div><br></div><div>/path/to/binary/or/dso1 num_counters1</div><div>pc1 counter1</div><div>pc2 counter2<br></div><div>pc3 counter3<br>
</div><div>...</div><div><div>/path/to/binary/or/dso2 num_counters2</div><div>pc1 counter1</div><div>pc2 counter2<br></div><div>pc3 counter3<br></div></div><div>...</div><div><br></div><div>I don't see a straightforward way to produce such data today because individual Instructions do not work as labels.<br>
</div><div>But I think this can be supported in LLVM codegen. </div><div>Here is a *raw* patch with comments, just to get the idea.</div><div><br></div><div><br></div><div><div>Index: lib/CodeGen/CodeGenPGO.cpp</div><div>
===================================================================</div><div>--- lib/CodeGen/CodeGenPGO.cpp (revision 201843)</div><div>+++ lib/CodeGen/CodeGenPGO.cpp (working copy)</div><div>@@ -199,7 +199,8 @@</div>
<div>
llvm::Type *Args[] = {</div><div> Int8PtrTy, // const char *MangledName</div><div> Int32Ty, // uint32_t NumCounters</div><div>- Int64PtrTy // uint64_t *Counters</div>
<div>+ Int64PtrTy, // uint64_t *Counters</div><div>+ Int64PtrTy // uint64_t *PCs</div><div> };</div><div> llvm::FunctionType *FTy =</div><div> llvm::FunctionType::get(PGOBuilder.getVoidTy(), Args, false);</div>
<div>@@ -209,9 +210,10 @@</div><div> llvm::Constant *MangledName =</div><div> CGM.GetAddrOfConstantCString(CGM.getMangledName(GD), "__llvm_pgo_name");</div><div> MangledName = llvm::ConstantExpr::getBitCast(MangledName, Int8PtrTy);</div>
<div>- PGOBuilder.CreateCall3(EmitFunc, MangledName,</div><div>+ PGOBuilder.CreateCall4(EmitFunc, MangledName,</div><div> PGOBuilder.getInt32(NumRegionCounters),</div><div>- PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy));</div>
<div>+ PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy),</div><div>+ PGOBuilder.CreateBitCast(RegionPCs, Int64PtrTy));</div><div> }</div><div> </div><div> llvm::Function *CodeGenPGO::emitInitialization(CodeGenModule &CGM) {</div>
<div>@@ -769,6 +771,13 @@</div><div> llvm::GlobalVariable::PrivateLinkage,</div><div> llvm::Constant::getNullValue(CounterTy),</div><div> "__llvm_pgo_ctr");</div>
<div>+</div><div>+ RegionPCs =</div><div>+ new llvm::GlobalVariable(CGM.getModule(), CounterTy, false,</div><div>+ llvm::GlobalVariable::PrivateLinkage,</div><div>+ llvm::Constant::getNullValue(CounterTy),</div>
<div>+ "__llvm_pgo_pcs");</div><div>+</div><div> }</div><div> </div><div> void CodeGenPGO::emitCounterIncrement(CGBuilderTy &Builder, unsigned Counter) {</div><div>@@ -779,6 +788,21 @@</div>
<div> llvm::Value *Count = Builder.CreateLoad(Addr, "pgocount");</div><div> Count = Builder.CreateAdd(Count, Builder.getInt64(1));</div><div> Builder.CreateStore(Count, Addr);</div><div>+ // We should put the PC of the instruction that increments __llvm_pgo_ctr</div>
<div>+ // into __llvm_pgo_pcs, which will be passed to llvm_pgo_emit.</div><div>+ // This patch is wrong in many ways:</div><div>+ // * We pass the PC of the Function instead of the PC of the Instruction,</div><div>
+ // because the latter doesn't work like this. We'll need to support</div>
<div>+ // Instructions as labels in LLVM codegen.</div><div>+ // * We actually store the PC on each increment, while we should initialize</div><div>+ // this array at link time (need to refactor this code a bit).</div>
<div>+ //</div><div>+ Builder.CreateStore(</div><div>+ Builder.CreatePointerCast(</div><div>+ cast<llvm::Instruction>(Count)->getParent()->getParent(),</div><div>+ Builder.getInt64Ty() // FIXME: use a better type</div>
<div>+ ),</div><div>+ Builder.CreateConstInBoundsGEP2_64(RegionPCs, 0, Counter));</div><div> }</div><div> </div></div><div><div>Index: lib/CodeGen/CodeGenPGO.h</div><div>===================================================================</div>
<div>--- lib/CodeGen/CodeGenPGO.h (revision 201843)</div><div>+++ lib/CodeGen/CodeGenPGO.h (working copy)</div><div>@@ -59,6 +59,7 @@</div><div> </div><div> unsigned NumRegionCounters;</div><div> llvm::GlobalVariable *RegionCounters;</div>
<div>+ llvm::GlobalVariable *RegionPCs;</div><div> llvm::DenseMap<const Stmt*, unsigned> *RegionCounterMap;</div><div> llvm::DenseMap<const Stmt*, uint64_t> *StmtCountMap;</div><div> std::vector<uint64_t> *RegionCounts;</div>
<div>@@ -66,8 +67,9 @@</div><div> </div><div> public:</div><div> CodeGenPGO(CodeGenModule &CGM)</div><div>- : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionCounterMap(0),</div><div>- StmtCountMap(0), RegionCounts(0), CurrentRegionCount(0) {}</div>
<div>+ : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionPCs(0),</div><div>+ RegionCounterMap(0), StmtCountMap(0), RegionCounts(0),</div><div>+ CurrentRegionCount(0) {}</div><div> ~CodeGenPGO() {}</div>
<div> </div><div> /// Whether or not we have PGO region data for the current function. This is</div></div><div><br></div><div><br></div><div><br></div><div><br></div><div> </div></div><br></div></div>
</blockquote></div><br></div></div></div></blockquote></div><br></div>
</blockquote></div><br></div></div></div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>