[LLVMdev] asan coverage
Kostya Serebryany
kcc at google.com
Fri Feb 21 21:13:19 PST 2014
Our users combine asan and coverage testing, and they do it on thousands of
machines.
(An older blog post about using asan:
http://blog.chromium.org/2012/04/fuzzing-for-security.html)
The binaries need to be shipped to virtual machines, where they will be run.
The VMs are *very* short of disk and the network bandwidth has a cost too.
We may be able to ship stripped binaries to those machines, but this will
complicate the logic immensely.
Besides, zipped binaries are stored for several revisions every day, and the
storage also costs money.
Just to give you a taste (
https://commondatastorage.googleapis.com/chromium-browser-asan/index.html):
asan-symbolized-linux-release-252010.zip 2014-02-19 14:34:24 406.35MB
asan-symbolized-linux-release-252017.zip 2014-02-19 18:22:54 406.41MB
asan-symbolized-linux-release-252025.zip 2014-02-19 21:35:49 406.35MB
asan-symbolized-linux-release-252031.zip 2014-02-20 00:44:25 406.35MB
asan-symbolized-linux-release-252160.zip 2014-02-20 06:30:16 406.34MB
asan-symbolized-linux-release-252185.zip 2014-02-20 09:21:47 408.52MB
asan-symbolized-linux-release-252188.zip 2014-02-20 12:20:05 408.52MB
asan-symbolized-linux-release-252194.zip 2014-02-20 15:01:05 408.52MB
asan-symbolized-linux-release-252218.zip 2014-02-20 18:00:42 408.54MB
asan-symbolized-linux-release-252265.zip 2014-02-20 21:00:03 408.65MB
asan-symbolized-linux-release-252272.zip 2014-02-21 00:00:40 408.66MB
--kcc
On Sat, Feb 22, 2014 at 8:58 AM, Bob Wilson <bob.wilson at apple.com> wrote:
> Why is the binary size a concern for coverage testing?
>
> On Feb 21, 2014, at 8:43 PM, Kostya Serebryany <kcc at google.com> wrote:
>
> I understand why you don't want to rely on debug info and instead produce
> your own section.
> We did this with our early version of llvm-based tsan and it was simpler
> to implement.
> But here is a data point to support my suggestion:
> A Chromium binary built with asan, coverage, and -gline-tables-only is 1.6GB.
> The same binary is 1.1GB when stripped, so the line tables require 500MB.
> Separate line info for coverage will essentially double this amount.
> The size of the binary is a serious concern for our users, so please take it
> into consideration.
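
The size estimate above can be written out as a small compile-time check. This is only a sketch: it uses the rounded figures quoted in the message and assumes that "double this amount" means a separate coverage section about as large as the existing line tables. The constant names are made up for illustration.

```cpp
#include <cstdint>

// Sizes in MB, rounded as quoted in the message.
constexpr uint64_t WithLineTablesMB = 1600; // asan + coverage + -gline-tables-only
constexpr uint64_t StrippedMB = 1100;       // same binary after stripping

// What the line tables cost on their own.
constexpr uint64_t LineTablesMB = WithLineTablesMB - StrippedMB;

// Assumption: separate coverage line info is roughly as large as the line
// tables, so it adds about the same amount again on top of the full binary.
constexpr uint64_t WithSeparateCoverageInfoMB = WithLineTablesMB + LineTablesMB;

static_assert(LineTablesMB == 500, "line tables should account for 500MB");
static_assert(WithSeparateCoverageInfoMB == 2100, "estimate should be 2.1GB");
```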
>
> Thanks!
> --kcc
>
>
>
> On Fri, Feb 21, 2014 at 8:28 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>
>> We’re not going to use debug info at all. We’re emitting the counters in
>> the clang front-end. We just need to emit separate info to show how to map
>> those counters to source locations. Mapping to PCs and then using debug
>> info to get from the PCs to the source locations just makes things harder
>> and loses information in the process.
>>
>> On Feb 21, 2014, at 2:57 AM, Kostya Serebryany <kcc at google.com> wrote:
>>
>>
>>>
>>> We may need some additional info.
>>
>> What kind of additional info?
>>
>>
>>> I haven't put a ton of thought into
>>> this, but I'm hoping we can either (a) use debug info as is or add some
>>> extra (valid) debug info to support this, or (b) add an extra
>>> debug-info-like section to instrumented binaries with the information we
>>> need.
>>>
>>
>> I'd try this data format (binary equivalent):
>>
>> /path/to/binary/or/dso1 num_counters1
>> pc1 counter1
>> pc2 counter2
>> pc3 counter3
>> ...
>> /path/to/binary/or/dso2 num_counters2
>> pc1 counter1
>> pc2 counter2
>> pc3 counter3
>> ...
>>
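
To make the layout above concrete, here is a minimal sketch of a reader for a textual version of it. The proposal is for a binary equivalent, so this is illustrative only. It assumes PCs are written in hex and counters in decimal, and that paths contain no whitespace; ModuleCoverage and parseCoverage are hypothetical names, not part of any existing tool.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One module (binary or DSO) and its (pc, counter) pairs.
struct ModuleCoverage {
  std::string Path;
  std::vector<std::pair<uint64_t, uint64_t>> Counters; // (pc, counter)
};

// Reads records of the form
//   <path> <num_counters>
//   <pc> <counter>      (repeated num_counters times)
// until the input is exhausted. Returns false on malformed input.
bool parseCoverage(std::istream &In, std::vector<ModuleCoverage> &Out) {
  std::string Path;
  while (In >> Path) {
    uint64_t Num = 0;
    if (!(In >> std::dec >> Num))
      return false;
    ModuleCoverage M;
    M.Path = Path;
    for (uint64_t I = 0; I < Num; ++I) {
      uint64_t PC = 0, Count = 0;
      // Assumption: hex PCs, decimal counters.
      if (!(In >> std::hex >> PC >> std::dec >> Count))
        return false;
      M.Counters.emplace_back(PC, Count);
    }
    Out.push_back(std::move(M));
  }
  return true;
}
```

A consumer would then symbolize each PC through the existing line tables of the module named in the header line, which is the part that avoids a second copy of the line info.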
>> I don't see a straightforward way to produce such data today because
>> individual Instructions do not work as labels.
>> But I think this can be supported in LLVM codegen.
>> Here is a *raw* patch with comments, just to get the idea.
>>
>>
>> Index: lib/CodeGen/CodeGenPGO.cpp
>> ===================================================================
>> --- lib/CodeGen/CodeGenPGO.cpp (revision 201843)
>> +++ lib/CodeGen/CodeGenPGO.cpp (working copy)
>> @@ -199,7 +199,8 @@
>> llvm::Type *Args[] = {
>> Int8PtrTy, // const char *MangledName
>> Int32Ty, // uint32_t NumCounters
>> - Int64PtrTy // uint64_t *Counters
>> + Int64PtrTy, // uint64_t *Counters
>> + Int64PtrTy // uint64_t *PCs
>> };
>> llvm::FunctionType *FTy =
>> llvm::FunctionType::get(PGOBuilder.getVoidTy(), Args, false);
>> @@ -209,9 +210,10 @@
>> llvm::Constant *MangledName =
>> CGM.GetAddrOfConstantCString(CGM.getMangledName(GD), "__llvm_pgo_name");
>> MangledName = llvm::ConstantExpr::getBitCast(MangledName, Int8PtrTy);
>> - PGOBuilder.CreateCall3(EmitFunc, MangledName,
>> + PGOBuilder.CreateCall4(EmitFunc, MangledName,
>> PGOBuilder.getInt32(NumRegionCounters),
>> - PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy));
>> + PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy),
>> + PGOBuilder.CreateBitCast(RegionPCs, Int64PtrTy));
>> }
>>
>> llvm::Function *CodeGenPGO::emitInitialization(CodeGenModule &CGM) {
>> @@ -769,6 +771,13 @@
>> llvm::GlobalVariable::PrivateLinkage,
>> llvm::Constant::getNullValue(CounterTy),
>> "__llvm_pgo_ctr");
>> +
>> + RegionPCs =
>> + new llvm::GlobalVariable(CGM.getModule(), CounterTy, false,
>> + llvm::GlobalVariable::PrivateLinkage,
>> + llvm::Constant::getNullValue(CounterTy),
>> + "__llvm_pgo_pcs");
>> +
>> }
>>
>> void CodeGenPGO::emitCounterIncrement(CGBuilderTy &Builder, unsigned Counter) {
>> @@ -779,6 +788,21 @@
>> llvm::Value *Count = Builder.CreateLoad(Addr, "pgocount");
>> Count = Builder.CreateAdd(Count, Builder.getInt64(1));
>> Builder.CreateStore(Count, Addr);
>> + // We should put the PC of the instruction that increments __llvm_pgo_ctr
>> + // into __llvm_pgo_pcs, which will be passed to llvm_pgo_emit.
>> + // This patch is wrong in many ways:
>> + // * We pass the PC of the Function instead of the PC of the Instruction,
>> + // because the latter doesn't work like this. We'll need to support
>> + // Instructions as labels in LLVM codegen.
>> + // * We actually store the PC on each increment, while we should initialize
>> + // this array at link time (need to refactor this code a bit).
>> + //
>> + Builder.CreateStore(
>> + Builder.CreatePointerCast(
>> + cast<llvm::Instruction>(Count)->getParent()->getParent(),
>> + Builder.getInt64Ty() // FIXME: use a better type
>> + ),
>> + Builder.CreateConstInBoundsGEP2_64(RegionPCs, 0, Counter));
>> }
>>
>> Index: lib/CodeGen/CodeGenPGO.h
>> ===================================================================
>> --- lib/CodeGen/CodeGenPGO.h (revision 201843)
>> +++ lib/CodeGen/CodeGenPGO.h (working copy)
>> @@ -59,6 +59,7 @@
>>
>> unsigned NumRegionCounters;
>> llvm::GlobalVariable *RegionCounters;
>> + llvm::GlobalVariable *RegionPCs;
>> llvm::DenseMap<const Stmt*, unsigned> *RegionCounterMap;
>> llvm::DenseMap<const Stmt*, uint64_t> *StmtCountMap;
>> std::vector<uint64_t> *RegionCounts;
>> @@ -66,8 +67,9 @@
>>
>> public:
>> CodeGenPGO(CodeGenModule &CGM)
>> - : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionCounterMap(0),
>> - StmtCountMap(0), RegionCounts(0), CurrentRegionCount(0) {}
>> + : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionPCs(0),
>> + RegionCounterMap(0), StmtCountMap(0), RegionCounts(0),
>> + CurrentRegionCount(0) {}
>> ~CodeGenPGO() {}
>>
>> /// Whether or not we have PGO region data for the current function. This is
>>
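
For readers following the patch, here is a rough sketch of what a runtime hook with the extended four-argument signature (name, counter count, counters, PCs) might do with the two parallel arrays. It is not the actual llvm_pgo_emit implementation; emitRegionCounters is a hypothetical stand-in, and the PC array is taken as a read-only table filled in ahead of time, which is the direction the patch comments say the real code should move in.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a four-argument emit hook. Counters[I] and PCs[I]
// describe the same region; PCs is assumed to be a constant table rather than
// something written on every counter increment.
void emitRegionCounters(std::ostream &OS, const char *MangledName,
                        uint32_t NumCounters, const uint64_t *Counters,
                        const uint64_t *PCs) {
  // Header line: function name and number of (pc, counter) pairs.
  OS << MangledName << ' ' << NumCounters << '\n';
  // One "pc counter" line per region, PC in hex and counter in decimal.
  for (uint32_t I = 0; I < NumCounters; ++I)
    OS << std::hex << PCs[I] << ' ' << std::dec << Counters[I] << '\n';
}
```

Keeping the PCs in their own constant array means the hot path only touches the counters, and the mapping back to source lines can go through the line tables that are already in the binary.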