[llvm-commits] [llvm] r160284 - in /llvm/trunk: lib/Transforms/Instrumentation/AddressSanitizer.cpp test/Instrumentation/AddressSanitizer/basic.ll

Kostya Serebryany kcc at google.com
Mon Jul 16 09:55:31 PDT 2012


On Mon, Jul 16, 2012 at 8:37 PM, Chandler Carruth <chandlerc at google.com>wrote:

> On Mon, Jul 16, 2012 at 9:15 AM, Kostya Serebryany <kcc at google.com> wrote:
>
>> Author: kcc
>> Date: Mon Jul 16 11:15:40 2012
>> New Revision: 160284
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=160284&view=rev
>> Log:
>> [asan] refactor instrumentation to allow merging the crash callbacks (not
>> fully implemented yet, no functionality change except the BB order)
>>
>
> Uh... BB order is incredibly sensitive and significant, as we have
> learned... =/ I'm beginning to wish we had kept my patch in now and built
> upon it, as I think its simplifications would have been nice here.
>
That was not a choice unfortunately, as red bots will freeze our other work
:(



>
> Should I go ahead and provide a patch that introduces probabilities so at
> least performance remains stable as the basic block order shifts?
>

mmm. I am in the middle of more refactoring...
I can add the probabilities myself (based on your patch), likely tomorrow.


>
>
>> Modified:
>>     llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>     llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>
>> Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=160284&r1=160283&r2=160284&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>> (original)
>> +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon
>> Jul 16 11:15:40 2012
>> @@ -72,6 +72,9 @@
>>  static const int kAsanStackRightRedzoneMagic = 0xf3;
>>  static const int kAsanStackPartialRedzoneMagic = 0xf4;
>>
>> +// Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>> +static const size_t kNumberOfAccessSizes = 5;
>> +
>>  // Command-line flags.
>>
>>  // This flag may need to be replaced with -f[no-]asan-reads.
>> @@ -82,7 +85,10 @@
>>  static cl::opt<bool> ClInstrumentAtomics("asan-instrument-atomics",
>>         cl::desc("instrument atomic instructions (rmw, cmpxchg)"),
>>         cl::Hidden, cl::init(true));
>> -// This flags limits the number of instructions to be instrumented
>> +static cl::opt<bool> ClMergeCallbacks("asan-merge-callbacks",
>> +       cl::desc("merge __asan_report_ callbacks to create fewer BBs"),
>> +       cl::Hidden, cl::init(false));
>> +// This flag limits the number of instructions to be instrumented
>>  // in any given BB. Normally, this should be set to unlimited (INT_MAX),
>>  // but due to http://llvm.org/bugs/show_bug.cgi?id=12652 we temporary
>>  // set it to 10000.
>> @@ -138,18 +144,33 @@
>>
>>  namespace {
>>
>> +/// An object of this type is created while instrumenting every function.
>> +struct AsanFunctionContext {
>> +  AsanFunctionContext() {
>> +    memset(this, 0, sizeof(*this));
>>
>
> This seems like a hack... Is there some reason it was necessary? That is,
> why not:
>
> AsanFunctionContext() : CrashBlock() {}
>

Hm. Yea, should work. Will change in the next patch.


>
> Which should do the same thing?
>
>
>> +  }
>> +
>> +  // These are initially zero. If we require at least one call to
>> +  // __asan_report_{read,write}{1,2,4,8,16}, an appropriate BB is
>> created.
>> +  BasicBlock *CrashBlock[2][kNumberOfAccessSizes];
>> +};
>> +
>>  /// AddressSanitizer: instrument the code in module to find memory bugs.
>>  struct AddressSanitizer : public ModulePass {
>>    AddressSanitizer();
>>    virtual const char *getPassName() const;
>> -  void instrumentMop(Instruction *I);
>> -  void instrumentAddress(Instruction *OrigIns, IRBuilder<> &IRB,
>> +  void instrumentMop(AsanFunctionContext &AFC, Instruction *I);
>> +  void instrumentAddress(AsanFunctionContext &AFC,
>> +                         Instruction *OrigIns, IRBuilder<> &IRB,
>>                           Value *Addr, uint32_t TypeSize, bool IsWrite);
>> -  Instruction *generateCrashCode(IRBuilder<> &IRB, Value *Addr,
>> +  Value *createSlowPathCmp(IRBuilder<> &IRB, Value *AddrLong,
>> +                           Value *ShadowValue, uint32_t TypeSize);
>> +  Instruction *generateCrashCode(BasicBlock *BB, Value *Addr,
>>                                   bool IsWrite, uint32_t TypeSize);
>> -  bool instrumentMemIntrinsic(MemIntrinsic *MI);
>> -  void instrumentMemIntrinsicParam(Instruction *OrigIns, Value *Addr,
>> -                                  Value *Size,
>> +  bool instrumentMemIntrinsic(AsanFunctionContext &AFC, MemIntrinsic
>> *MI);
>> +  void instrumentMemIntrinsicParam(AsanFunctionContext &AFC,
>> +                                   Instruction *OrigIns, Value *Addr,
>> +                                   Value *Size,
>>
>
> All of this churn makes me think that this pass should really be a
> function pass rather than a module pass, potentially depending on a module
> pass running first. Would such a split make sense?
>

Yes, definitely. That's what we already have in tsan.


> Doesn't have to happen now, but I think it would make the organization of
> the instrumentation passes much more clear.
>
> Eventually, it will also have a performance impact -- there is some hope
> that we will be able to parallelize function passes. Anyways, that's a
> long-term thing.
>
>
>>                                     Instruction *InsertBefore, bool
>> IsWrite);
>>    Value *memToShadow(Value *Shadow, IRBuilder<> &IRB);
>>    bool handleFunction(Module &M, Function &F);
>> @@ -192,11 +213,10 @@
>>    Function *AsanInitFunction;
>>    Instruction *CtorInsertBefore;
>>    OwningPtr<FunctionBlackList> BL;
>> -  // Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>> -  static const size_t kNumberOfAccessSizes = 5;
>>    // This array is indexed by AccessIsWrite and log2(AccessSize).
>>    Function *AsanErrorCallback[2][kNumberOfAccessSizes];
>>  };
>> +
>>  }  // namespace
>>
>>  char AddressSanitizer::ID = 0;
>> @@ -230,19 +250,24 @@
>>  //     ThenBlock
>>  //   Tail
>>  //
>> -// Returns the ThenBlock's terminator.
>> -static BranchInst *splitBlockAndInsertIfThen(Value *Cmp) {
>> +// If ThenBlock is zero, a new block is created and its terminator is
>> returned.
>> +// Otherwize NULL is returned.
>> +static BranchInst *splitBlockAndInsertIfThen(Value *Cmp,
>> +                                             BasicBlock *ThenBlock = 0) {
>>    Instruction *SplitBefore = cast<Instruction>(Cmp)->getNextNode();
>>    BasicBlock *Head = SplitBefore->getParent();
>>    BasicBlock *Tail = Head->splitBasicBlock(SplitBefore);
>>    TerminatorInst *HeadOldTerm = Head->getTerminator();
>> -  LLVMContext &C = Head->getParent()->getParent()->getContext();
>> -  BasicBlock *ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>> +  BranchInst *CheckTerm = NULL;
>>
>
> LLVM style is to use '0' rather than NULL....
>

Hm? I can't see such a rule in http://llvm.org/docs/CodingStandards.html (not
that I care much)


>
>
>> +  if (!ThenBlock) {
>> +    LLVMContext &C = Head->getParent()->getParent()->getContext();
>>
>
> This should really just be a method, it would simplify stuff such as
> getting the context.
>

It was a method, and then I've made it a function and wanted to use it in
msan. Oh my...


>
>
>> +    ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>> +    CheckTerm = BranchInst::Create(Tail, ThenBlock);
>> +  }
>>    BranchInst *HeadNewTerm =
>>      BranchInst::Create(/*ifTrue*/ThenBlock, /*ifFalse*/Tail, Cmp);
>>    ReplaceInstWithInst(HeadOldTerm, HeadNewTerm);
>>
>> -  BranchInst *CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>    return CheckTerm;
>>  }
>>
>> @@ -256,12 +281,13 @@
>>                                                 MappingOffset));
>>  }
>>
>> -void AddressSanitizer::instrumentMemIntrinsicParam(Instruction *OrigIns,
>> +void AddressSanitizer::instrumentMemIntrinsicParam(
>> +    AsanFunctionContext &AFC, Instruction *OrigIns,
>>      Value *Addr, Value *Size, Instruction *InsertBefore, bool IsWrite) {
>>    // Check the first byte.
>>    {
>>      IRBuilder<> IRB(InsertBefore);
>> -    instrumentAddress(OrigIns, IRB, Addr, 8, IsWrite);
>> +    instrumentAddress(AFC, OrigIns, IRB, Addr, 8, IsWrite);
>>    }
>>    // Check the last byte.
>>    {
>> @@ -271,12 +297,13 @@
>>      SizeMinusOne = IRB.CreateIntCast(SizeMinusOne, IntptrTy, false);
>>      Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>>      Value *AddrPlusSizeMinisOne = IRB.CreateAdd(AddrLong, SizeMinusOne);
>> -    instrumentAddress(OrigIns, IRB, AddrPlusSizeMinisOne, 8, IsWrite);
>> +    instrumentAddress(AFC, OrigIns, IRB, AddrPlusSizeMinisOne, 8,
>> IsWrite);
>>    }
>>  }
>>
>>  // Instrument memset/memmove/memcpy
>> -bool AddressSanitizer::instrumentMemIntrinsic(MemIntrinsic *MI) {
>> +bool AddressSanitizer::instrumentMemIntrinsic(AsanFunctionContext &AFC,
>> +                                              MemIntrinsic *MI) {
>>    Value *Dst = MI->getDest();
>>    MemTransferInst *MemTran = dyn_cast<MemTransferInst>(MI);
>>    Value *Src = MemTran ? MemTran->getSource() : NULL;
>> @@ -295,9 +322,9 @@
>>      InsertBefore = splitBlockAndInsertIfThen(Cmp);
>>    }
>>
>> -  instrumentMemIntrinsicParam(MI, Dst, Length, InsertBefore, true);
>> +  instrumentMemIntrinsicParam(AFC, MI, Dst, Length, InsertBefore, true);
>>    if (Src)
>> -    instrumentMemIntrinsicParam(MI, Src, Length, InsertBefore, false);
>> +    instrumentMemIntrinsicParam(AFC, MI, Src, Length, InsertBefore,
>> false);
>>    return true;
>>  }
>>
>> @@ -327,7 +354,7 @@
>>    return NULL;
>>  }
>>
>> -void AddressSanitizer::instrumentMop(Instruction *I) {
>> +void AddressSanitizer::instrumentMop(AsanFunctionContext &AFC,
>> Instruction *I) {
>>    bool IsWrite;
>>    Value *Addr = isInterestingMemoryAccess(I, &IsWrite);
>>    assert(Addr);
>> @@ -348,7 +375,7 @@
>>    }
>>
>>    IRBuilder<> IRB(I);
>> -  instrumentAddress(I, IRB, Addr, TypeSize, IsWrite);
>> +  instrumentAddress(AFC, I, IRB, Addr, TypeSize, IsWrite);
>>  }
>>
>>  // Validate the result of Module::getOrInsertFunction called for an
>> interface
>> @@ -363,7 +390,8 @@
>>  }
>>
>>  Instruction *AddressSanitizer::generateCrashCode(
>> -    IRBuilder<> &IRB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>> +    BasicBlock *BB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>> +  IRBuilder<> IRB(BB->getFirstNonPHI());
>>    size_t AccessSizeIndex = CountTrailingZeros_32(TypeSize / 8);
>>    assert(AccessSizeIndex < kNumberOfAccessSizes);
>>    CallInst *Call =
>> IRB.CreateCall(AsanErrorCallback[IsWrite][AccessSizeIndex],
>> @@ -372,7 +400,26 @@
>>    return Call;
>>  }
>>
>> -void AddressSanitizer::instrumentAddress(Instruction *OrigIns,
>> +Value * AddressSanitizer::createSlowPathCmp(IRBuilder<> &IRB, Value
>> *AddrLong,
>>
>
> No ' ' on both sides of the '*' here: Value *AddressSanitizer::...
>
Thanks, will fix with next patch.


>
>> +                                            Value *ShadowValue,
>> +                                            uint32_t TypeSize) {
>> +  size_t Granularity = 1 << MappingScale;
>> +  // Addr & (Granularity - 1)
>> +  Value *LastAccessedByte = IRB.CreateAnd(
>> +      AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>> +  // (Addr & (Granularity - 1)) + size - 1
>> +  if (TypeSize / 8 > 1)
>> +    LastAccessedByte = IRB.CreateAdd(
>> +        LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 - 1));
>> +  // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>> +  LastAccessedByte = IRB.CreateIntCast(
>> +      LastAccessedByte, IRB.getInt8Ty(), false);
>> +  // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>> +  return IRB.CreateICmpSGE(LastAccessedByte, ShadowValue);
>> +}
>> +
>> +void AddressSanitizer::instrumentAddress(AsanFunctionContext &AFC,
>> +                                         Instruction *OrigIns,
>>                                           IRBuilder<> &IRB, Value *Addr,
>>                                           uint32_t TypeSize, bool
>> IsWrite) {
>>    Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>> @@ -387,31 +434,27 @@
>>
>>    Value *Cmp = IRB.CreateICmpNE(ShadowValue, CmpVal);
>>
>> -  Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>> -  IRBuilder<> IRB2(CheckTerm);
>> +  BasicBlock *CrashBlock = NULL;
>> +  if (ClMergeCallbacks) {
>> +    llvm_unreachable("unimplemented yet");
>> +  } else {
>> +    CrashBlock = BasicBlock::Create(*C, "crash_bb",
>> +                                    OrigIns->getParent()->getParent());
>> +    new UnreachableInst(*C, CrashBlock);
>> +    Instruction *Crash =
>> +        generateCrashCode(CrashBlock, AddrLong, IsWrite, TypeSize);
>> +    Crash->setDebugLoc(OrigIns->getDebugLoc());
>> +  }
>>
>>    size_t Granularity = 1 << MappingScale;
>>    if (TypeSize < 8 * Granularity) {
>> -    // Addr & (Granularity - 1)
>> -    Value *LastAccessedByte = IRB2.CreateAnd(
>> -        AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>> -    // (Addr & (Granularity - 1)) + size - 1
>> -    if (TypeSize / 8 > 1)
>> -      LastAccessedByte = IRB2.CreateAdd(
>> -          LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 -
>> 1));
>> -    // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>> -    LastAccessedByte = IRB2.CreateIntCast(
>> -        LastAccessedByte, IRB.getInt8Ty(), false);
>> -    // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>> -    Value *Cmp2 = IRB2.CreateICmpSGE(LastAccessedByte, ShadowValue);
>> -
>> -    CheckTerm = splitBlockAndInsertIfThen(Cmp2);
>> -  }
>> -
>> -  IRBuilder<> IRB1(CheckTerm);
>> -  Instruction *Crash = generateCrashCode(IRB1, AddrLong, IsWrite,
>> TypeSize);
>> -  Crash->setDebugLoc(OrigIns->getDebugLoc());
>> -  ReplaceInstWithInst(CheckTerm, new UnreachableInst(*C));
>> +    Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>> +    IRB.SetInsertPoint(CheckTerm);
>> +    Value *Cmp2 = createSlowPathCmp(IRB, AddrLong, ShadowValue,
>> TypeSize);
>> +    splitBlockAndInsertIfThen(Cmp2, CrashBlock);
>> +  } else {
>> +    splitBlockAndInsertIfThen(Cmp, CrashBlock);
>> +  }
>>  }
>>
>>  // This function replaces all global variables with new variables that
>> have
>> @@ -733,6 +776,8 @@
>>      }
>>    }
>>
>> +  AsanFunctionContext AFC;
>> +
>>    // Instrument.
>>    int NumInstrumented = 0;
>>    for (size_t i = 0, n = ToInstrument.size(); i != n; i++) {
>> @@ -740,9 +785,9 @@
>>      if (ClDebugMin < 0 || ClDebugMax < 0 ||
>>          (NumInstrumented >= ClDebugMin && NumInstrumented <=
>> ClDebugMax)) {
>>        if (isInterestingMemoryAccess(Inst, &IsWrite))
>> -        instrumentMop(Inst);
>> +        instrumentMop(AFC, Inst);
>>        else
>> -        instrumentMemIntrinsic(cast<MemIntrinsic>(Inst));
>> +        instrumentMemIntrinsic(AFC, cast<MemIntrinsic>(Inst));
>>      }
>>      NumInstrumented++;
>>    }
>>
>> Modified: llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll?rev=160284&r1=160283&r2=160284&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll (original)
>> +++ llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll Mon Jul 16
>> 11:15:40 2012
>> @@ -20,6 +20,10 @@
>>  ; to the end of the function.
>>  ; CHECK:   %tmp1 = load i32* %a
>>  ; CHECK:   ret i32 %tmp1
>> +
>> +; The crash block reports the error.
>> +; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>> +; CHECK:   unreachable
>>  ;
>>  ; First instrumentation block refines the shadow test.
>>  ; CHECK:   and i64 %[[LOAD_ADDR]], 7
>> @@ -28,9 +32,6 @@
>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[LOAD_SHADOW]]
>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>  ;
>> -; Final instrumentation block reports the error.
>> -; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>> -; CHECK:   unreachable
>>
>>  entry:
>>    %tmp1 = load i32* %a
>> @@ -53,6 +54,10 @@
>>  ; CHECK:   store i32 42, i32* %a
>>  ; CHECK:   ret void
>>  ;
>> +; The crash block reports the error.
>> +; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>> +; CHECK:   unreachable
>> +;
>>  ; First instrumentation block refines the shadow test.
>>  ; CHECK:   and i64 %[[STORE_ADDR]], 7
>>  ; CHECK:   add i64 %{{.*}}, 3
>> @@ -60,9 +65,6 @@
>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[STORE_SHADOW]]
>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>  ;
>> -; Final instrumentation block reports the error.
>> -; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>> -; CHECK:   unreachable
>>
>>  entry:
>>    store i32 42, i32* %a
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120716/8507c893/attachment.html>


More information about the llvm-commits mailing list