[llvm-commits] [llvm] r160284 - in /llvm/trunk: lib/Transforms/Instrumentation/AddressSanitizer.cpp test/Instrumentation/AddressSanitizer/basic.ll

Chandler Carruth chandlerc at google.com
Mon Jul 16 10:06:45 PDT 2012


On Mon, Jul 16, 2012 at 9:55 AM, Kostya Serebryany <kcc at google.com> wrote:

>
>
> On Mon, Jul 16, 2012 at 8:37 PM, Chandler Carruth <chandlerc at google.com>wrote:
>
>> On Mon, Jul 16, 2012 at 9:15 AM, Kostya Serebryany <kcc at google.com>wrote:
>>
>>> Author: kcc
>>> Date: Mon Jul 16 11:15:40 2012
>>> New Revision: 160284
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160284&view=rev
>>> Log:
>>> [asan] refactor instrumentation to allow merging the crash callbacks
>>> (not fully implemented yet, no functionality change except the BB order)
>>>
>>
>> Uh... BB order is incredibly sensitive and significant, as we have
>> learned... =/ I'm beginning to wish we had kept my patch in now and built
>> upon it, as I think its simplifications would have been nice here.
>>
> That was not a choice unfortunately, as red bots will freeze our other
> work :(
>
>
>
>>
>> Should I go ahead and provide a patch that introduces probabilities so at
>> least performance remains stable as the basic block order shifts?
>>
>
> mmm. I am in the middle of more refactoring...
> I can add the probabilities myself (based on your patch), likely tomorrow.
>
>
>>
>>
>>> Modified:
>>>     llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>>     llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>>
>>> Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=160284&r1=160283&r2=160284&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> (original)
>>> +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon
>>> Jul 16 11:15:40 2012
>>> @@ -72,6 +72,9 @@
>>>  static const int kAsanStackRightRedzoneMagic = 0xf3;
>>>  static const int kAsanStackPartialRedzoneMagic = 0xf4;
>>>
>>> +// Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>>> +static const size_t kNumberOfAccessSizes = 5;
>>> +
>>>  // Command-line flags.
>>>
>>>  // This flag may need to be replaced with -f[no-]asan-reads.
>>> @@ -82,7 +85,10 @@
>>>  static cl::opt<bool> ClInstrumentAtomics("asan-instrument-atomics",
>>>         cl::desc("instrument atomic instructions (rmw, cmpxchg)"),
>>>         cl::Hidden, cl::init(true));
>>> -// This flags limits the number of instructions to be instrumented
>>> +static cl::opt<bool> ClMergeCallbacks("asan-merge-callbacks",
>>> +       cl::desc("merge __asan_report_ callbacks to create fewer BBs"),
>>> +       cl::Hidden, cl::init(false));
>>> +// This flag limits the number of instructions to be instrumented
>>>  // in any given BB. Normally, this should be set to unlimited (INT_MAX),
>>>  // but due to http://llvm.org/bugs/show_bug.cgi?id=12652 we temporary
>>>  // set it to 10000.
>>> @@ -138,18 +144,33 @@
>>>
>>>  namespace {
>>>
>>> +/// An object of this type is created while instrumenting every
>>> function.
>>> +struct AsanFunctionContext {
>>> +  AsanFunctionContext() {
>>> +    memset(this, 0, sizeof(*this));
>>>
>>
>> This seems like a hack... Is there some reason it was necessary? That is,
>> why not:
>>
>> AsanFunctionContext() : CrashBlock() {}
>>
>
> Hm. Yea, should work. Will change in the next patch.
>
>
>>
>> Which should do the same thing?
>>
>>
>>> +  }
>>> +
>>> +  // These are initially zero. If we require at least one call to
>>> +  // __asan_report_{read,write}{1,2,4,8,16}, an appropriate BB is
>>> created.
>>> +  BasicBlock *CrashBlock[2][kNumberOfAccessSizes];
>>> +};
>>> +
>>>  /// AddressSanitizer: instrument the code in module to find memory bugs.
>>>  struct AddressSanitizer : public ModulePass {
>>>    AddressSanitizer();
>>>    virtual const char *getPassName() const;
>>> -  void instrumentMop(Instruction *I);
>>> -  void instrumentAddress(Instruction *OrigIns, IRBuilder<> &IRB,
>>> +  void instrumentMop(AsanFunctionContext &AFC, Instruction *I);
>>> +  void instrumentAddress(AsanFunctionContext &AFC,
>>> +                         Instruction *OrigIns, IRBuilder<> &IRB,
>>>                           Value *Addr, uint32_t TypeSize, bool IsWrite);
>>> -  Instruction *generateCrashCode(IRBuilder<> &IRB, Value *Addr,
>>> +  Value *createSlowPathCmp(IRBuilder<> &IRB, Value *AddrLong,
>>> +                           Value *ShadowValue, uint32_t TypeSize);
>>> +  Instruction *generateCrashCode(BasicBlock *BB, Value *Addr,
>>>                                   bool IsWrite, uint32_t TypeSize);
>>> -  bool instrumentMemIntrinsic(MemIntrinsic *MI);
>>> -  void instrumentMemIntrinsicParam(Instruction *OrigIns, Value *Addr,
>>> -                                  Value *Size,
>>> +  bool instrumentMemIntrinsic(AsanFunctionContext &AFC, MemIntrinsic
>>> *MI);
>>> +  void instrumentMemIntrinsicParam(AsanFunctionContext &AFC,
>>> +                                   Instruction *OrigIns, Value *Addr,
>>> +                                   Value *Size,
>>>
>>
>> All of this churn makes me think that this pass should really be a
>> function pass rather than a module pass, potentially depending on a module
>> pass running first. Would such a split make sense?
>>
>
> Yes, definitely. That's what we already have in tsan.
>
>
>> Doesn't have to happen now, but I think it would make the organization of
>> the instrumentation passes much more clear.
>>
>> Eventually, it will also have a performance impact -- there is some hope
>> that we will be able to parallelize function passes. Anyways, that's a
>> long-term thing.
>>
>>
>>>                                     Instruction *InsertBefore, bool
>>> IsWrite);
>>>    Value *memToShadow(Value *Shadow, IRBuilder<> &IRB);
>>>    bool handleFunction(Module &M, Function &F);
>>> @@ -192,11 +213,10 @@
>>>    Function *AsanInitFunction;
>>>    Instruction *CtorInsertBefore;
>>>    OwningPtr<FunctionBlackList> BL;
>>> -  // Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>>> -  static const size_t kNumberOfAccessSizes = 5;
>>>    // This array is indexed by AccessIsWrite and log2(AccessSize).
>>>    Function *AsanErrorCallback[2][kNumberOfAccessSizes];
>>>  };
>>> +
>>>  }  // namespace
>>>
>>>  char AddressSanitizer::ID = 0;
>>> @@ -230,19 +250,24 @@
>>>  //     ThenBlock
>>>  //   Tail
>>>  //
>>> -// Returns the ThenBlock's terminator.
>>> -static BranchInst *splitBlockAndInsertIfThen(Value *Cmp) {
>>> +// If ThenBlock is zero, a new block is created and its terminator is
>>> returned.
>>> +// Otherwize NULL is returned.
>>> +static BranchInst *splitBlockAndInsertIfThen(Value *Cmp,
>>> +                                             BasicBlock *ThenBlock = 0)
>>> {
>>>    Instruction *SplitBefore = cast<Instruction>(Cmp)->getNextNode();
>>>    BasicBlock *Head = SplitBefore->getParent();
>>>    BasicBlock *Tail = Head->splitBasicBlock(SplitBefore);
>>>    TerminatorInst *HeadOldTerm = Head->getTerminator();
>>> -  LLVMContext &C = Head->getParent()->getParent()->getContext();
>>> -  BasicBlock *ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>> +  BranchInst *CheckTerm = NULL;
>>>
>>
>> LLVM style is to use '0' rather than NULL....
>>
>
> Hm? I can't see such a rule in http://llvm.org/docs/CodingStandards.html (not
> that I care much)
>

Lots of the little stuff isn't there... not that I care much either... ;]


>
>
>>
>>
>>> +  if (!ThenBlock) {
>>> +    LLVMContext &C = Head->getParent()->getParent()->getContext();
>>>
>>
>> This should really just be a method, it would simplify stuff such as
>> getting the context.
>>
>
> It was a method, and then I've made it a function and wanted to use it in
> msan. Oh my...
>

I think we're getting close to needing a CFGBuilder which encapsulates a
lot of convenience logic for building of these structures...


>
>
>>
>>
>>> +    ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>> +    CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>> +  }
>>>    BranchInst *HeadNewTerm =
>>>      BranchInst::Create(/*ifTrue*/ThenBlock, /*ifFalse*/Tail, Cmp);
>>>    ReplaceInstWithInst(HeadOldTerm, HeadNewTerm);
>>>
>>> -  BranchInst *CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>>    return CheckTerm;
>>>  }
>>>
>>> @@ -256,12 +281,13 @@
>>>                                                 MappingOffset));
>>>  }
>>>
>>> -void AddressSanitizer::instrumentMemIntrinsicParam(Instruction *OrigIns,
>>> +void AddressSanitizer::instrumentMemIntrinsicParam(
>>> +    AsanFunctionContext &AFC, Instruction *OrigIns,
>>>      Value *Addr, Value *Size, Instruction *InsertBefore, bool IsWrite) {
>>>    // Check the first byte.
>>>    {
>>>      IRBuilder<> IRB(InsertBefore);
>>> -    instrumentAddress(OrigIns, IRB, Addr, 8, IsWrite);
>>> +    instrumentAddress(AFC, OrigIns, IRB, Addr, 8, IsWrite);
>>>    }
>>>    // Check the last byte.
>>>    {
>>> @@ -271,12 +297,13 @@
>>>      SizeMinusOne = IRB.CreateIntCast(SizeMinusOne, IntptrTy, false);
>>>      Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>>>      Value *AddrPlusSizeMinisOne = IRB.CreateAdd(AddrLong, SizeMinusOne);
>>> -    instrumentAddress(OrigIns, IRB, AddrPlusSizeMinisOne, 8, IsWrite);
>>> +    instrumentAddress(AFC, OrigIns, IRB, AddrPlusSizeMinisOne, 8,
>>> IsWrite);
>>>    }
>>>  }
>>>
>>>  // Instrument memset/memmove/memcpy
>>> -bool AddressSanitizer::instrumentMemIntrinsic(MemIntrinsic *MI) {
>>> +bool AddressSanitizer::instrumentMemIntrinsic(AsanFunctionContext &AFC,
>>> +                                              MemIntrinsic *MI) {
>>>    Value *Dst = MI->getDest();
>>>    MemTransferInst *MemTran = dyn_cast<MemTransferInst>(MI);
>>>    Value *Src = MemTran ? MemTran->getSource() : NULL;
>>> @@ -295,9 +322,9 @@
>>>      InsertBefore = splitBlockAndInsertIfThen(Cmp);
>>>    }
>>>
>>> -  instrumentMemIntrinsicParam(MI, Dst, Length, InsertBefore, true);
>>> +  instrumentMemIntrinsicParam(AFC, MI, Dst, Length, InsertBefore, true);
>>>    if (Src)
>>> -    instrumentMemIntrinsicParam(MI, Src, Length, InsertBefore, false);
>>> +    instrumentMemIntrinsicParam(AFC, MI, Src, Length, InsertBefore,
>>> false);
>>>    return true;
>>>  }
>>>
>>> @@ -327,7 +354,7 @@
>>>    return NULL;
>>>  }
>>>
>>> -void AddressSanitizer::instrumentMop(Instruction *I) {
>>> +void AddressSanitizer::instrumentMop(AsanFunctionContext &AFC,
>>> Instruction *I) {
>>>    bool IsWrite;
>>>    Value *Addr = isInterestingMemoryAccess(I, &IsWrite);
>>>    assert(Addr);
>>> @@ -348,7 +375,7 @@
>>>    }
>>>
>>>    IRBuilder<> IRB(I);
>>> -  instrumentAddress(I, IRB, Addr, TypeSize, IsWrite);
>>> +  instrumentAddress(AFC, I, IRB, Addr, TypeSize, IsWrite);
>>>  }
>>>
>>>  // Validate the result of Module::getOrInsertFunction called for an
>>> interface
>>> @@ -363,7 +390,8 @@
>>>  }
>>>
>>>  Instruction *AddressSanitizer::generateCrashCode(
>>> -    IRBuilder<> &IRB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>>> +    BasicBlock *BB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>>> +  IRBuilder<> IRB(BB->getFirstNonPHI());
>>>    size_t AccessSizeIndex = CountTrailingZeros_32(TypeSize / 8);
>>>    assert(AccessSizeIndex < kNumberOfAccessSizes);
>>>    CallInst *Call =
>>> IRB.CreateCall(AsanErrorCallback[IsWrite][AccessSizeIndex],
>>> @@ -372,7 +400,26 @@
>>>    return Call;
>>>  }
>>>
>>> -void AddressSanitizer::instrumentAddress(Instruction *OrigIns,
>>> +Value * AddressSanitizer::createSlowPathCmp(IRBuilder<> &IRB, Value
>>> *AddrLong,
>>>
>>
>> No ' ' on both sides of the '*' here: Value *AddressSanitizer::...
>>
> Thanks, will fix with next patch.
>
>
>>
>>> +                                            Value *ShadowValue,
>>> +                                            uint32_t TypeSize) {
>>> +  size_t Granularity = 1 << MappingScale;
>>> +  // Addr & (Granularity - 1)
>>> +  Value *LastAccessedByte = IRB.CreateAnd(
>>> +      AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>> +  // (Addr & (Granularity - 1)) + size - 1
>>> +  if (TypeSize / 8 > 1)
>>> +    LastAccessedByte = IRB.CreateAdd(
>>> +        LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 - 1));
>>> +  // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>> +  LastAccessedByte = IRB.CreateIntCast(
>>> +      LastAccessedByte, IRB.getInt8Ty(), false);
>>> +  // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>> +  return IRB.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>> +}
>>> +
>>> +void AddressSanitizer::instrumentAddress(AsanFunctionContext &AFC,
>>> +                                         Instruction *OrigIns,
>>>                                           IRBuilder<> &IRB, Value *Addr,
>>>                                           uint32_t TypeSize, bool
>>> IsWrite) {
>>>    Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>>> @@ -387,31 +434,27 @@
>>>
>>>    Value *Cmp = IRB.CreateICmpNE(ShadowValue, CmpVal);
>>>
>>> -  Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>> -  IRBuilder<> IRB2(CheckTerm);
>>> +  BasicBlock *CrashBlock = NULL;
>>> +  if (ClMergeCallbacks) {
>>> +    llvm_unreachable("unimplemented yet");
>>> +  } else {
>>> +    CrashBlock = BasicBlock::Create(*C, "crash_bb",
>>> +                                    OrigIns->getParent()->getParent());
>>> +    new UnreachableInst(*C, CrashBlock);
>>> +    Instruction *Crash =
>>> +        generateCrashCode(CrashBlock, AddrLong, IsWrite, TypeSize);
>>> +    Crash->setDebugLoc(OrigIns->getDebugLoc());
>>> +  }
>>>
>>>    size_t Granularity = 1 << MappingScale;
>>>    if (TypeSize < 8 * Granularity) {
>>> -    // Addr & (Granularity - 1)
>>> -    Value *LastAccessedByte = IRB2.CreateAnd(
>>> -        AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>> -    // (Addr & (Granularity - 1)) + size - 1
>>> -    if (TypeSize / 8 > 1)
>>> -      LastAccessedByte = IRB2.CreateAdd(
>>> -          LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 -
>>> 1));
>>> -    // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>> -    LastAccessedByte = IRB2.CreateIntCast(
>>> -        LastAccessedByte, IRB.getInt8Ty(), false);
>>> -    // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>> -    Value *Cmp2 = IRB2.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>> -
>>> -    CheckTerm = splitBlockAndInsertIfThen(Cmp2);
>>> -  }
>>> -
>>> -  IRBuilder<> IRB1(CheckTerm);
>>> -  Instruction *Crash = generateCrashCode(IRB1, AddrLong, IsWrite,
>>> TypeSize);
>>> -  Crash->setDebugLoc(OrigIns->getDebugLoc());
>>> -  ReplaceInstWithInst(CheckTerm, new UnreachableInst(*C));
>>> +    Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>> +    IRB.SetInsertPoint(CheckTerm);
>>> +    Value *Cmp2 = createSlowPathCmp(IRB, AddrLong, ShadowValue,
>>> TypeSize);
>>> +    splitBlockAndInsertIfThen(Cmp2, CrashBlock);
>>> +  } else {
>>> +    splitBlockAndInsertIfThen(Cmp, CrashBlock);
>>> +  }
>>>  }
>>>
>>>  // This function replaces all global variables with new variables that
>>> have
>>> @@ -733,6 +776,8 @@
>>>      }
>>>    }
>>>
>>> +  AsanFunctionContext AFC;
>>> +
>>>    // Instrument.
>>>    int NumInstrumented = 0;
>>>    for (size_t i = 0, n = ToInstrument.size(); i != n; i++) {
>>> @@ -740,9 +785,9 @@
>>>      if (ClDebugMin < 0 || ClDebugMax < 0 ||
>>>          (NumInstrumented >= ClDebugMin && NumInstrumented <=
>>> ClDebugMax)) {
>>>        if (isInterestingMemoryAccess(Inst, &IsWrite))
>>> -        instrumentMop(Inst);
>>> +        instrumentMop(AFC, Inst);
>>>        else
>>> -        instrumentMemIntrinsic(cast<MemIntrinsic>(Inst));
>>> +        instrumentMemIntrinsic(AFC, cast<MemIntrinsic>(Inst));
>>>      }
>>>      NumInstrumented++;
>>>    }
>>>
>>> Modified: llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll?rev=160284&r1=160283&r2=160284&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll (original)
>>> +++ llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll Mon Jul 16
>>> 11:15:40 2012
>>> @@ -20,6 +20,10 @@
>>>  ; to the end of the function.
>>>  ; CHECK:   %tmp1 = load i32* %a
>>>  ; CHECK:   ret i32 %tmp1
>>> +
>>> +; The crash block reports the error.
>>> +; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>> +; CHECK:   unreachable
>>>  ;
>>>  ; First instrumentation block refines the shadow test.
>>>  ; CHECK:   and i64 %[[LOAD_ADDR]], 7
>>> @@ -28,9 +32,6 @@
>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[LOAD_SHADOW]]
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; Final instrumentation block reports the error.
>>> -; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>> -; CHECK:   unreachable
>>>
>>>  entry:
>>>    %tmp1 = load i32* %a
>>> @@ -53,6 +54,10 @@
>>>  ; CHECK:   store i32 42, i32* %a
>>>  ; CHECK:   ret void
>>>  ;
>>> +; The crash block reports the error.
>>> +; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>>> +; CHECK:   unreachable
>>> +;
>>>  ; First instrumentation block refines the shadow test.
>>>  ; CHECK:   and i64 %[[STORE_ADDR]], 7
>>>  ; CHECK:   add i64 %{{.*}}, 3
>>> @@ -60,9 +65,6 @@
>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[STORE_SHADOW]]
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; Final instrumentation block reports the error.
>>> -; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>>> -; CHECK:   unreachable
>>>
>>>  entry:
>>>    store i32 42, i32* %a
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120716/e9e58915/attachment.html>


More information about the llvm-commits mailing list