[llvm-commits] [llvm] r160284 - in /llvm/trunk: lib/Transforms/Instrumentation/AddressSanitizer.cpp test/Instrumentation/AddressSanitizer/basic.ll
Chandler Carruth
chandlerc at google.com
Mon Jul 16 10:06:45 PDT 2012
On Mon, Jul 16, 2012 at 9:55 AM, Kostya Serebryany <kcc at google.com> wrote:
>
>
> On Mon, Jul 16, 2012 at 8:37 PM, Chandler Carruth <chandlerc at google.com>wrote:
>
>> On Mon, Jul 16, 2012 at 9:15 AM, Kostya Serebryany <kcc at google.com>wrote:
>>
>>> Author: kcc
>>> Date: Mon Jul 16 11:15:40 2012
>>> New Revision: 160284
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160284&view=rev
>>> Log:
>>> [asan] refactor instrumentation to allow merging the crash callbacks
>>> (not fully implemented yet, no functionality change except the BB order)
>>>
>>
>> Uh... BB order is incredibly sensitive and significant, as we have
>> learned... =/ I'm beginning to wish we had kept my patch in now and built
>> upon it, as I think its simplifications would have been nice here.
>>
> That was not a choice unfortunately, as red bots will freeze our other
> work :(
>
>
>
>>
>> Should I go ahead and provide a patch that introduces probabilities so at
>> least performance remains stable as the basic block order shifts?
>>
>
> mmm. I am in the middle of more refactoring...
> I can add the probabilities myself (based on your patch), likely tomorrow.
>
>
>>
>>
>>> Modified:
>>> llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>>
>>> Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=160284&r1=160283&r2=160284&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> (original)
>>> +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon
>>> Jul 16 11:15:40 2012
>>> @@ -72,6 +72,9 @@
>>> static const int kAsanStackRightRedzoneMagic = 0xf3;
>>> static const int kAsanStackPartialRedzoneMagic = 0xf4;
>>>
>>> +// Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>>> +static const size_t kNumberOfAccessSizes = 5;
>>> +
>>> // Command-line flags.
>>>
>>> // This flag may need to be replaced with -f[no-]asan-reads.
>>> @@ -82,7 +85,10 @@
>>> static cl::opt<bool> ClInstrumentAtomics("asan-instrument-atomics",
>>> cl::desc("instrument atomic instructions (rmw, cmpxchg)"),
>>> cl::Hidden, cl::init(true));
>>> -// This flags limits the number of instructions to be instrumented
>>> +static cl::opt<bool> ClMergeCallbacks("asan-merge-callbacks",
>>> + cl::desc("merge __asan_report_ callbacks to create fewer BBs"),
>>> + cl::Hidden, cl::init(false));
>>> +// This flag limits the number of instructions to be instrumented
>>> // in any given BB. Normally, this should be set to unlimited (INT_MAX),
>>> // but due to http://llvm.org/bugs/show_bug.cgi?id=12652 we temporary
>>> // set it to 10000.
>>> @@ -138,18 +144,33 @@
>>>
>>> namespace {
>>>
>>> +/// An object of this type is created while instrumenting every
>>> function.
>>> +struct AsanFunctionContext {
>>> + AsanFunctionContext() {
>>> + memset(this, 0, sizeof(*this));
>>>
>>
>> This seems like a hack... Is there some reason it was necessary? That is,
>> why not:
>>
>> AsanFunctionContext() : CrashBlock() {}
>>
>
> Hm. Yea, should work. Will change in the next patch.
>
>
>>
>> Which should do the same thing?
>>
>>
>>> + }
>>> +
>>> + // These are initially zero. If we require at least one call to
>>> + // __asan_report_{read,write}{1,2,4,8,16}, an appropriate BB is
>>> created.
>>> + BasicBlock *CrashBlock[2][kNumberOfAccessSizes];
>>> +};
>>> +
>>> /// AddressSanitizer: instrument the code in module to find memory bugs.
>>> struct AddressSanitizer : public ModulePass {
>>> AddressSanitizer();
>>> virtual const char *getPassName() const;
>>> - void instrumentMop(Instruction *I);
>>> - void instrumentAddress(Instruction *OrigIns, IRBuilder<> &IRB,
>>> + void instrumentMop(AsanFunctionContext &AFC, Instruction *I);
>>> + void instrumentAddress(AsanFunctionContext &AFC,
>>> + Instruction *OrigIns, IRBuilder<> &IRB,
>>> Value *Addr, uint32_t TypeSize, bool IsWrite);
>>> - Instruction *generateCrashCode(IRBuilder<> &IRB, Value *Addr,
>>> + Value *createSlowPathCmp(IRBuilder<> &IRB, Value *AddrLong,
>>> + Value *ShadowValue, uint32_t TypeSize);
>>> + Instruction *generateCrashCode(BasicBlock *BB, Value *Addr,
>>> bool IsWrite, uint32_t TypeSize);
>>> - bool instrumentMemIntrinsic(MemIntrinsic *MI);
>>> - void instrumentMemIntrinsicParam(Instruction *OrigIns, Value *Addr,
>>> - Value *Size,
>>> + bool instrumentMemIntrinsic(AsanFunctionContext &AFC, MemIntrinsic
>>> *MI);
>>> + void instrumentMemIntrinsicParam(AsanFunctionContext &AFC,
>>> + Instruction *OrigIns, Value *Addr,
>>> + Value *Size,
>>>
>>
>> All of this churn makes me think that this pass should really be a
>> function pass rather than a module pass, potentially depending on a module
>> pass running first. Would such a split make sense?
>>
>
> Yes, definitely. That's what we already have in tsan.
>
>
>> Doesn't have to happen now, but I think it would make the organization of
>> the instrumentation passes much more clear.
>>
>> Eventually, it will also have a performance impact -- there is some hope
>> that we will be able to parallelize function passes. Anyways, that's a
>> long-term thing.
>>
>>
>>> Instruction *InsertBefore, bool
>>> IsWrite);
>>> Value *memToShadow(Value *Shadow, IRBuilder<> &IRB);
>>> bool handleFunction(Module &M, Function &F);
>>> @@ -192,11 +213,10 @@
>>> Function *AsanInitFunction;
>>> Instruction *CtorInsertBefore;
>>> OwningPtr<FunctionBlackList> BL;
>>> - // Accesses sizes are powers of two: 1, 2, 4, 8, 16.
>>> - static const size_t kNumberOfAccessSizes = 5;
>>> // This array is indexed by AccessIsWrite and log2(AccessSize).
>>> Function *AsanErrorCallback[2][kNumberOfAccessSizes];
>>> };
>>> +
>>> } // namespace
>>>
>>> char AddressSanitizer::ID = 0;
>>> @@ -230,19 +250,24 @@
>>> // ThenBlock
>>> // Tail
>>> //
>>> -// Returns the ThenBlock's terminator.
>>> -static BranchInst *splitBlockAndInsertIfThen(Value *Cmp) {
>>> +// If ThenBlock is zero, a new block is created and its terminator is
>>> returned.
>>> +// Otherwize NULL is returned.
>>> +static BranchInst *splitBlockAndInsertIfThen(Value *Cmp,
>>> + BasicBlock *ThenBlock = 0)
>>> {
>>> Instruction *SplitBefore = cast<Instruction>(Cmp)->getNextNode();
>>> BasicBlock *Head = SplitBefore->getParent();
>>> BasicBlock *Tail = Head->splitBasicBlock(SplitBefore);
>>> TerminatorInst *HeadOldTerm = Head->getTerminator();
>>> - LLVMContext &C = Head->getParent()->getParent()->getContext();
>>> - BasicBlock *ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>> + BranchInst *CheckTerm = NULL;
>>>
>>
>> LLVM style is to use '0' rather than NULL....
>>
>
> Hm? I can't see such a rule in http://llvm.org/docs/CodingStandards.html (not
> that I care much)
>
Lots of the little stuff isn't there... not that I care much either... ;]
>
>
>>
>>
>>> + if (!ThenBlock) {
>>> + LLVMContext &C = Head->getParent()->getParent()->getContext();
>>>
>>
>> This should really just be a method, it would simplify stuff such as
>> getting the context.
>>
>
> It was a method, and then I've made it a function and wanted to use it in
> msan. Oh my...
>
I think we're getting close to needing a CFGBuilder which encapsulates a
lot of convenience logic for building of these structures...
>
>
>>
>>
>>> + ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>> + CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>> + }
>>> BranchInst *HeadNewTerm =
>>> BranchInst::Create(/*ifTrue*/ThenBlock, /*ifFalse*/Tail, Cmp);
>>> ReplaceInstWithInst(HeadOldTerm, HeadNewTerm);
>>>
>>> - BranchInst *CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>> return CheckTerm;
>>> }
>>>
>>> @@ -256,12 +281,13 @@
>>> MappingOffset));
>>> }
>>>
>>> -void AddressSanitizer::instrumentMemIntrinsicParam(Instruction *OrigIns,
>>> +void AddressSanitizer::instrumentMemIntrinsicParam(
>>> + AsanFunctionContext &AFC, Instruction *OrigIns,
>>> Value *Addr, Value *Size, Instruction *InsertBefore, bool IsWrite) {
>>> // Check the first byte.
>>> {
>>> IRBuilder<> IRB(InsertBefore);
>>> - instrumentAddress(OrigIns, IRB, Addr, 8, IsWrite);
>>> + instrumentAddress(AFC, OrigIns, IRB, Addr, 8, IsWrite);
>>> }
>>> // Check the last byte.
>>> {
>>> @@ -271,12 +297,13 @@
>>> SizeMinusOne = IRB.CreateIntCast(SizeMinusOne, IntptrTy, false);
>>> Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>>> Value *AddrPlusSizeMinisOne = IRB.CreateAdd(AddrLong, SizeMinusOne);
>>> - instrumentAddress(OrigIns, IRB, AddrPlusSizeMinisOne, 8, IsWrite);
>>> + instrumentAddress(AFC, OrigIns, IRB, AddrPlusSizeMinisOne, 8,
>>> IsWrite);
>>> }
>>> }
>>>
>>> // Instrument memset/memmove/memcpy
>>> -bool AddressSanitizer::instrumentMemIntrinsic(MemIntrinsic *MI) {
>>> +bool AddressSanitizer::instrumentMemIntrinsic(AsanFunctionContext &AFC,
>>> + MemIntrinsic *MI) {
>>> Value *Dst = MI->getDest();
>>> MemTransferInst *MemTran = dyn_cast<MemTransferInst>(MI);
>>> Value *Src = MemTran ? MemTran->getSource() : NULL;
>>> @@ -295,9 +322,9 @@
>>> InsertBefore = splitBlockAndInsertIfThen(Cmp);
>>> }
>>>
>>> - instrumentMemIntrinsicParam(MI, Dst, Length, InsertBefore, true);
>>> + instrumentMemIntrinsicParam(AFC, MI, Dst, Length, InsertBefore, true);
>>> if (Src)
>>> - instrumentMemIntrinsicParam(MI, Src, Length, InsertBefore, false);
>>> + instrumentMemIntrinsicParam(AFC, MI, Src, Length, InsertBefore,
>>> false);
>>> return true;
>>> }
>>>
>>> @@ -327,7 +354,7 @@
>>> return NULL;
>>> }
>>>
>>> -void AddressSanitizer::instrumentMop(Instruction *I) {
>>> +void AddressSanitizer::instrumentMop(AsanFunctionContext &AFC,
>>> Instruction *I) {
>>> bool IsWrite;
>>> Value *Addr = isInterestingMemoryAccess(I, &IsWrite);
>>> assert(Addr);
>>> @@ -348,7 +375,7 @@
>>> }
>>>
>>> IRBuilder<> IRB(I);
>>> - instrumentAddress(I, IRB, Addr, TypeSize, IsWrite);
>>> + instrumentAddress(AFC, I, IRB, Addr, TypeSize, IsWrite);
>>> }
>>>
>>> // Validate the result of Module::getOrInsertFunction called for an
>>> interface
>>> @@ -363,7 +390,8 @@
>>> }
>>>
>>> Instruction *AddressSanitizer::generateCrashCode(
>>> - IRBuilder<> &IRB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>>> + BasicBlock *BB, Value *Addr, bool IsWrite, uint32_t TypeSize) {
>>> + IRBuilder<> IRB(BB->getFirstNonPHI());
>>> size_t AccessSizeIndex = CountTrailingZeros_32(TypeSize / 8);
>>> assert(AccessSizeIndex < kNumberOfAccessSizes);
>>> CallInst *Call =
>>> IRB.CreateCall(AsanErrorCallback[IsWrite][AccessSizeIndex],
>>> @@ -372,7 +400,26 @@
>>> return Call;
>>> }
>>>
>>> -void AddressSanitizer::instrumentAddress(Instruction *OrigIns,
>>> +Value * AddressSanitizer::createSlowPathCmp(IRBuilder<> &IRB, Value
>>> *AddrLong,
>>>
>>
>> No ' ' on both sides of the '*' here: Value *AddressSanitizer::...
>>
> Thanks, will fix with next patch.
>
>
>>
>>> + Value *ShadowValue,
>>> + uint32_t TypeSize) {
>>> + size_t Granularity = 1 << MappingScale;
>>> + // Addr & (Granularity - 1)
>>> + Value *LastAccessedByte = IRB.CreateAnd(
>>> + AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>> + // (Addr & (Granularity - 1)) + size - 1
>>> + if (TypeSize / 8 > 1)
>>> + LastAccessedByte = IRB.CreateAdd(
>>> + LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 - 1));
>>> + // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>> + LastAccessedByte = IRB.CreateIntCast(
>>> + LastAccessedByte, IRB.getInt8Ty(), false);
>>> + // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>> + return IRB.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>> +}
>>> +
>>> +void AddressSanitizer::instrumentAddress(AsanFunctionContext &AFC,
>>> + Instruction *OrigIns,
>>> IRBuilder<> &IRB, Value *Addr,
>>> uint32_t TypeSize, bool
>>> IsWrite) {
>>> Value *AddrLong = IRB.CreatePointerCast(Addr, IntptrTy);
>>> @@ -387,31 +434,27 @@
>>>
>>> Value *Cmp = IRB.CreateICmpNE(ShadowValue, CmpVal);
>>>
>>> - Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>> - IRBuilder<> IRB2(CheckTerm);
>>> + BasicBlock *CrashBlock = NULL;
>>> + if (ClMergeCallbacks) {
>>> + llvm_unreachable("unimplemented yet");
>>> + } else {
>>> + CrashBlock = BasicBlock::Create(*C, "crash_bb",
>>> + OrigIns->getParent()->getParent());
>>> + new UnreachableInst(*C, CrashBlock);
>>> + Instruction *Crash =
>>> + generateCrashCode(CrashBlock, AddrLong, IsWrite, TypeSize);
>>> + Crash->setDebugLoc(OrigIns->getDebugLoc());
>>> + }
>>>
>>> size_t Granularity = 1 << MappingScale;
>>> if (TypeSize < 8 * Granularity) {
>>> - // Addr & (Granularity - 1)
>>> - Value *LastAccessedByte = IRB2.CreateAnd(
>>> - AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>> - // (Addr & (Granularity - 1)) + size - 1
>>> - if (TypeSize / 8 > 1)
>>> - LastAccessedByte = IRB2.CreateAdd(
>>> - LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 -
>>> 1));
>>> - // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>> - LastAccessedByte = IRB2.CreateIntCast(
>>> - LastAccessedByte, IRB.getInt8Ty(), false);
>>> - // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>> - Value *Cmp2 = IRB2.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>> -
>>> - CheckTerm = splitBlockAndInsertIfThen(Cmp2);
>>> - }
>>> -
>>> - IRBuilder<> IRB1(CheckTerm);
>>> - Instruction *Crash = generateCrashCode(IRB1, AddrLong, IsWrite,
>>> TypeSize);
>>> - Crash->setDebugLoc(OrigIns->getDebugLoc());
>>> - ReplaceInstWithInst(CheckTerm, new UnreachableInst(*C));
>>> + Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>> + IRB.SetInsertPoint(CheckTerm);
>>> + Value *Cmp2 = createSlowPathCmp(IRB, AddrLong, ShadowValue,
>>> TypeSize);
>>> + splitBlockAndInsertIfThen(Cmp2, CrashBlock);
>>> + } else {
>>> + splitBlockAndInsertIfThen(Cmp, CrashBlock);
>>> + }
>>> }
>>>
>>> // This function replaces all global variables with new variables that
>>> have
>>> @@ -733,6 +776,8 @@
>>> }
>>> }
>>>
>>> + AsanFunctionContext AFC;
>>> +
>>> // Instrument.
>>> int NumInstrumented = 0;
>>> for (size_t i = 0, n = ToInstrument.size(); i != n; i++) {
>>> @@ -740,9 +785,9 @@
>>> if (ClDebugMin < 0 || ClDebugMax < 0 ||
>>> (NumInstrumented >= ClDebugMin && NumInstrumented <=
>>> ClDebugMax)) {
>>> if (isInterestingMemoryAccess(Inst, &IsWrite))
>>> - instrumentMop(Inst);
>>> + instrumentMop(AFC, Inst);
>>> else
>>> - instrumentMemIntrinsic(cast<MemIntrinsic>(Inst));
>>> + instrumentMemIntrinsic(AFC, cast<MemIntrinsic>(Inst));
>>> }
>>> NumInstrumented++;
>>> }
>>>
>>> Modified: llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll?rev=160284&r1=160283&r2=160284&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll (original)
>>> +++ llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll Mon Jul 16
>>> 11:15:40 2012
>>> @@ -20,6 +20,10 @@
>>> ; to the end of the function.
>>> ; CHECK: %tmp1 = load i32* %a
>>> ; CHECK: ret i32 %tmp1
>>> +
>>> +; The crash block reports the error.
>>> +; CHECK: call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>> +; CHECK: unreachable
>>> ;
>>> ; First instrumentation block refines the shadow test.
>>> ; CHECK: and i64 %[[LOAD_ADDR]], 7
>>> @@ -28,9 +32,6 @@
>>> ; CHECK: icmp sge i8 %{{.*}}, %[[LOAD_SHADOW]]
>>> ; CHECK: br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>> ;
>>> -; Final instrumentation block reports the error.
>>> -; CHECK: call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>> -; CHECK: unreachable
>>>
>>> entry:
>>> %tmp1 = load i32* %a
>>> @@ -53,6 +54,10 @@
>>> ; CHECK: store i32 42, i32* %a
>>> ; CHECK: ret void
>>> ;
>>> +; The crash block reports the error.
>>> +; CHECK: call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>>> +; CHECK: unreachable
>>> +;
>>> ; First instrumentation block refines the shadow test.
>>> ; CHECK: and i64 %[[STORE_ADDR]], 7
>>> ; CHECK: add i64 %{{.*}}, 3
>>> @@ -60,9 +65,6 @@
>>> ; CHECK: icmp sge i8 %{{.*}}, %[[STORE_SHADOW]]
>>> ; CHECK: br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>> ;
>>> -; Final instrumentation block reports the error.
>>> -; CHECK: call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>>> -; CHECK: unreachable
>>>
>>> entry:
>>> store i32 42, i32* %a
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120716/e9e58915/attachment.html>
More information about the llvm-commits
mailing list