[llvm-commits] [llvm] r160254 - in /llvm/trunk: lib/Transforms/Instrumentation/AddressSanitizer.cpp test/Instrumentation/AddressSanitizer/basic.ll

Kostya Serebryany kcc at google.com
Mon Jul 16 06:34:56 PDT 2012


This change has been reverted by r160256 and asan bots are green, but the
problem still exists.

Current situation: given N memory access the generated code looks like:
if (*shadow1) __asan_report_read4(addr1); // no-return
if (*shadow2) __asan_report_read4(addr2); // no-return
if (*shadow3) __asan_report_read4(addr3); // no-return

This causes compile-time problems in CG and produces significant amount of
code (which is not hot, but hurts i-cache).
First Chandler's suggestion was to replace this code with

if (*shadow1) { arg = addr1; goto report_read4; }
if (*shadow2) { arg = addr2; goto report_read4; }
if (*shadow3) { arg = addr3; goto report_read4; }
return;
report_read4:
  __asan_report_read4(arg);

The problem here is that now there is a single __asan_report_read4 for
multiple memory accesses and asan run-time can not correctly report the top
stack frame
(it gets the PC using __builtin_return_address and then passes this PC to
the symbolizer).


Second Chandler's idea is to pass the PC as the second parameter
to __asan_report_read4:
if (*shadow1) { arg = addr1; pc = current_pc(); goto report_read4; }
if (*shadow2) { arg = addr2; pc = current_pc(); goto report_read4; }
if (*shadow3) { arg = addr3; pc = current_pc(); goto report_read4; }
return;
report_read4:
  __asan_report_read4(arg, pc);

The problems here:
  1. There is not LLVM intrinsic to get current_pc(). This is probably
solvable.
  2. This adds an instruction on the main path. Hard to say how much it
will cost, but not zero. I'll have to experiment.
  3. On x86_32 this is even worse, because one can not get the PC directly
from the register and this becomes prohibitively slow. We can probably use
a module offset instead of real PC... So, it will have to be reflected in
the appropriate LLVM intrinsic.

So, I'll implement the merging of __asan_report_* functions under a flag
and continue experimenting...

--kcc

On Mon, Jul 16, 2012 at 1:41 PM, Kostya Serebryany <kcc at google.com> wrote:

>
>
> On Mon, Jul 16, 2012 at 1:39 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>> FYI
>> This broke some of the asan tests because multiple calls to
>> __asan_report_* got merged into one.
>> Every call to __asan_report_* has unique debug info and merging them
>> leads to incorrect error messages.
>> Chandler and I are investigating.
>>
>> The simple way to reproduce:
>> % cd llvm/projects/compiler-rt/lib/asan
>> % clang -g  -O1 -faddress-sanitizer output_tests/global-overflow.cc &&
>> ./a.out 2>&1 | ./scripts/asan_symbolize.py /
>> Should print:
>>   #0 0x403420 in main global-overflow.cc:9
>> Instead it prints:
>>   #0 0x403420 in main global-overflow.cc:14
>>
>
> Another test case:
> clang -O1 -S -o - -faddress-sanitizer output_tests/large_func_test.cc |
> grep __asan_report
>
> Before the patch, it would print 21 lines with __asan_report_load4.
> Now it prints only 3 (hm, why not one!?).
>
> --kcc
>
>
>
>>
>>
>> Please avoid making other changes to AddressSanitizer.cpp before we
>> resolve this issue.
>>
>> --kcc
>>
>>
>> On Mon, Jul 16, 2012 at 12:58 PM, Chandler Carruth <chandlerc at gmail.com>wrote:
>>
>>> Author: chandlerc
>>> Date: Mon Jul 16 03:58:53 2012
>>> New Revision: 160254
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160254&view=rev
>>> Log:
>>> Teach AddressSanitizer to create basic blocks in a more natural order.
>>> This is particularly useful to the backend code generators which try to
>>> process things in the incoming function order.
>>>
>>> Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.
>>>
>>> Modified:
>>>     llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>>     llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>>
>>> Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=160254&r1=160253&r2=160254&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>> (original)
>>> +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon
>>> Jul 16 03:58:53 2012
>>> @@ -230,17 +230,17 @@
>>>  // Returns the ThenBlock's terminator.
>>>  static BranchInst *splitBlockAndInsertIfThen(Value *Cmp) {
>>>    Instruction *SplitBefore = cast<Instruction>(Cmp)->getNextNode();
>>> +
>>> +  // Create three basic blocks, with the middle block empty, by
>>> splitting twice.
>>>    BasicBlock *Head = SplitBefore->getParent();
>>> -  BasicBlock *Tail = Head->splitBasicBlock(SplitBefore);
>>> +  BasicBlock *Then = Head->splitBasicBlock(SplitBefore);
>>> +  BasicBlock *Tail = Then->splitBasicBlock(SplitBefore);
>>> +
>>>    TerminatorInst *HeadOldTerm = Head->getTerminator();
>>> -  LLVMContext &C = Head->getParent()->getParent()->getContext();
>>> -  BasicBlock *ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>> -  BranchInst *HeadNewTerm =
>>> -    BranchInst::Create(/*ifTrue*/ThenBlock, /*ifFalse*/Tail, Cmp);
>>> -  ReplaceInstWithInst(HeadOldTerm, HeadNewTerm);
>>> +  IRBuilder<>(HeadOldTerm).CreateCondBr(Cmp, Then, Tail);
>>> +  HeadOldTerm->eraseFromParent();
>>>
>>> -  BranchInst *CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>> -  return CheckTerm;
>>> +  return cast<BranchInst>(Then->getTerminator());
>>>  }
>>>
>>>  Value *AddressSanitizer::memToShadow(Value *Shadow, IRBuilder<> &IRB) {
>>> @@ -387,28 +387,28 @@
>>>    Value *Cmp = IRB.CreateICmpNE(ShadowValue, CmpVal);
>>>
>>>    Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>> -  IRBuilder<> IRB2(CheckTerm);
>>> +  IRB.SetInsertPoint(CheckTerm);
>>>
>>>    size_t Granularity = 1 << MappingScale;
>>>    if (TypeSize < 8 * Granularity) {
>>>      // Addr & (Granularity - 1)
>>> -    Value *LastAccessedByte = IRB2.CreateAnd(
>>> +    Value *LastAccessedByte = IRB.CreateAnd(
>>>          AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>>      // (Addr & (Granularity - 1)) + size - 1
>>>      if (TypeSize / 8 > 1)
>>> -      LastAccessedByte = IRB2.CreateAdd(
>>> +      LastAccessedByte = IRB.CreateAdd(
>>>            LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 -
>>> 1));
>>>      // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>> -    LastAccessedByte = IRB2.CreateIntCast(
>>> +    LastAccessedByte = IRB.CreateIntCast(
>>>          LastAccessedByte, IRB.getInt8Ty(), false);
>>>      // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>> -    Value *Cmp2 = IRB2.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>> +    Value *Cmp2 = IRB.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>>
>>>      CheckTerm = splitBlockAndInsertIfThen(Cmp2);
>>> +    IRB.SetInsertPoint(CheckTerm);
>>>    }
>>>
>>> -  IRBuilder<> IRB1(CheckTerm);
>>> -  Instruction *Crash = generateCrashCode(IRB1, AddrLong, IsWrite,
>>> TypeSize);
>>> +  Instruction *Crash = generateCrashCode(IRB, AddrLong, IsWrite,
>>> TypeSize);
>>>    Crash->setDebugLoc(OrigIns->getDebugLoc());
>>>    ReplaceInstWithInst(CheckTerm, new UnreachableInst(*C));
>>>  }
>>>
>>> Modified: llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll?rev=160254&r1=160253&r2=160254&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll (original)
>>> +++ llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll Mon Jul 16
>>> 03:58:53 2012
>>> @@ -16,11 +16,6 @@
>>>  ; CHECK:   icmp ne i8
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; The actual load comes next because ASan adds the last instrumentation
>>> block
>>> -; to the end of the function.
>>> -; CHECK:   %tmp1 = load i32* %a
>>> -; CHECK:   ret i32 %tmp1
>>> -;
>>>  ; First instrumentation block refines the shadow test.
>>>  ; CHECK:   and i64 %[[LOAD_ADDR]], 7
>>>  ; CHECK:   add i64 %{{.*}}, 3
>>> @@ -28,9 +23,13 @@
>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[LOAD_SHADOW]]
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; Final instrumentation block reports the error.
>>> +; Second instrumentation block reports the error.
>>>  ; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>>  ; CHECK:   unreachable
>>> +;
>>> +; Finally the instrumented load.
>>> +; CHECK:   %tmp1 = load i32* %a
>>> +; CHECK:   ret i32 %tmp1
>>>
>>>  entry:
>>>    %tmp1 = load i32* %a
>>> @@ -48,11 +47,6 @@
>>>  ; CHECK:   icmp ne i8
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; The actual store comes next because ASan adds the last
>>> instrumentation block
>>> -; to the end of the function.
>>> -; CHECK:   store i32 42, i32* %a
>>> -; CHECK:   ret void
>>> -;
>>>  ; First instrumentation block refines the shadow test.
>>>  ; CHECK:   and i64 %[[STORE_ADDR]], 7
>>>  ; CHECK:   add i64 %{{.*}}, 3
>>> @@ -60,9 +54,13 @@
>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[STORE_SHADOW]]
>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>  ;
>>> -; Final instrumentation block reports the error.
>>> +; Second instrumentation block reports the error.
>>>  ; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]]) noreturn
>>>  ; CHECK:   unreachable
>>> +;
>>> +; Finally the instrumented store.
>>> +; CHECK:   store i32 42, i32* %a
>>> +; CHECK:   ret void
>>>
>>>  entry:
>>>    store i32 42, i32* %a
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120716/b674cf0b/attachment.html>


More information about the llvm-commits mailing list