[llvm-commits] [llvm] r160254 - in /llvm/trunk: lib/Transforms/Instrumentation/AddressSanitizer.cpp test/Instrumentation/AddressSanitizer/basic.ll

Tue Jul 17 04:53:47 PDT 2012

Chandler,
I've committed the changes that merge asan crash callbacks (under a flag).

% cat ~/tmp/store2.cc
void foo(int *a, int *b) {
  *a = 0;
  *b = 1;
}
% clang -O2 -faddress-sanitizer -S -o - -mllvm -asan-merge-callbacks=1
 ~/tmp/store2.cc -emit-llvm 2>&1
...
crash_bb-w-4:                                     ; preds = %16, %21
  %14 = phi i64 [ %7, %21 ], [ %0, %16 ]
  %15 = phi i64 [ 846930886, %21 ], [ 1804289383, %16 ]
  call void @__asan_report_store4(i64 %14, i64 %15) noreturn nounwind
  unreachable
...

This produces 5% bigger code:
1.08 400.perlbench
1.08 401.bzip2
1.07 403.gcc
1.00 429.mcf
1.01 445.gobmk
1.08 456.hmmer
1.03 458.sjeng
1.02 462.libquantum
1.08 464.h264ref
1.05 473.astar
1.07 483.xalancbmk
1.04 433.milc
1.08 444.namd
1.07 447.dealII
1.08 450.soplex
1.07 453.povray
1.00 470.lbm
1.06 482.sphinx3

This also produces 1%-4% slower code (I've run only 3 benchmarks)
       400.perlbench,      1061.00,      1072.00,         1.01
           401.bzip2,       910.00,       923.00,         1.01
             403.gcc,       623.00,       645.00,         1.04

Besides, the reproducer from http://llvm.org/bugs/show_bug.cgi?id=13225 builds
3x slower!
I guess that these extra PHI nodes produce much more stress for CG than the
extra BBs.
% time clang -O2 -c  -w -mllvm -asan-merge-callbacks=0  -mllvm
-asan-max-ins-per-bb=99999999   -faddress-sanitizer big-global.cc
TIME: real: 8.970; user: 8.690; system: 0.100
% time clang -O2 -c  -w -mllvm -asan-merge-callbacks=1  -mllvm
-asan-max-ins-per-bb=99999999   -faddress-sanitizer big-global.cc
TIME: real: 25.629; user: 25.350; system: 0.080
%


The current implementation passes a random constant instead of the PC to
the callback.
Getting the real PC might end up even more expensive.
We can not get the PC on x86_32 at all, so we will need non-trivial changes
in the run-time to accept the module offset.

Finally, this adds 150 lines to the asan compiler module.

So, I am tempted to rollback all these changes.
WDYT?

There is another problem: we are avoiding the callback merging by sheer
luck.
A tiny change in splitBlockAndInsertIfThen will cause it to happen and will
break asan error reports:

--- lib/Transforms/Instrumentation/AddressSanitizer.cpp (revision 160361)
+++ lib/Transforms/Instrumentation/AddressSanitizer.cpp (working copy)
@@ -276,7 +276,7 @@
   BranchInst *CheckTerm = 0;
   if (!ThenBlock) {
     LLVMContext &C = Head->getParent()->getParent()->getContext();
-    ThenBlock = BasicBlock::Create(C, "", Head->getParent());
+    ThenBlock = BasicBlock::Create(C, "", Head->getParent(), Tail);
     CheckTerm = BranchInst::Create(Tail, ThenBlock);
   }
   BranchInst *HeadNewTerm =

Any suggestions?

--kcc

On Mon, Jul 16, 2012 at 5:34 PM, Kostya Serebryany <kcc at google.com> wrote:

> This change has been reverted by r160256 and asan bots are green, but the
> problem still exists.
>
> Current situation: given N memory access the generated code looks like:
> if (*shadow1) __asan_report_read4(addr1); // no-return
> if (*shadow2) __asan_report_read4(addr2); // no-return
> if (*shadow3) __asan_report_read4(addr3); // no-return
>
> This causes compile-time problems in CG and produces significant amount of
> code (which is not hot, but hurts i-cache).
> First Chandler's suggestion was to replace this code with
>
> if (*shadow1) { arg = addr1; goto report_read4; }
> if (*shadow2) { arg = addr2; goto report_read4; }
> if (*shadow3) { arg = addr3; goto report_read4; }
> return;
> report_read4:
>   __asan_report_read4(arg);
>
> The problem here is that now there is a single __asan_report_read4 for
> multiple memory accesses and asan run-time can not correctly report the top
> stack frame
> (it gets the PC using __builtin_return_address and then passes this PC to
> the symbolizer).
>
>
> Second Chandler's idea is to pass the PC as the second parameter
> to __asan_report_read4:
> if (*shadow1) { arg = addr1; pc = current_pc(); goto report_read4; }
> if (*shadow2) { arg = addr2; pc = current_pc(); goto report_read4; }
> if (*shadow3) { arg = addr3; pc = current_pc(); goto report_read4; }
> return;
> report_read4:
>   __asan_report_read4(arg, pc);
>
> The problems here:
>   1. There is not LLVM intrinsic to get current_pc(). This is probably
> solvable.
>   2. This adds an instruction on the main path. Hard to say how much it
> will cost, but not zero. I'll have to experiment.
>   3. On x86_32 this is even worse, because one can not get the PC directly
> from the register and this becomes prohibitively slow. We can probably use
> a module offset instead of real PC... So, it will have to be reflected in
> the appropriate LLVM intrinsic.
>
> So, I'll implement the merging of __asan_report_* functions under a flag
> and continue experimenting...
>
> --kcc
>
> On Mon, Jul 16, 2012 at 1:41 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>>
>>
>> On Mon, Jul 16, 2012 at 1:39 PM, Kostya Serebryany <kcc at google.com>wrote:
>>
>>> FYI
>>> This broke some of the asan tests because multiple calls to
>>> __asan_report_* got merged into one.
>>> Every call to __asan_report_* has unique debug info and merging them
>>> leads to incorrect error messages.
>>> Chandler and I are investigating.
>>>
>>> The simple way to reproduce:
>>> % cd llvm/projects/compiler-rt/lib/asan
>>> % clang -g  -O1 -faddress-sanitizer output_tests/global-overflow.cc &&
>>> ./a.out 2>&1 | ./scripts/asan_symbolize.py /
>>> Should print:
>>>   #0 0x403420 in main global-overflow.cc:9
>>> Instead it prints:
>>>   #0 0x403420 in main global-overflow.cc:14
>>>
>>
>> Another test case:
>> clang -O1 -S -o - -faddress-sanitizer output_tests/large_func_test.cc |
>> grep __asan_report
>>
>> Before the patch, it would print 21 lines with __asan_report_load4.
>> Now it prints only 3 (hm, why not one!?).
>>
>> --kcc
>>
>>
>>
>>>
>>>
>>> Please avoid making other changes to AddressSanitizer.cpp before we
>>> resolve this issue.
>>>
>>> --kcc
>>>
>>>
>>> On Mon, Jul 16, 2012 at 12:58 PM, Chandler Carruth <chandlerc at gmail.com>wrote:
>>>
>>>> Author: chandlerc
>>>> Date: Mon Jul 16 03:58:53 2012
>>>> New Revision: 160254
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=160254&view=rev
>>>> Log:
>>>> Teach AddressSanitizer to create basic blocks in a more natural order.
>>>> This is particularly useful to the backend code generators which try to
>>>> process things in the incoming function order.
>>>>
>>>> Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.
>>>>
>>>> Modified:
>>>>     llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>>>     llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>>>
>>>> Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=160254&r1=160253&r2=160254&view=diff
>>>>
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
>>>> (original)
>>>> +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon
>>>> Jul 16 03:58:53 2012
>>>> @@ -230,17 +230,17 @@
>>>>  // Returns the ThenBlock's terminator.
>>>>  static BranchInst *splitBlockAndInsertIfThen(Value *Cmp) {
>>>>    Instruction *SplitBefore = cast<Instruction>(Cmp)->getNextNode();
>>>> +
>>>> +  // Create three basic blocks, with the middle block empty, by
>>>> splitting twice.
>>>>    BasicBlock *Head = SplitBefore->getParent();
>>>> -  BasicBlock *Tail = Head->splitBasicBlock(SplitBefore);
>>>> +  BasicBlock *Then = Head->splitBasicBlock(SplitBefore);
>>>> +  BasicBlock *Tail = Then->splitBasicBlock(SplitBefore);
>>>> +
>>>>    TerminatorInst *HeadOldTerm = Head->getTerminator();
>>>> -  LLVMContext &C = Head->getParent()->getParent()->getContext();
>>>> -  BasicBlock *ThenBlock = BasicBlock::Create(C, "", Head->getParent());
>>>> -  BranchInst *HeadNewTerm =
>>>> -    BranchInst::Create(/*ifTrue*/ThenBlock, /*ifFalse*/Tail, Cmp);
>>>> -  ReplaceInstWithInst(HeadOldTerm, HeadNewTerm);
>>>> +  IRBuilder<>(HeadOldTerm).CreateCondBr(Cmp, Then, Tail);
>>>> +  HeadOldTerm->eraseFromParent();
>>>>
>>>> -  BranchInst *CheckTerm = BranchInst::Create(Tail, ThenBlock);
>>>> -  return CheckTerm;
>>>> +  return cast<BranchInst>(Then->getTerminator());
>>>>  }
>>>>
>>>>  Value *AddressSanitizer::memToShadow(Value *Shadow, IRBuilder<> &IRB) {
>>>> @@ -387,28 +387,28 @@
>>>>    Value *Cmp = IRB.CreateICmpNE(ShadowValue, CmpVal);
>>>>
>>>>    Instruction *CheckTerm = splitBlockAndInsertIfThen(Cmp);
>>>> -  IRBuilder<> IRB2(CheckTerm);
>>>> +  IRB.SetInsertPoint(CheckTerm);
>>>>
>>>>    size_t Granularity = 1 << MappingScale;
>>>>    if (TypeSize < 8 * Granularity) {
>>>>      // Addr & (Granularity - 1)
>>>> -    Value *LastAccessedByte = IRB2.CreateAnd(
>>>> +    Value *LastAccessedByte = IRB.CreateAnd(
>>>>          AddrLong, ConstantInt::get(IntptrTy, Granularity - 1));
>>>>      // (Addr & (Granularity - 1)) + size - 1
>>>>      if (TypeSize / 8 > 1)
>>>> -      LastAccessedByte = IRB2.CreateAdd(
>>>> +      LastAccessedByte = IRB.CreateAdd(
>>>>            LastAccessedByte, ConstantInt::get(IntptrTy, TypeSize / 8 -
>>>> 1));
>>>>      // (uint8_t) ((Addr & (Granularity-1)) + size - 1)
>>>> -    LastAccessedByte = IRB2.CreateIntCast(
>>>> +    LastAccessedByte = IRB.CreateIntCast(
>>>>          LastAccessedByte, IRB.getInt8Ty(), false);
>>>>      // ((uint8_t) ((Addr & (Granularity-1)) + size - 1)) >= ShadowValue
>>>> -    Value *Cmp2 = IRB2.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>>> +    Value *Cmp2 = IRB.CreateICmpSGE(LastAccessedByte, ShadowValue);
>>>>
>>>>      CheckTerm = splitBlockAndInsertIfThen(Cmp2);
>>>> +    IRB.SetInsertPoint(CheckTerm);
>>>>    }
>>>>
>>>> -  IRBuilder<> IRB1(CheckTerm);
>>>> -  Instruction *Crash = generateCrashCode(IRB1, AddrLong, IsWrite,
>>>> TypeSize);
>>>> +  Instruction *Crash = generateCrashCode(IRB, AddrLong, IsWrite,
>>>> TypeSize);
>>>>    Crash->setDebugLoc(OrigIns->getDebugLoc());
>>>>    ReplaceInstWithInst(CheckTerm, new UnreachableInst(*C));
>>>>  }
>>>>
>>>> Modified: llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll?rev=160254&r1=160253&r2=160254&view=diff
>>>>
>>>> ==============================================================================
>>>> --- llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll (original)
>>>> +++ llvm/trunk/test/Instrumentation/AddressSanitizer/basic.ll Mon Jul
>>>> 16 03:58:53 2012
>>>> @@ -16,11 +16,6 @@
>>>>  ; CHECK:   icmp ne i8
>>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>>  ;
>>>> -; The actual load comes next because ASan adds the last
>>>> instrumentation block
>>>> -; to the end of the function.
>>>> -; CHECK:   %tmp1 = load i32* %a
>>>> -; CHECK:   ret i32 %tmp1
>>>> -;
>>>>  ; First instrumentation block refines the shadow test.
>>>>  ; CHECK:   and i64 %[[LOAD_ADDR]], 7
>>>>  ; CHECK:   add i64 %{{.*}}, 3
>>>> @@ -28,9 +23,13 @@
>>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[LOAD_SHADOW]]
>>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>>  ;
>>>> -; Final instrumentation block reports the error.
>>>> +; Second instrumentation block reports the error.
>>>>  ; CHECK:   call void @__asan_report_load4(i64 %[[LOAD_ADDR]]) noreturn
>>>>  ; CHECK:   unreachable
>>>> +;
>>>> +; Finally the instrumented load.
>>>> +; CHECK:   %tmp1 = load i32* %a
>>>> +; CHECK:   ret i32 %tmp1
>>>>
>>>>  entry:
>>>>    %tmp1 = load i32* %a
>>>> @@ -48,11 +47,6 @@
>>>>  ; CHECK:   icmp ne i8
>>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>>  ;
>>>> -; The actual store comes next because ASan adds the last
>>>> instrumentation block
>>>> -; to the end of the function.
>>>> -; CHECK:   store i32 42, i32* %a
>>>> -; CHECK:   ret void
>>>> -;
>>>>  ; First instrumentation block refines the shadow test.
>>>>  ; CHECK:   and i64 %[[STORE_ADDR]], 7
>>>>  ; CHECK:   add i64 %{{.*}}, 3
>>>> @@ -60,9 +54,13 @@
>>>>  ; CHECK:   icmp sge i8 %{{.*}}, %[[STORE_SHADOW]]
>>>>  ; CHECK:   br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
>>>>  ;
>>>> -; Final instrumentation block reports the error.
>>>> +; Second instrumentation block reports the error.
>>>>  ; CHECK:   call void @__asan_report_store4(i64 %[[STORE_ADDR]])
>>>> noreturn
>>>>  ; CHECK:   unreachable
>>>> +;
>>>> +; Finally the instrumented store.
>>>> +; CHECK:   store i32 42, i32* %a
>>>> +; CHECK:   ret void
>>>>
>>>>  entry:
>>>>    store i32 42, i32* %a
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120717/420cd3f4/attachment.html>