[llvm-commits] AddressSanitizer, first patch

Thu Sep 15 13:31:03 PDT 2011

Hi,

I haven't heard any more comments.
Eric Christopher confirmed that the only concern is having the
instrumentation pass w/o the run-time library.
I do want to contribute run-time as well.
How do we proceed?

Let me describe the whole thing once more.

AddressSanitizer finds
  - Out-of-bounds accesses to heap, stack and globals
  - Use-after-free (heap)
The slowdown is typically within 2x:
http://code.google.com/p/address-sanitizer/wiki/PerformanceNumbers
It also has an experimental feature to detect use-after-return (stack).

The tool currently works on x86 Linux and Mac, both 32- and 64- bits.
The Linux port is absolutely stable (one can browse the web using
instrumented Chromium for hours).
The Mac port is slightly less stable (Chromium tests work, but the browser
hiccups infrequently).

The tool consists of two parts: instrumentation pass (~1KLOC) and run-time
library (~3KLOC).

The instrumentation pass being reviewed in this mail thread only supports
detection of heap bugs.
Other parts (~400 more LOC) will add stack, globals, and blacklisting.

The run-time library replaces malloc/free/etc, intercepts other libc
functions (e.g. pthread_create) and does error reporting.
The library and test code uses google coding conventions.
The integration tests use gtest.

I'd like to have the run-time available as part of LLVM, and I would agree
on any reasonable way of doing so in the following order of preference:
  - use svn:externals to point to the existing repository at
http://code.google.com/p/address-sanitizer/source/checkout
  - copy the code to LLVM repository and periodically take the fresh
versions from c.g.c repository.
  - use the LLVM repository as the primary repository (here I am concerned
with the tool chain: we are too used to the features provided by c.g.c.)
Please suggest what suits LLVM the most.

Thanks,

--kcc

On Mon, Sep 12, 2011 at 11:48 AM, Kostya Serebryany <kcc at google.com> wrote:

> FYI
> I've added noreturn attribute to the __asan_report_error call. It doesn't
> cause compile problems in cpu2006 any more.
> The good news is that the slowdown (compared to assembly with ud2) on
> cpu2006 ranges between 1% and 8%.
> And we may recover from it if __asan_report_error does not cause stack
> frame to be created in leaf functions.
>
> --kcc
>
>
> On Fri, Sep 9, 2011 at 12:48 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>>
>>
>> On Thu, Sep 8, 2011 at 5:38 PM, Eric Christopher <echristo at apple.com>wrote:
>>
>>>
>>> On Sep 8, 2011, at 5:14 PM, Kostya Serebryany wrote:
>>>
>>>
>>>
>>> On Thu, Sep 8, 2011 at 4:43 PM, Eric Christopher <echristo at apple.com>wrote:
>>>
>>>>
>>>> On Sep 8, 2011, at 4:35 PM, Chandler Carruth wrote:
>>>>
>>>> > FYI,
>>>> >
>>>> > I've looked again at this patch recently, and talked to several others
>>>> about it...
>>>> >
>>>> > I think there are some big issues with the implementation, bit I think
>>>> they could probably be addressed. The one that I see the most is the x86
>>>> assembly just being dumped as a blob. I think that really needs to be
>>>> resolved before this can move forward.
>>>> >
>>>> > It seems like this should be do-able in some way with the llvm.trap
>>>> intrinsic. If anything, we might need another intrinsic or an extension of
>>>> llvm.trap that allows feeding the SIGILL signal handler the data arguments
>>>> it needs; I'd need to read up on exactly what is required for that to figure
>>>> out how best to represent that in IR, but it shouldn't be that hard.
>>>> >
>>>> > That said, maybe to improve the discussion of the patch and make
>>>> reviewing it easier, you could just switch to emitting the call to the
>>>> runtime library function in all cases? That should be a strictly simpler
>>>> patch, and then how to improve performance by avoiding the runtime call can
>>>> be a followup.
>>>> >
>>>>
>>>> This encompasses most of my concerns with the patch.
>>>>
>>>> > Maybe others could comment on other implementation issues? I'm sadly
>>>> not an expert at LLVM...
>>>>
>>>> In a glance I didn't see a lot, but the other bits were somewhat hiding
>>>> it. How about when we get a new patch we can take a bit more of a look at
>>>> it.
>>>>
>>>
>>> I dropped the assembly part and updated the patch at
>>> http://codereview.appspot.com/4867059/ (also in attachment).
>>> We can figure out how to make it more efficient later (something
>>> like llvm.trap intrinsic with a parameter).
>>>
>>> This patch is 570 LOC plus few lines of trivial changes and a test.
>>> It implements only basic functionality: detection of out-of-bounds and
>>> use-after-free in heap.
>>>
>>> The complete version of LLVM pass is ~1000 LOC (
>>> http://code.google.com/p/address-sanitizer/source/browse/trunk/llvm/AddressSanitizer.cpp
>>> ).
>>> It also handles out-of-bounds for stack and global objects and (recently
>>> added) stack use-after-return bugs.
>>>
>>>
>>> Some somewhat trivial issues:
>>>
>>> a) Please go ahead and hide the non-user visible flags.
>>>
>> done
>>
>>>
>>> b) Not a big fan of this style of avoid 80-col wrap:
>>>
>>> +Instruction *AddressSanitizer::generateCrashCode(
>>> +    IRBuilder<> &IRB, Value *Addr, int TelltaleValue) {
>>>
>> done
>>
>>>
>>> c) +  // which will be lowered to a cusom assembly.
>>>
>>> s/cusom/custom
>>>
>>> d)
>>>
>>> +  if (TypeSize != 8 && TypeSize != 16 &&
>>> +      TypeSize != 32 && TypeSize != 64 && TypeSize != 128) {
>>> +    // TODO(kcc): do something better.
>>> +    return;
>>> +  }
>>>
>>> What's the reason behind this?
>>>
>>
>> Changed the comment.
>>
>>
>>>
>>> Perhaps not so trivial:
>>>
>>> a) Where are the asan_* functions? Where are they supposed to live?
>>>
>>> I'll probably have more, but it's a start and the discussion on the
>>> library functions probably should happen sooner rather than later.
>>>
>>> -eric
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110915/29dd3eb5/attachment.html>