[llvm-commits] AddressSanitizer, first patch

Tue Nov 15 16:45:37 PST 2011

On Tue, Nov 15, 2011 at 4:33 PM, Chris Lattner <clattner at apple.com> wrote:

>
> On Nov 15, 2011, at 4:31 PM, Kostya Serebryany wrote:
>
> Hello,
>
> Here is the fresh AddressSanitizer patch for your consideration.
> This is LLVM-only patch.
> If/when it is submitted, clang changes and the run-time library will
> follow.
> http://codereview.appspot.com/5393042/
>
>
> Hi Kostya,
>
> This can certainly be improved, but I think it is fine to check it into
> mainline.  Please do!
>

Thank you. What shall I do to get the LVVM commit access?

--kcc

>
> -Chris
>
>
> Thanks,
>
> --kcc
>
> On Thu, Sep 15, 2011 at 1:31 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>> Hi,
>>
>> I haven't heard any more comments.
>> Eric Christopher confirmed that the only concern is having the
>> instrumentation pass w/o the run-time library.
>> I do want to contribute run-time as well.
>> How do we proceed?
>>
>> Let me describe the whole thing once more.
>>
>> AddressSanitizer finds
>>   - Out-of-bounds accesses to heap, stack and globals
>>   - Use-after-free (heap)
>> The slowdown is typically within 2x:
>> http://code.google.com/p/address-sanitizer/wiki/PerformanceNumbers
>> It also has an experimental feature to detect use-after-return (stack).
>>
>> The tool currently works on x86 Linux and Mac, both 32- and 64- bits.
>> The Linux port is absolutely stable (one can browse the web using
>> instrumented Chromium for hours).
>> The Mac port is slightly less stable (Chromium tests work, but the
>> browser hiccups infrequently).
>>
>> The tool consists of two parts: instrumentation pass (~1KLOC) and
>> run-time library (~3KLOC).
>>
>> The instrumentation pass being reviewed in this mail thread only supports
>> detection of heap bugs.
>> Other parts (~400 more LOC) will add stack, globals, and blacklisting.
>>
>> The run-time library replaces malloc/free/etc, intercepts other libc
>> functions (e.g. pthread_create) and does error reporting.
>> The library and test code uses google coding conventions.
>> The integration tests use gtest.
>>
>> I'd like to have the run-time available as part of LLVM, and I would
>> agree on any reasonable way of doing so in the following order of
>> preference:
>>   - use svn:externals to point to the existing repository at
>> http://code.google.com/p/address-sanitizer/source/checkout
>>   - copy the code to LLVM repository and periodically take the fresh
>> versions from c.g.c repository.
>>   - use the LLVM repository as the primary repository (here I am
>> concerned with the tool chain: we are too used to the features provided
>> by c.g.c.)
>> Please suggest what suits LLVM the most.
>>
>> Thanks,
>>
>> --kcc
>>
>>
>> On Mon, Sep 12, 2011 at 11:48 AM, Kostya Serebryany <kcc at google.com>wrote:
>>
>>> FYI
>>> I've added noreturn attribute to the __asan_report_error call. It
>>> doesn't cause compile problems in cpu2006 any more.
>>> The good news is that the slowdown (compared to assembly with ud2) on
>>> cpu2006 ranges between 1% and 8%.
>>> And we may recover from it if __asan_report_error does not cause stack
>>> frame to be created in leaf functions.
>>>
>>> --kcc
>>>
>>>
>>> On Fri, Sep 9, 2011 at 12:48 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>
>>>>
>>>>
>>>> On Thu, Sep 8, 2011 at 5:38 PM, Eric Christopher <echristo at apple.com>wrote:
>>>>
>>>>>
>>>>> On Sep 8, 2011, at 5:14 PM, Kostya Serebryany wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 8, 2011 at 4:43 PM, Eric Christopher <echristo at apple.com>wrote:
>>>>>
>>>>>>
>>>>>> On Sep 8, 2011, at 4:35 PM, Chandler Carruth wrote:
>>>>>>
>>>>>> > FYI,
>>>>>> >
>>>>>> > I've looked again at this patch recently, and talked to several
>>>>>> others about it...
>>>>>> >
>>>>>> > I think there are some big issues with the implementation, bit I
>>>>>> think they could probably be addressed. The one that I see the most is the
>>>>>> x86 assembly just being dumped as a blob. I think that really needs to be
>>>>>> resolved before this can move forward.
>>>>>> >
>>>>>> > It seems like this should be do-able in some way with the llvm.trap
>>>>>> intrinsic. If anything, we might need another intrinsic or an extension of
>>>>>> llvm.trap that allows feeding the SIGILL signal handler the data arguments
>>>>>> it needs; I'd need to read up on exactly what is required for that to
>>>>>> figure out how best to represent that in IR, but it shouldn't be that hard.
>>>>>> >
>>>>>> > That said, maybe to improve the discussion of the patch and make
>>>>>> reviewing it easier, you could just switch to emitting the call to the
>>>>>> runtime library function in all cases? That should be a strictly simpler
>>>>>> patch, and then how to improve performance by avoiding the runtime call can
>>>>>> be a followup.
>>>>>> >
>>>>>>
>>>>>> This encompasses most of my concerns with the patch.
>>>>>>
>>>>>> > Maybe others could comment on other implementation issues? I'm
>>>>>> sadly not an expert at LLVM...
>>>>>>
>>>>>> In a glance I didn't see a lot, but the other bits were somewhat
>>>>>> hiding it. How about when we get a new patch we can take a bit more of a
>>>>>> look at it.
>>>>>>
>>>>>
>>>>> I dropped the assembly part and updated the patch at
>>>>> http://codereview.appspot.com/4867059/ (also in attachment).
>>>>> We can figure out how to make it more efficient later (something
>>>>> like llvm.trap intrinsic with a parameter).
>>>>>
>>>>> This patch is 570 LOC plus few lines of trivial changes and a test.
>>>>> It implements only basic functionality: detection of out-of-bounds and
>>>>> use-after-free in heap.
>>>>>
>>>>> The complete version of LLVM pass is ~1000 LOC (
>>>>> http://code.google.com/p/address-sanitizer/source/browse/trunk/llvm/AddressSanitizer.cpp
>>>>> ).
>>>>> It also handles out-of-bounds for stack and global objects and
>>>>> (recently added) stack use-after-return bugs.
>>>>>
>>>>>
>>>>> Some somewhat trivial issues:
>>>>>
>>>>> a) Please go ahead and hide the non-user visible flags.
>>>>>
>>>> done
>>>>
>>>>>
>>>>> b) Not a big fan of this style of avoid 80-col wrap:
>>>>>
>>>>> +Instruction *AddressSanitizer::generateCrashCode(
>>>>> +    IRBuilder<> &IRB, Value *Addr, int TelltaleValue) {
>>>>>
>>>> done
>>>>
>>>>>
>>>>> c) +  // which will be lowered to a cusom assembly.
>>>>>
>>>>> s/cusom/custom
>>>>>
>>>>> d)
>>>>>
>>>>> +  if (TypeSize != 8 && TypeSize != 16 &&
>>>>> +      TypeSize != 32 && TypeSize != 64 && TypeSize != 128) {
>>>>> +    // TODO(kcc): do something better.
>>>>> +    return;
>>>>> +  }
>>>>>
>>>>> What's the reason behind this?
>>>>>
>>>>
>>>> Changed the comment.
>>>>
>>>>
>>>>>
>>>>> Perhaps not so trivial:
>>>>>
>>>>> a) Where are the asan_* functions? Where are they supposed to live?
>>>>>
>>>>> I'll probably have more, but it's a start and the discussion on the
>>>>> library functions probably should happen sooner rather than later.
>>>>>
>>>>> -eric
>>>>>
>>>>
>>>>
>>>
>>
> <issue5393042_3001.diff>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111115/f3c56f18/attachment.html>