[LLVMdev] load widening conflicts with AddressSanitizer

Kostya Serebryany kcc at google.com
Tue Jan 24 10:06:02 PST 2012


On Tue, Jan 24, 2012 at 9:53 AM, Duncan Sands <baldrick at free.fr> wrote:

> Hi Kostya,
>
>  On Tue, Jan 24, 2012 at 1:23 AM, Duncan Sands <baldrick at free.fr
>> <mailto:baldrick at free.fr>> wrote:
>>
>>    Hi Kostya,
>>
>>     > [resurrecting an old mail thread about AddressSanitizer false
>> positive
>>    caused by
>>     > load widening]
>>     >
>>     > Once the Attribute::AddressSafety is set by clang (a separate
>> patch), fixing
>>     > this bug may look as simple as this:
>>
>> Hi Duncan,
>>
>>    I don't get the point of an attribute.  There's plenty of code out
>> there
>>    that does wide loads like this directly (without them being created by
>> the
>>    optimizers) since,
>>
>>
>> You mean, the source code that e.g. loads 8 bytes, where up to 7 bytes
>> might be
>> out of array bounds?
>>
>
> yes.
>
>
>  Yes, we've seen quite a bit of such code.
>>
>
> I'm not surprised.
>
>
>  First, such code often appears in libc, mostly in hand-written assembly
>> (e.g.
>> strlen), and valgrind/memcheck has a lot of trouble dealing with it
>> (it basically has to intercept all such functions, which does not work
>> when such
>> functions are inlined, so valgrind does not properly work with O2-compiled
>> binaries).
>> asan does not care about it (yet) because it does not instrument libc.
>>
>> Second, we also seen such hacks in regular C/C++ code (usually, in codecs
>> or
>> compression code).
>> Strictly speaking -- all these cases are bugs, according to either C or
>> C++
>> standard, and asan does not impose more restrictions than the standard.
>> Note, that the hacks like these hurt not only address safety checkers like
>> asan/memcheck/drmemory/**SAFEcode/etc, but also race detectors like
>> tsan/helgrind/drd/etc.
>>
>> We still have lots of code with these intentional OOB accesses and we
>> want to
>> test it.
>> In most cases I've met so far, the developers decided to actually fix the
>> bugs
>> according to the C++ standard and require the memory allocation to have
>> up to 7
>> extra bytes.
>>
>
> As far as I can see the C and C++ standards are not relevant.  ASAN works
> on
> LLVM IR, not on C or C++.  Lots of different languages have LLVM
> frontends.  I
> personally turn Ada and Fortran into LLVM IR all the time for example.
>  Clearly
> the C standard is not relevant to LLVM IR coming from such languages.  What
> matters is how LLVM IR is defined.  As far as I know this construct is
> perfectly
> valid in LLVM IR.


Asan will not work for Fortran and Ada anyway (at least, out of the box).
I am not even sure that anything like asan is needed for Ada (it has bounds
checking built-in, the dynamic memory allocation is much more restrictive).
The tool is rather specific to C/C++ (and ObjectiveC probably, although we
have almost no tests for ObjectiveC, nor much knowledge in it).
Yes, the IR transformations are done on the LLVM level, but the asan
run-time library heavily depends on the C/C++ semantics and even
implementation,
and you can't really separate the asan instrumentation pass from the
run-time.

--kcc





>
>
>  I do expect that sometimes this is impossible or undesirable.
>> Then the solution would be to use __attribute__((address_safety)**) to
>> avoid
>> instrumenting the tricky pieces of code.
>>
>
> Unfortunately there is in general no way of attaching such attributes in
> many
> languages.
>
> Ciao, Duncan.
>
>
>
>> --kcc
>>
>>
>>    just like the optimizers, they know it is safe and a win.
>>    The attribute won't help them.  It looks like a way of just hiding the
>> real
>>    problem, which seems to be that address sanitizer is overly strict.
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120124/f1ecdd47/attachment.html>


More information about the llvm-dev mailing list