[LLVMdev] load widening conflicts with AddressSanitizer

Duncan Sands baldrick at free.fr
Tue Jan 24 09:53:34 PST 2012


Hi Kostya,

> On Tue, Jan 24, 2012 at 1:23 AM, Duncan Sands <baldrick at free.fr> wrote:
>
>     Hi Kostya,
>
>      > [resurrecting an old mail thread about AddressSanitizer false positive
>      > caused by load widening]
>      >
>      > Once the Attribute::AddressSafety is set by clang (a separate patch), fixing
>      > this bug may look as simple as this:
>
> Hi Duncan,
>
>     I don't get the point of an attribute.  There's plenty of code out there
>     that does wide loads like this directly (without them being created by the
>     optimizers) since,
>
>
> You mean, the source code that e.g. loads 8 bytes, where up to 7 bytes might be
> out of array bounds?

yes.

> Yes, we've seen quite a bit of such code.

I'm not surprised.
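
To make the pattern concrete, here is a minimal sketch in C of the kind of
wide load being discussed.  It is not taken from any real libc; wide_strlen
and has_zero_byte are names made up for illustration.  It scans a string 8
bytes at a time, so the load that finds the terminating NUL can cover up to 7
bytes beyond it.  The sketch assumes the caller has padded the allocation by
at least 7 bytes past the NUL, which keeps every load inside the object; drop
the padding and you get exactly the out-of-bounds read that asan reports.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if at least one of the 8 bytes in w is zero (classic bit trick). */
static int has_zero_byte(uint64_t w) {
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical helper: strlen that reads 8 bytes per step.
   ASSUMPTION: the allocation holding s extends at least 7 bytes past the
   terminating NUL, so each 8-byte load below stays in bounds. */
static size_t wide_strlen(const char *s) {
    size_t i = 0;
    for (;;) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);  /* 8-byte load, may cover bytes after the NUL */
        if (has_zero_byte(w))
            break;                    /* the NUL is somewhere in s[i..i+7] */
        i += 8;
    }
    while (s[i] != '\0')              /* locate the NUL inside the final chunk */
        i++;
    return i;
}
```

For example, with `char buf[16] = "hello";` the first load covers buf[0..7],
two bytes more than the six that hold the string and its NUL, and
wide_strlen(buf) returns 5.  Hand-written versions usually use aligned loads
rather than padding so that a load never crosses a page boundary, but the
extra bytes are read all the same.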

> First, such code often appears in libc, mostly in hand-written assembly (e.g.
> strlen), and valgrind/memcheck has a lot of trouble dealing with it
> (it basically has to intercept all such functions, which does not work when such
> functions are inlined, so valgrind does not properly work with O2-compiled
> binaries).
> asan does not care about it (yet) because it does not instrument libc.
>
> Second, we have also seen such hacks in regular C/C++ code (usually in codecs or
> compression code).
> Strictly speaking, all these cases are bugs according to either the C or the C++
> standard, and asan does not impose more restrictions than the standard.
> Note that hacks like these hurt not only address safety checkers like
> asan/memcheck/drmemory/SAFEcode/etc, but also race detectors like
> tsan/helgrind/drd/etc.
>
> We still have lots of code with these intentional OOB accesses and we want to
> test it.
> In most cases I've met so far, the developers decided to actually fix the bugs
> according to the C++ standard and require the memory allocation to have up to 7
> extra bytes.

As far as I can see the C and C++ standards are not relevant.  ASAN works on
LLVM IR, not on C or C++.  Lots of different languages have LLVM frontends.  I
personally turn Ada and Fortran into LLVM IR all the time for example.  Clearly
the C standard is not relevant to LLVM IR coming from such languages.  What
matters is how LLVM IR is defined.  As far as I know this construct is perfectly
valid in LLVM IR.

> I do expect that sometimes this is impossible or undesirable.
> Then the solution would be to use __attribute__((address_safety)) to avoid
> instrumenting the tricky pieces of code.

Unfortunately there is in general no way of attaching such attributes in many
languages.

Ciao, Duncan.

>
> --kcc
>
>
>     just like the optimizers, they know it is safe and a win.
>     The attribute won't help them.  It looks like a way of just hiding the real
>     problem, which seems to be that address sanitizer is overly strict.
>



