[LLVMdev] load widening conflicts with AddressSanitizer
Duncan Sands
baldrick at free.fr
Tue Jan 24 09:53:34 PST 2012
Hi Kostya,
> On Tue, Jan 24, 2012 at 1:23 AM, Duncan Sands <baldrick at free.fr> wrote:
>
> Hi Kostya,
>
> > [resurrecting an old mail thread about AddressSanitizer false positive
> > caused by load widening]
> >
> > Once the Attribute::AddressSafety is set by clang (a separate patch), fixing
> > this bug may look as simple as this:
>
> Hi Duncan,
>
> I don't get the point of an attribute. There's plenty of code out there
> that does wide loads like this directly (without them being created by the
> optimizers) since,
>
>
> You mean, the source code that e.g. loads 8 bytes, where up to 7 bytes might be
> out of array bounds?
yes.
> Yes, we've seen quite a bit of such code.
I'm not surprised.
> First, such code often appears in libc, mostly in hand-written assembly (e.g.
> strlen), and valgrind/memcheck has a lot of trouble dealing with it
> (it basically has to intercept all such functions, which does not work when such
> functions are inlined, so valgrind does not properly work with O2-compiled
> binaries).
> asan does not care about it (yet) because it does not instrument libc.
>
> Second, we have also seen such hacks in regular C/C++ code (usually in codecs
> or compression code).
> Strictly speaking, all these cases are bugs according to both the C and the C++
> standard, and asan does not impose more restrictions than the standard.
> Note that hacks like these hurt not only address safety checkers like
> asan/memcheck/drmemory/SAFEcode/etc, but also race detectors like
> tsan/helgrind/drd/etc.
>
> We still have lots of code with these intentional OOB accesses and we want to
> test it.
> In most cases I've met so far, the developers decided to actually fix the bugs
> according to the C++ standard and require the memory allocation to have up to 7
> extra bytes.
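
For concreteness, here is a minimal C sketch of the padding fix described in the quoted text. It is an illustration only, not code from any of the projects mentioned. The names padded_alloc and wide_strlen are hypothetical. The sketch assumes the string always lives in a buffer from padded_alloc, so every 8-byte load stays inside the allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zeroed storage for a string of up to n bytes plus
   its NUL terminator, rounded up to a multiple of 8 bytes. */
char *padded_alloc(size_t n) {
    size_t padded = (n + 1 + 7) & ~(size_t)7;
    return calloc(padded, 1);
}

/* Hypothetical word-at-a-time strlen. Precondition: s points to the start
   of a buffer obtained from padded_alloc. */
size_t wide_strlen(const char *s) {
    assert(s != NULL);
    size_t i = 0;
    for (;;) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);  /* 8-byte load at an offset that is a multiple of 8 */
        /* Nonzero exactly when at least one byte of w is zero. */
        if ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL)
            break;
        i += 8;
    }
    /* The terminator is somewhere in the last word; find it bytewise. */
    while (s[i] != '\0')
        i++;
    return i;
}
```

With the padding in place, the last load may read up to 7 bytes past the terminator but never past the end of the allocation, so a checker has nothing to report. Without the padding, the same load is exactly the kind of access under discussion.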
As far as I can see the C and C++ standards are not relevant. ASAN works on
LLVM IR, not on C or C++. Lots of different languages have LLVM frontends. I
personally turn Ada and Fortran into LLVM IR all the time for example. Clearly
the C standard is not relevant to LLVM IR coming from such languages. What
matters is how LLVM IR is defined. As far as I know this construct is perfectly
valid in LLVM IR.
> I do expect that sometimes this is impossible or undesirable.
> Then the solution would be to use __attribute__((address_safety)) to avoid
> instrumenting the tricky pieces of code.
Unfortunately, many languages offer no way at all of attaching such attributes.
Ciao, Duncan.
>
> --kcc
>
>
> just like the optimizers, they know it is safe and a win.
> The attribute won't help them. It looks like a way of just hiding the real
> problem, which seems to be that address sanitizer is overly strict.
>