[LLVMdev] load widening conflicts with AddressSanitizer

Fri Dec 16 12:39:52 PST 2011

On Fri, Dec 16, 2011 at 12:37 PM, Eli Friedman <eli.friedman at gmail.com>wrote:

> On Fri, Dec 16, 2011 at 12:19 PM, John Criswell <criswell at illinois.edu>
> wrote:
> > On 12/16/11 12:24 PM, Kostya Serebryany wrote:
> >
> > Hello,
> >
> > We've just got a bug report from Mozilla folks about AddressSanitizer
> false
> > positive with -O2.
> > Turns out there is a conflict between load widening and AddressSanitizer.
> >
> > Simple reproducer:
> >
> > % cat load_widening.c && echo ========= && clang  -O2  -c
>  load_widening.c
> > -flto && llvm-dis load_widening.o && cat load_widening.o.ll
> > void init(char *);
> > int foo() {
> >   char a[22];
> >   init(a);
> >   return a[16]  a[21];
> > }
> > =========
> > ; ModuleID = 'load_widening.o'
> > target datalayout =
> >
> "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
> > target triple = "x86_64-unknown-linux-gnu"
> >
> > define i32 @foo() nounwind uwtable {
> > entry:
> >   %a = alloca [22 x i8], align 16
> >   %arraydecay = getelementptr inbounds [22 x i8]* %a, i64 0, i64 0
> >   call void @init(i8* %arraydecay) nounwind
> >   %arrayidx = getelementptr inbounds [22 x i8]* %a, i64 0, i64 16
> >   %0 = bitcast i8* %arrayidx to i64*
> >   %1 = load i64* %0, align 16
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> >   %2 = trunc i64 %1 to i32
> >   %sext = shl i32 %2, 24
> >   %conv = ashr exact i32 %sext, 24
> >   %3 = lshr i64 %1, 16
> >   %.tr = trunc i64 %3 to i32
> >   %sext3 = ashr i32 %.tr, 24
> >   %add = add nsw i32 %sext3, %conv
> >   ret i32 %add
> > }
> >
> >
> > Here, the load widening replaces two 1-byte loads with one 8-byte load
> which
> > partially goes out of bounds.
> > Since the array is 16-byte aligned, this transformation should never
> cause
> > problems in regular compilation,
> > but it causes AddressSanitizer false positives because the generated load
> > *is* in fact out of bounds.
> >
> >
> > SAFECode would have the same problem on this code as it now checks for
> loads
> > and stores that "fall off" the beginning or end of a memory object.
> >
> >
> >
> > Do we consider the above transformation legal?
> >
> >
> > I would argue that it should not be legal.  We don't actually know what
> > comes after the 22 byte object.  Is it another memory object?  A
> > memory-mapped I/O device?  Unmapped memory?  Padded junk space?  Reading
> > memory-mapped I/O could have nasty side effects, and accessing unmapped
> > memory could cause the program to fault even though it was written
> correctly
> > as the source-language level.
> >
> > While some may consider these sorts of scenarios to be unlikely, consider
> > the possibility that the alloca is transformed into a global variable or
> > heap allocation.  That would be a legitimate transform and makes the
> above
> > scenarios more likely.
>
> Having the load hit unmapped memory is impossible on common
> architectures given the alignment we're talking about here.  And if
> memory-mapped IO comes after the memory object, the object itself also
> has some sort of unusual semantics, so it should be using volatile
> loads anyway.
>

Would would be the right way to disable load widening when AddressSanitizer
(or SAFECode) is enabled?

--kcc

>
> That said, LLVM isn't actually keeping track of the "page size" (or
> equivalent), so the optimizers can't actually prove this will happen.
>
> -Eli
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111216/e348fa8c/attachment.html>