[LLVMbugs] [Bug 14048] New: Clang emits loads of an alloca wider than the alloca for ARM

Tue Oct 9 23:16:20 PDT 2012

http://llvm.org/bugs/show_bug.cgi?id=14048

             Bug #: 14048
           Summary: Clang emits loads of an alloca wider than the alloca
                    for ARM
           Product: clang
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: LLVM Codegen
        AssignedTo: unassignedclangbugs at nondot.org
        ReportedBy: chandlerc at gmail.com
                CC: llvmbugs at cs.uiuc.edu, mgottesman at apple.com
    Classification: Unclassified

Started tracking this down with much help from Michael as a miscompile with the
new SROA. It looks like the input IR that Clang emits is pretty bogus. This
appears to be somewhat ARM ABI specific. Details follow:

% cat bigfib.reduced.ii
struct S { char c; };
extern void f(S);
void g(const S& s) { f(s); }

% ./bin/clang -cc1 -triple armv7-apple-ios4.0.0 -emit-llvm -w bigfib.reduced.ii -o -
; ModuleID = 'bigfib.reduced.ii'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:64:128-a0:0:64-n32-S64"
target triple = "armv7-apple-ios4.0.0"

%struct.S = type { i8 }

define arm_aapcscc void @_Z1gRK1S(%struct.S* %s) nounwind {
entry:
  %s.addr = alloca %struct.S*, align 4
  %agg.tmp = alloca %struct.S, align 1
  store %struct.S* %s, %struct.S** %s.addr, align 4
  %0 = load %struct.S** %s.addr, align 4
  %1 = bitcast %struct.S* %agg.tmp to i8*
  %2 = bitcast %struct.S* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %1, i8* %2, i32 1, i32 1, i1 false)
  %3 = bitcast %struct.S* %agg.tmp to { [1 x i32] }*
  %4 = getelementptr { [1 x i32] }* %3, i32 0, i32 0
  %5 = load [1 x i32]* %4, align 1
  call arm_aapcscc void @_Z1f1S([1 x i32] %5)
  ret void
}

declare arm_aapcscc void @_Z1f1S([1 x i32])

declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind

The problem here is %5, which loads a 4-byte value from the 1-byte alloca
%agg.tmp. We get into this mess starting with %3, which appears to be an attempt
to match the load size to the argument type (for some reason this is passed as
a [1 x i32]?? I'm assuming ABI weirdness here...). That doesn't really fly when
the load is from an alloca that didn't even allocate that much stack. Note that
the alloca isn't even *aligned* on a 4-byte boundary here.
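
To make the size mismatch concrete, here is a minimal C++ sketch. The Coerced
struct and the bytes_past_end() helper are hypothetical names for illustration
only; Coerced merely stands in for the { [1 x i32] } type that %3 casts to, and
the sketch assumes the usual 8-bit char.

```cpp
#include <cstddef>
#include <cstdint>

// Same struct as in bigfib.reduced.ii: one byte of storage, byte alignment.
struct S { char c; };

// Hypothetical stand-in for the coerced type { [1 x i32] } seen in the IR.
struct Coerced { std::uint32_t words[1]; };

// Number of bytes that a Coerced-sized load reads beyond the end of an S.
constexpr std::size_t bytes_past_end() {
    return sizeof(Coerced) - sizeof(S);
}

static_assert(sizeof(S) == 1, "S occupies a single byte");
static_assert(alignof(S) == 1, "S is only byte-aligned");
```

With these definitions the load in %5 covers three bytes that %agg.tmp never
allocated.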

In any event, such out-of-bounds loads both trigger ASan failures and cause
SROA to drop the load on the floor as meaningless. While we could teach SROA to
somehow cope with such loads (I'm not actually sure how... it's really weird),
that seems to be the wrong way to fix this. The right way seems to be to load
the i8, zext and insertvalue it into a [1 x i32], and then pass that.
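
As a rough illustration of that load / zext / insertvalue sequence, here is a
C++ sketch of the same data flow. It is not Clang's code and not a proposed
patch; widen_for_call is a hypothetical helper name, and std::array is used
only as a stand-in for the [1 x i32] aggregate.

```cpp
#include <array>
#include <cstdint>

struct S { char c; };

// Hypothetical helper: build the [1 x i32]-shaped argument while touching
// only the single byte that S actually owns.
inline std::array<std::uint32_t, 1> widen_for_call(const S& s) {
    // Load exactly one byte (corresponds to the i8 load).
    const std::uint8_t byte = static_cast<std::uint8_t>(s.c);
    // Zero-extend to 32 bits (the zext) and store it in element 0
    // (the insertvalue into the one-element aggregate).
    std::array<std::uint32_t, 1> out{};
    out[0] = static_cast<std::uint32_t>(byte);
    return out;
}
```

The point of the sketch is that nothing beyond the one byte of S is ever read;
the upper 24 bits of the result are zeros produced by the extension rather
than whatever happens to sit next to the alloca on the stack.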
