[LLVMbugs] [Bug 7058] New: non-virtual thunk functions can mangle objects passed by value

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Wed May 5 12:35:32 PDT 2010


           Summary: non-virtual thunk functions can mangle objects passed
                    by value
           Product: clang
           Version: trunk
          Platform: PC
        OS/Version: FreeBSD
            Status: NEW
          Severity: normal
          Priority: P
         Component: C++
        AssignedTo: unassignedclangbugs at nondot.org
        ReportedBy: dimitry at andric.com
                CC: llvmbugs at cs.uiuc.edu, dgregor at apple.com

Created an attachment (id=4831)
 --> (http://llvm.org/bugs/attachment.cgi?id=4831)
Testcase showing a non-virtual thunk mangling an object passed by value

While trying out clang self-hosting on FreeBSD/i386, I encountered a number of
segfaults.  Apparently this was only the case for i386, since a number of
people have successfully self-hosted clang on amd64.

Then, when I built an instance of the llvm/clang trunk (r103083) using the
system gcc (v4.2.1), and used that instance to build llvm/clang itself, many
tests in the llvm and clang test suite failed with segfaults too.

I concentrated on one of the particular tests that failed with a segfault:

  ./Debug/bin/opt -gvn <test/Analysis/BasicAA/gcsetest.ll >foo

I found that one of the CallSite objects passed by value was screwed up.  This
happens in one of the llvm::AliasAnalysis::getModRefInfo instances:

include/llvm/Analysis/AliasAnalysis.h, around line 258:

   virtual ModRefResult getModRefInfo(CallSite CS, Value *P, unsigned Size); // [0]

   ModRefResult getModRefInfo(CallInst *C, Value *P, unsigned Size) {        // [1]
     return getModRefInfo(CallSite(C), P, Size);                             // [2]

which gets called by:

   ModRefResult getModRefInfo(Instruction *I, Value *P, unsigned Size) {     // [3]
     switch (I->getOpcode()) {
     case Instruction::VAArg:  return getModRefInfo((VAArgInst*)I, P, Size);
     case Instruction::Load:   return getModRefInfo((LoadInst*)I, P, Size);
     case Instruction::Store:  return getModRefInfo((StoreInst*)I, P, Size);
     case Instruction::Call:   return getModRefInfo((CallInst*)I, P, Size);  // [4]
     case Instruction::Invoke: return getModRefInfo((InvokeInst*)I, P, Size);
     default:                  return NoModRef;

which gets called by MemoryDependenceAnalysis::getPointerDependencyFrom:

MemDepResult MemoryDependenceAnalysis::
getPointerDependencyFrom(Value *MemPtr, uint64_t MemSize, bool isLoad,
                          BasicBlock::iterator ScanIt, BasicBlock *BB) {
     // See if this instruction (e.g. a call or vaarg) mod/ref's the pointer.
     switch (AA->getModRefInfo(Inst, MemPtr, MemSize)) {

Now when AA->getModRefInfo is called, Inst apparently points to an Instruction
object that has an Instruction::Call opcode, so the getModRefInfo function at
[3] casts it to a CallInst pointer at [4], and calls the specialized
getModRefInfo at [1].

So far so good.  However, at [2] four different things happen:
1. A temp object CallSite(C) is constructed.
2. It is passed by value (i.e. copied) to the 'virtual' getModRefInfo member
   function at [0].
3. The getModRefInfo member at [0] is *not* directly called, but a non-virtual
   thunk is used, since the AA variable points to a BasicAliasAnalysis object,
   which inherits from multiple classes.
4. The thunk function calls the intended getModRefInfo at [0].
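
As a rough illustration of the four steps above, here is a minimal sketch.  It
is not the attached testcase, and every name in it (Site, PassLike, Analysis,
Basic, siteSurvivesThunk) is a hypothetical stand-in rather than a real LLVM
class.  It assumes an Itanium-style C++ ABI, where a virtual call through a
non-primary base is routed via a non-virtual thunk; an optimizing compiler may
devirtualize the call and skip the thunk entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CallSite: one word, passed by value.
struct Site {
  std::uintptr_t value;
  explicit Site(const void *p) : value(reinterpret_cast<std::uintptr_t>(p)) {}
};

// First base class; its size gives the second base a non-zero offset.
struct PassLike {
  virtual ~PassLike() {}
  long padding[3];
};

// Second base class; declares the virtual function taking Site by value.
struct Analysis {
  virtual ~Analysis() {}
  virtual std::uintptr_t query(Site s) = 0;          // plays the role of [0]
  std::uintptr_t queryInst(const int *inst) {        // plays the role of [1]
    return query(Site(inst));                        // plays the role of [2]
  }
};

// Inherits from multiple classes, so the Analysis vtable slot for query()
// is expected to hold a thunk that adjusts 'this' before jumping here.
struct Basic : PassLike, Analysis {
  virtual std::uintptr_t query(Site s) { return s.value; }
};

// Returns true when the by-value argument arrives unchanged.
bool siteSurvivesThunk() {
  static int inst = 0;
  Basic basic;
  Analysis *aa = &basic;   // pointer to the second base subobject
  return aa->queryInst(&inst) == reinterpret_cast<std::uintptr_t>(&inst);
}
```

With a correct compiler siteSurvivesThunk() returns true.  Whether this
particular sketch triggers the bug described here has not been checked.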

Now, in that step 4, the thunk function screws something up with the CallSite
object.  The non-virtual thunk is visible in the backtrace in gdb (the
assert is triggered because I added one to getInstruction, to check for bogus
#0  0x28cd1927 in kill () from /lib/libc.so.7
#1  0x28cd1886 in raise () from /lib/libc.so.7
#2  0x28cd041a in abort () from /lib/libc.so.7
#3  0x28cb8826 in __assert () from /lib/libc.so.7
#4  0x083e4c37 in llvm::CallSiteBase<llvm::Function, llvm::Value, llvm::User,
llvm::Instruction, llvm::CallInst, llvm::InvokeInst,
llvm::Use*>::getInstruction (this=0xbfbfda68) at CallSite.h:87
#5  0x086f1a17 in (anonymous namespace)::BasicAliasAnalysis::getModRefInfo
(this=0x29104080, P=0x2900948c, Size=4294967295, CS={<llvm::CallSiteBase> = {I
= {Value = 4393129}}, <No data fields>}) at AnalysisWrappers.cpp:166
#6  0x086f2423 in non-virtual thunk to (anonymous
namespace)::BasicAliasAnalysis::getModRefInfo(llvm::CallSite, llvm::Value*,
unsigned int) () at AnalysisWrappers.cpp:166
#7  0x0856d192 in llvm::AliasAnalysis::getModRefInfo (this=0x29104090,
C=0x290094cc, P=0x2900948c, Size=4294967295) at AnalysisWrappers.cpp:166
#8  0x086538ee in llvm::AliasAnalysis::getModRefInfo (this=0x29104090,
I=0x290094cc, P=0x2900948c, Size=4294967295) at AnalysisWrappers.cpp:166
#9  0x0874ccf9 in llvm::MemoryDependenceAnalysis::getPointerDependencyFrom
(this=0x29102080, MemPtr=0x2900948c, MemSize=4294967295, isLoad=true,
BB=0x2900c370, ScanIt={<std::iterator> = {<No data fields>}, NodePtr =
0x290094cc}) at AnalysisWrappers.cpp:166

At #5, you can see the CallSite object, which is represented by CS={}, and the
Value turns out to be 4393129 (0x4308a9).  However, this Value is supposed to
be a pointer, and this value is way too low.

When you go up in the stack to #6, and examine the Value, it turns out to be
0x290094ce instead.  In the non-virtual thunk function, it looks like the code
generator inserts an unwanted indirection at [5]:

        .align  16, 0x90
_ZThn16_N12_GLOBAL__N_118BasicAliasAnalysis13getModRefInfoEN4llvm8CallSiteEPNS1_5ValueEj,@function
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        subl    $36, %esp
        call    .L51$pb
        popl    %eax
        addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp428-.L51$pb), %eax
        movl    8(%ebp), %ecx
        movl    12(%ebp), %edx
        movl    16(%ebp), %esi
        movl    20(%ebp), %edi
        movl    %ecx, -20(%ebp)
        movl    %esi, -24(%ebp)
        movl    %edi, -28(%ebp)
        movl    -20(%ebp), %ecx
        addl    $-16, %ecx
        leal    -32(%ebp), %esi
        movl    (%edx), %edx                            // [5]
        movl    %edx, (%esi)
        movl    -24(%ebp), %edx
        movl    -28(%ebp), %esi
        movl    %esp, %edi
        movl    %ecx, (%edi)
        movl    -32(%ebp), %ecx
        movl    %ecx, 4(%edi)
        movl    %edx, 8(%edi)
        movl    %esi, 12(%edi)
        movl    %eax, %ebx
        movl    %eax, -16(%ebp)
        movl    -16(%ebp), %eax
        addl    $36, %esp
        popl    %esi
        popl    %edi
        popl    %ebx
        popl    %ebp
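
For context on the "addl $-16, %ecx" in the listing: the "Thn16" in the thunk's
mangled name encodes a 'this' adjustment of -16, and the backtrace agrees, with
this=0x29104090 in frame #7 (the AliasAnalysis view) and this=0x29104080 in
frame #5 (the BasicAliasAnalysis view).  The sketch below shows the same kind
of offset with hypothetical classes (First, Second, Derived are not LLVM
names); the exact offset depends on the ABI and pointer size, so only its sign
is relied upon.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes; the layout is chosen only to push Second away from
// the start of Derived.
struct First  { virtual ~First() {}  long pad[3]; };
struct Second { virtual ~Second() {} virtual int f() { return 0; } };
struct Derived : First, Second { virtual int f() { return 1; } };

// Distance in bytes from the start of a Derived object to its Second
// subobject.  A thunk for Derived::f subtracts this from 'this'.
std::ptrdiff_t secondBaseOffset() {
  Derived d;
  Second *asSecond = &d;   // implicit conversion adds the offset
  return reinterpret_cast<char *>(asSecond) - reinterpret_cast<char *>(&d);
}

// The two 'this' values copied from frames #7 and #5 of the backtrace.
const unsigned long kBaseThis    = 0x29104090UL;
const unsigned long kDerivedThis = 0x29104080UL;

unsigned long backtraceAdjustment() { return kBaseThis - kDerivedThis; }
```

backtraceAdjustment() gives 16, matching the thunk.  So the 'this' adjustment
itself looks right, and the problem is confined to how the by-value argument
is forwarded.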

If I tell gdb to break at [5], and jump to the next instruction, i.e. without
performing the indirection, the program completes successfully, and produces
the expected output file.
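
The effect of the extra load can be modelled in plain C++.  This is only a
simulation under one reading of the listing: 12(%ebp) holds the single word of
the by-value argument, and [5] then treats that word as an address and loads
through it.  Site, forwardCorrectly, forwardWithExtraLoad and demo are
hypothetical names, and the constant is borrowed from frame #5 purely for
illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical one-word argument that is passed by value.
struct Site { std::uintptr_t value; };

// What a thunk should do: forward the argument's bytes unchanged.
Site forwardCorrectly(const Site *incoming) {
  Site copy;
  std::memcpy(&copy, incoming, sizeof copy);
  return copy;
}

// Model of [5]: the argument's contents are used as an address, and the
// word stored at that address is forwarded instead.
Site forwardWithExtraLoad(const Site *incoming) {
  Site copy;
  std::memcpy(&copy, reinterpret_cast<const void *>(incoming->value),
              sizeof copy);
  return copy;
}

// Returns true when the correct path preserves the value and the faulty
// path yields whatever word the value happened to point at.
bool demo() {
  std::uintptr_t target = 0x4308a9;   // some word that lives in memory
  Site s;
  s.value = reinterpret_cast<std::uintptr_t>(&target);
  Site good = forwardCorrectly(&s);
  Site bad  = forwardWithExtraLoad(&s);
  return good.value == s.value && bad.value == 0x4308a9;
}
```

In this model the callee receives the contents of the pointed-to memory rather
than the pointer itself, which would be consistent with the far-too-low Value
seen in frame #5.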

The non-virtual thunk to BasicAliasAnalysis::getModRefInfo is generated by
compiling lib/Analysis/BasicAliasAnalysis.cpp, so I have attempted to reduce
this file to a testcase that is as small as possible.  It is attached as

To reproduce the crash, simply compile the testcase with clang++ and run it, on
either FreeBSD or Linux on i386.  It should crash both with and without

The same code compiles and runs fine with g++, and also with clang on amd64 (I
only tested that on FreeBSD, but I bet it will apply to Linux amd64 too.)
