[LLVMdev] Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?

Andrew MacPherson andrew.macp at gmail.com
Wed Oct 23 00:32:47 PDT 2013


Hi Yaron,

If you're outputting ELF on Windows this sounds like an issue we ran into
where __chkstk calls weren't being output in the assembly due to an
explicit check for COFF output. Once stack allocations in a given function
exceeded some amount we'd get exactly this kind of crash in the function
initialization.

If you search for isTargetCOFF() in lib/Target/X86/X86ISelLowering.cpp
and lib/Target/X86/X86FrameLowering.cpp, you should be able to remove that
check to force __chkstk output and see if that helps.
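
As a rough illustration of why a missing __chkstk call matters: Windows grows a thread's stack one guard page at a time, so a prologue that reserves more than a page of locals is expected to probe the stack before touching it. The sketch below only models that decision. The 4096-byte threshold and the names needsStackProbe and kAssumedProbeThreshold are assumptions made for illustration, not LLVM's actual code.

```cpp
#include <cstdint>

// Assumption: one 4096-byte page is the size above which a probe is needed.
constexpr std::uint64_t kAssumedProbeThreshold = 4096;

// Hypothetical helper: reports whether a function that reserves FrameBytes
// of stack in its prologue should call a stack probe such as __chkstk.
constexpr bool needsStackProbe(std::uint64_t FrameBytes,
                               std::uint64_t Threshold = kAssumedProbeThreshold) {
  return FrameBytes >= Threshold;
}

// A small frame stays within the already committed page: no probe.
static_assert(!needsStackProbe(512), "small frame should not need a probe");
// A 64K frame skips past the guard page unless it is probed first.
static_assert(needsStackProbe(64 * 1024), "large frame should need a probe");
```

If the probe is skipped for the second kind of frame, the first write to the new frame can land beyond the guard page, which would show up as an access violation early in the function.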

Cheers,
Andrew


On Wed, Oct 23, 2013 at 1:22 AM, Yaron Keren <yaron.keren at gmail.com> wrote:

> Yes, this is correct: a good code address accessing a bad data address.
>
> However, there is no other relocation before .text or near it. I'll send
> you the full debug printout, maybe you'll note something.
>
> The problem could be the result of something other than the linker
> entirely, such as some library initialization code that by chance worked
> with smaller code but fails now.
>
> I need to debug and see what's going on. The trouble is that there is no
> debug information. Maybe I can do without the source code information and
> debug the assembly, but without any symbols it's really a challenge to
> understand anything. I did try to make MCJIT emit debug info but for some
> reason the attached gdb did not understand it. Maybe this could be solved.
>
> I assumed there may be some limitations around 31-32 bits as there are
> various int32 members in the ELF structure, but that limit is far, far
> away. Problems start at a .text size of about 150K.
>
> Yaron
>
>
>
>
> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>
>> So it looks like 0x0A3600D1 is a good code address and there’s no
>> problem executing the code there, but 0x00BC7680 is a bad data address.  Is
>> that correct?
>>
>> If so, this is almost certainly a relocation problem.  You just need to
>> find a relocation that writes an entry (probably a relative offset) at
>> 0x0A3600D1 + the size of the instruction at that address.
>>
>> BTW, what I said before about not being aware of any size limitations
>> wasn’t quite correct.  If you have enough code and data that we end up
>> putting sections at addresses that are more than 2GB apart we’ll have
>> problems, but you should see an assertion in that case.  That can happen if
>> we weren’t able to get the address we requested from allocateMappedMemory,
>> but it doesn’t look like that’s what’s happening here.
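
The 2GB limit follows from 32-bit PC-relative relocations, whose stored value is computed as S + A - P (symbol address, addend, and the address being patched) and must fit in a signed 32-bit field. A minimal sketch with made-up addresses follows; pcRelativeValue and fitsInInt32 are hypothetical helpers, not RuntimeDyld functions.

```cpp
#include <cstdint>

// Value stored by a PC-relative relocation: S + A - P.
// S = symbol address, A = addend, P = address of the field being patched.
constexpr std::int64_t pcRelativeValue(std::uint64_t S, std::int64_t A,
                                       std::uint64_t P) {
  return static_cast<std::int64_t>(S) + A - static_cast<std::int64_t>(P);
}

// The field is a signed 32-bit integer, so the value must lie in
// [INT32_MIN, INT32_MAX], i.e. within roughly 2GB of the patched address.
constexpr bool fitsInInt32(std::int64_t V) {
  return V >= INT32_MIN && V <= INT32_MAX;
}

// Illustrative addresses only: a target 4K away is representable...
static_assert(fitsInInt32(pcRelativeValue(0x10001000, -4, 0x10000000)),
              "nearby target should fit");
// ...while a target 3GB away is not.
static_assert(!fitsInInt32(pcRelativeValue(0xD0000000, -4, 0x10000000)),
              "target 3GB away should not fit");
```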
>>
>> -Andy
>>
>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>> Sent: Tuesday, October 22, 2013 1:41 PM
>> To: Kaylor, Andrew
>> Cc: <llvmdev at cs.uiuc.edu>
>> Subject: Re: Size limitations in MCJIT / ELF Dynamic Linker/ ELF
>> codegen?
>>
>> Hi,
>>
>> Thanks for your ideas.
>>
>> Memory allocation already exceeds 2x64K in the "working" case so it's not
>> the condition of allocating more than 64K. To be sure I had modified
>> SectionMemoryManager::allocateSection to allocate four times the required
>> memory but it did not trigger more crashes. I debugged through the
>> allocation code including the Win32 code and it seems to work well. I have
>> also tried disabling the MemGroup.FreeMem cache which did not matter.
>>
>> An added assert for no Stubs to the end of RuntimeDyldImpl::loadObject
>>
>>       processRelocationRef(SectionID, *i, *obj, LocalSections, LocalSymbols,
>>                 Stubs);
>>       assert(!Stubs.size());
>>
>> indeed caught nothing = no stubs created.
>>
>> Disabling (de)registerEH did not help.
>>
>> Looking at relocations and sections printouts, the exception is:
>>
>> Unhandled exception at 0x0A3600D1 :
>> 0xC0000005: Access violation writing location 0x00BC7680.
>>
>> which is right after the start of .text:
>>
>> emitSection SectionID: 1 Name: .text obj addr: 0A3F1350 new addr:
>> 0A360000 DataSize: 253203 StubBufSize: 0 Allocate: 253203
>> ...
>> Resolving relocations Section #1        0A360000
>>
>> so at least it is running code but tries to write a wrong location.
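
The claim that the faulting instruction lies inside .text can be checked with the numbers from the printout: .text was placed at 0x0A360000 with DataSize 253203, and the exception was raised at 0x0A3600D1, an offset of 0xD1 (209) bytes. A small sketch, where inSection is a hypothetical helper:

```cpp
#include <cstdint>

// Hypothetical helper: true when Addr falls inside [Base, Base + Size).
constexpr bool inSection(std::uint64_t Addr, std::uint64_t Base,
                         std::uint64_t Size) {
  return Addr >= Base && Addr - Base < Size;
}

// Values taken from the debug printout quoted above.
constexpr std::uint64_t kTextBase = 0x0A360000;
constexpr std::uint64_t kTextSize = 253203;

// The faulting instruction is 209 bytes into .text...
static_assert(inSection(0x0A3600D1, kTextBase, kTextSize),
              "crash address should be inside .text");
// ...while the address being written is nowhere near that section.
static_assert(!inSection(0x00BC7680, kTextBase, kTextSize),
              "written address should be outside .text");
```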
>>
>> Another run exhibits a similar crash, still in .text but somewhat later.
>>
>> I have checked and the function address I'm running is located in .text
>> towards the end, as expected since it's the last function added to the
>> Module.
>>
>> I also speculated that it crashes when .text crosses 128K, but no, it
>> happens only when it's larger.
>>
>> I had attached gdb to the process hoping it would show more information
>> but it showed even less information than the Visual C++ debugger.
>>
>> Out of ideas...
>>
>> Yaron
>>
>> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>>
>> I would guess that it’s crashing somewhere in the generated code.  On
>> Windows we don’t have a way to get call stacks to the generated code
>> (though if you want to try it on Linux, that should work).  You can
>> probably look at the address where the crash is occurring and verify that
>> it is in the generated code.
>>
>> There are a couple of things I would look for.
>>
>> First, I’d take a look at the SectionMemoryManager allocation handling.
>> The fact that the problem is code size dependent strongly points in this
>> direction.  It may be that SectionMemoryManager does something wrong when
>> it hits a page boundary or something.
>>
>> Second, I’d look at the relocation processing.  If it is generating any
>> stubs, that would be a potential problem spot, but it shouldn’t be
>> generating any stubs.  So the obvious thing to look at is whether any of
>> the relocations are writing to the spot where the crash occurs.
>>
>> -Andy
>>
>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>> Sent: Tuesday, October 22, 2013 10:17 AM
>> To: Kaylor, Andrew
>> Cc: <llvmdev at cs.uiuc.edu>
>> Subject: Re: Size limitations in MCJIT / ELF Dynamic Linker/ ELF
>> codegen?
>>
>> OS is Windows 7 64 bit, compiler is 32 bit Visual C++ 2012 with 32 bit.
>>
>> The target is i686-pc-mingw32-elf so I can use the ELF dynamic
>> loader.
>>
>> Code model, relocation model and memory manager are whatever is the
>> default for this - did not modify.
>>
>> The Module comes from clang. The source is 1000 or more lines repeating
>> C++ code in one big function:
>>
>>   A+1;
>>   A*B.t();
>>
>> where A and B are matrices from Armadillo http://arma.sourceforge.net/.
>> This is a stress and performance test due to the large number of EH and
>> temporary objects created.
>>
>> I am using the Engine Builder and MCJIT unmodified (except the
>> multi-modules patches which are not relevant as there is only one module)
>> like this:
>>
>>   OwningPtr<llvm::ExecutionEngine> EE(llvm::EngineBuilder(M)
>>                                           .setErrorStr(&Error)
>>                                           .setUseMCJIT(true)
>>                                           .create());
>>
>> to run the function either
>>
>>   llvm::Function *F = M->getFunction(Name);
>>   void *FN = EE->getPointerToFunction(F);
>>
>> or
>>
>>   uint64_t FN = EE->getFunctionAddress(Name);
>>
>> followed by
>>
>>   ((void (*)())FN)();
>>
>> or
>>
>>   EE->runFunction(F, std::vector<llvm::GenericValue>());
>>
>> all work the same with a smaller module of about 1000 lines of the above
>> code and crash the same with more code. The call stack is unhelpful;
>> Visual C++ says "Frames below may be incorrect and/or missing", which
>> indicates a real problem with it. I have tried to provide less stack space
>> (default is 10M) for the compiled program without any change.
>>
>> Yaron
>>
>> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>>
>> I’m not aware of such a limitation.
>>
>> What architecture, code model and relocation model are you using?  Are
>> you using the SectionMemoryManager?
>>
>> -Andy
>>
>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>> Sent: Tuesday, October 22, 2013 8:12 AM
>> To: <llvmdev at cs.uiuc.edu>; Kaylor, Andrew
>> Subject: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>>
>> I'm running in MCJIT a module generated from one C++ function. Every
>> line of the source function uses C++ classes and may throw an exception.
>> As long as there are fewer than (about) 1000 lines, everything works. With
>> more lines the compiled code crashes when running it, with no sensible
>> stack trace.
>>
>> Is there any kind of hard-coded size limitation in MCJIT / ELF Dynamic
>> Linker / ELF codegen / number of EH states in a function?
>>
>> I did browse the code but could not find anything obvious.
>>
>> Yaron
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>