[LLVMdev] Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?

Yaron Keren yaron.keren at gmail.com
Wed Oct 23 02:37:09 PDT 2013


Oops, sorry, I switched the two locations, please read:

lib/Target/X86/X86FrameLowering.cpp  (_chkstk probing) - was required.
lib/Target/X86/X86ISelLowering.cpp - did not change.
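
For readers following the thread, here is a minimal sketch of why a COFF-only check can go wrong for an ELF target on Windows. Windows grows the stack through a guard page, so a function whose frame spans more than one page has to touch the pages in order, which is what _chkstk does. The types, function names and the 4096-byte threshold below are assumptions made for illustration; they are not LLVM's actual classes or code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the target properties discussed here; NOT LLVM types.
enum class OS { Windows, Linux };
enum class ObjectFormat { COFF, ELF };

struct Target {
  OS os;
  ObjectFormat format;
};

// Assumed page size for 32-bit x86 Windows.
constexpr std::uint64_t kPageSize = 4096;

// Decision keyed on the object format, as the thread describes the check.
bool needsProbeByFormat(const Target &t, std::uint64_t frameBytes) {
  return t.format == ObjectFormat::COFF && frameBytes >= kPageSize;
}

// Decision keyed on the operating system instead of the object format.
bool needsProbeByOS(const Target &t, std::uint64_t frameBytes) {
  return t.os == OS::Windows && frameBytes >= kPageSize;
}
```

With a target of {OS::Windows, ObjectFormat::ELF} and an 8192-byte frame, the format-based predicate answers false while the OS-based one answers true, which matches the crash seen once the big function's frame grew.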



2013/10/23 Yaron Keren <yaron.keren at gmail.com>
>
> YES, this is the problem!
>
> The program works OK; even a 5x larger version works well.
>
> Clearly the _chkstk calls must be emitted with ELF target on Windows as well - why not?
>
> I'd like to make a patch and fix this right.
>
> I experimented with both changes and practically only the lib/Target/X86/X86ISelLowering.cpp change fixes the problem. The other change, in lib/Target/X86/X86FrameLowering.cpp, was not required to fix the problem, so it is probably needed for other reasons.
>
> So, should I patch both tests?
> Is the correct patch removing the test isTargetCOFF() completely?
> Or enabling it for both COFF and ELF targets?
> I mean - is there any X86 target that does NOT require this stack checking?
>
> Yaron
>
>
>
> 2013/10/23 Andrew MacPherson <andrew.macp at gmail.com>
>>
>> Hi Yaron,
>>
>> If you're outputting ELF on Windows this sounds like an issue we ran into where __chkstk calls weren't being output in the assembly due to an explicit check for COFF output. Once stack allocations in a given function exceeded some amount we'd get exactly this kind of crash in the function initialization.
>>
>> If you take a look for isTargetCOFF() in lib/Target/X86/X86ISelLowering.cpp and lib/Target/X86/X86FrameLowering.cpp you should be able to remove that check to force __chkstk output to see if that helps.
>>
>> Cheers,
>> Andrew
>
>
>
> On Wed, Oct 23, 2013 at 1:22 AM, Yaron Keren <yaron.keren at gmail.com> wrote:
>>
>> Yes, that is correct: a good code address accessing a bad data address.
>>
>> However, there is no other relocation before .text or near it. I'll send you the full debug printout, maybe you'll note something.
>>
>> The problem could be the result of something other than the linker entirely, such as some library initialization code that by chance worked with smaller code but fails now.
>>
>> I need to debug and see what's going on. The trouble is that there is no debug information. Maybe I can do without the source code information and debug the assembly, but without any symbols it's really a challenge to understand anything. I did try to make MCJIT emit debug info, but for some reason the attached gdb did not understand it. Maybe this could be solved.
>>
>> I assumed there may be some limitations around 31-32 bits as there are various int32 members in the ELF structure, but that's far, far away. Problems start at a .text size of about 150K.
>>
>> Yaron
>>
>>
>>
>>
>> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>>>
>>> So it looks like 0x0A3600D1 is a good code address and there’s no problem executing the code there, but 0x00BC7680 is a bad data address. Is that correct?
>>>
>>>
>>>
>>> If so, this is almost certainly a relocation problem. You just need to find a relocation that writes an entry (probably a relative offset) at 0x0A3600D1+the size of the instruction at that address.
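
The search Andy suggests can be sketched as follows, assuming the relocation printout has been collected into a list of section-relative patch sites. The PatchSite record and the sitesNear helper are hypothetical and are not RuntimeDyld data structures. The 16-byte window in the usage note is also an assumption, picked because an x86 instruction is at most 15 bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one relocation's patched bytes (not a RuntimeDyld type).
struct PatchSite {
  std::uint64_t offset; // offset of the patched bytes within the section
  std::uint32_t size;   // number of bytes the relocation writes
};

// Return the indices of all patch sites that overlap [addr, addr + window),
// where addr is an absolute address inside a section loaded at loadAddr.
std::vector<std::size_t> sitesNear(const std::vector<PatchSite> &sites,
                                   std::uint64_t loadAddr,
                                   std::uint64_t addr,
                                   std::uint64_t window) {
  assert(addr >= loadAddr && "address precedes the section");
  std::vector<std::size_t> hits;
  const std::uint64_t lo = addr - loadAddr; // section-relative window start
  const std::uint64_t hi = lo + window;     // section-relative window end
  for (std::size_t i = 0; i < sites.size(); ++i) {
    const std::uint64_t begin = sites[i].offset;
    const std::uint64_t end = begin + sites[i].size;
    if (begin < hi && end > lo) // half-open interval overlap test
      hits.push_back(i);
  }
  return hits;
}
```

Calling sitesNear(sites, 0x0A360000, 0x0A3600D1, 16) with the .text load address from the printout returns every relocation that patches bytes in or just after the faulting instruction. An empty result would suggest the bad address did not come from a relocation.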
>>>
>>>
>>>
>>> BTW, what I said before about not being aware of any size limitations wasn’t quite correct. If you have enough code and data that we end up putting sections at addresses that are more than 2GB apart we’ll have problems, but you should see an assertion in that case. That can happen if we weren’t able to get the address we requested from allocateMappedMemory, but it doesn’t look like that’s what’s happening here.
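
The 2GB limit comes from relative relocations that store a signed 32-bit displacement. A minimal sketch of that range check, assuming 64-bit addresses, is shown below. The function name fitsInRel32 is made up for this example and is not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when the displacement from 'from' to 'to' can be stored in a signed
// 32-bit field, i.e. the two addresses are within about 2GB of each other.
bool fitsInRel32(std::uint64_t from, std::uint64_t to) {
  // Unsigned subtraction wraps; reinterpreting as signed gives the distance.
  const std::int64_t delta = static_cast<std::int64_t>(to - from);
  return delta >= std::numeric_limits<std::int32_t>::min() &&
         delta <= std::numeric_limits<std::int32_t>::max();
}
```

The two .text addresses from the printout (0x0A360000 and 0x0A3F1350) are well inside this range, which agrees with Andy's remark that the limit is not what is being hit here.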
>>>
>>>
>>>
>>> -Andy
>>>
>>>
>>>
>>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>>> Sent: Tuesday, October 22, 2013 1:41 PM
>>>
>>>
>>> To: Kaylor, Andrew
>>> Cc: <llvmdev at cs.uiuc.edu>
>>> Subject: Re: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> Thanks for your ideas.
>>>
>>>
>>>
>>> Memory allocation already exceeds 2x64K in the "working" case so it's not the condition of allocating more than 64K. To be sure I had modified SectionMemoryManager::allocateSection to allocate four times the required memory but it did not trigger more crashes. I debugged through the allocation code including the Win32 code and it seems to work well. I have also tried disabling the MemGroup.FreeMem cache, which did not matter.
>>>
>>>
>>>
>>> An assert for no Stubs, added to the end of RuntimeDyldImpl::loadObject:
>>>
>>>       processRelocationRef(SectionID, *i, *obj, LocalSections, LocalSymbols,
>>>                            Stubs);
>>>
>>>       assert(!Stubs.size());
>>>
>>> indeed caught nothing, so no stubs were created.
>>>
>>>
>>>
>>> Disabling (de)registerEH did not help.
>>>
>>>
>>>
>>> Looking at relocations and sections printouts, the exception is:
>>>
>>>
>>>
>>> Unhandled exception at 0x0A3600D1 :
>>>
>>> 0xC0000005: Access violation writing location 0x00BC7680.
>>>
>>>
>>>
>>> which is right after the start of .text:
>>>
>>>
>>>
>>> emitSection SectionID: 1 Name: .text obj addr: 0A3F1350 new addr: 0A360000 DataSize: 253203 StubBufSize: 0 Allocate: 253203
>>>
>>> ...
>>>
>>> Resolving relocations Section #1        0A360000
>>>
>>>
>>>
>>> so at least it is running code, but it tries to write to a wrong location.
>>>
>>> Another run exhibits a similar crash, still in .text but somewhat later.
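
The "right after the start of .text" observation is just address arithmetic on the emitSection printout. The sketch below shows it, assuming the section is described by its load address and size. LoadedSection and contains are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a section as placed in memory by the loader.
struct LoadedSection {
  std::uint64_t loadAddr; // "new addr" in the emitSection printout
  std::uint64_t size;     // "DataSize" in the emitSection printout
};

// True when addr falls inside the loaded section.
bool contains(const LoadedSection &s, std::uint64_t addr) {
  return addr >= s.loadAddr && addr - s.loadAddr < s.size;
}
```

With .text at 0x0A360000 and a size of 253203 bytes, the faulting instruction at 0x0A3600D1 lies 0xD1 (209) bytes into the section, while the written location 0x00BC7680 lies outside it.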
>>>
>>>
>>>
>>> I have checked and the function address I'm running is located in .text towards the end, as expected since it's the last function added to the Module.
>>>
>>>
>>>
>>> I also speculated that it might crash when .text crosses 128K, but no, it happens only when .text is larger than that.
>>>
>>>
>>>
>>> I attached gdb to the process hoping it would show more information, but it showed even less than the Visual C++ debugger.
>>>
>>>
>>>
>>> Out of ideas...
>>>
>>>
>>>
>>> Yaron
>>>
>>>
>>>
>>>
>>>
>>> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>>>
>>> I would guess that it’s crashing somewhere in the generated code. On Windows we don’t have a way to get call stacks to the generated code (though if you want to try it on Linux, that should work). You can probably look at the address where the crash is occurring and verify that it is in the generated code.
>>>
>>>
>>>
>>> There are a couple of things I would look for.
>>>
>>>
>>>
>>> First, I’d take a look at the SectionMemoryManager allocation handling. The fact that the problem is code size dependent strongly points in this direction. It may be that SectionMemoryManager does something wrong when it hits a page boundary or something.
>>>
>>>
>>>
>>> Second, I’d look at the relocation processing. If it is generating any stubs, that would be a potential problem spot, but it shouldn’t be generating any stubs. So the obvious thing to look at is whether any of the relocations are writing to the spot where the crash occurs.
>>>
>>>
>>>
>>> -Andy
>>>
>>>
>>>
>>>
>>>
>>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>>> Sent: Tuesday, October 22, 2013 10:17 AM
>>> To: Kaylor, Andrew
>>> Cc: <llvmdev at cs.uiuc.edu>
>>> Subject: Re: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>>>
>>>
>>>
>>> The OS is 64-bit Windows 7; the compiler is 32-bit Visual C++ 2012.
>>>
>>> The target is i686-pc-mingw32-elf, so I can use the ELF dynamic loader.
>>>
>>> The code model, relocation model and memory manager are whatever the defaults are for this target; I did not modify them.
>>>
>>>
>>>
>>> The Module comes from clang. The source is 1000 or more lines of repeated C++ code in one big function:
>>>
>>>
>>>
>>>   A+1;
>>>
>>>   A*B.t();
>>>
>>>
>>>
>>> where A and B are matrices from Armadillo http://arma.sourceforge.net/. This is a stress and performance test due to the large number of EH and temporary objects created.
>>>
>>>
>>>
>>> I am using the Engine Builder and MCJIT unmodified (except the multi-modules patches which are not relevant as there is only one module) like this:
>>>
>>>
>>>
>>>   OwningPtr<llvm::ExecutionEngine> EE(llvm::EngineBuilder(M)
>>>
>>>                                           .setErrorStr(&Error)
>>>
>>>                                           .setUseMCJIT(true)
>>>
>>>                                           .create());
>>>
>>>
>>>
>>> To run the function, either
>>>
>>>
>>>
>>>   llvm::Function *F = M->getFunction(Name);
>>>
>>>   void *FN = EE->getPointerToFunction(F);
>>>
>>> or
>>>
>>>   uint64_t FN = EE->getFunctionAddress(Name);
>>>
>>>
>>>
>>> followed by
>>>
>>>
>>>
>>>  ((void (*)())FN)();
>>>
>>> or
>>>
>>>   EE->runFunction(F, std::vector<llvm::GenericValue>());
>>>
>>>
>>>
>>> All of these work the same with a smaller module (about 1000 lines of the above code) and crash the same with more code. The call stack is unhelpful; Visual C++ says "Frames below may be incorrect and/or missing", which indicates a real problem with it. I have tried to provide less stack space (default is 10M) for the compiled program without any change.
>>>
>>>
>>>
>>> Yaron
>>>
>>>
>>>
>>>
>>>
>>> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>>>
>>> I’m not aware of such a limitation.
>>>
>>>
>>>
>>> What architecture, code model and relocation model are you using? Are you using the SectionMemoryManager?
>>>
>>>
>>>
>>> -Andy
>>>
>>>
>>>
>>> From: Yaron Keren [mailto:yaron.keren at gmail.com]
>>> Sent: Tuesday, October 22, 2013 8:12 AM
>>> To: <llvmdev at cs.uiuc.edu>; Kaylor, Andrew
>>> Subject: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>>>
>>>
>>>
>>> I'm running in MCJIT a module generated from one C++ function. Every line of the source function uses C++ classes and may throw an exception. As long as there are fewer than (about) 1000 lines, everything works. With more lines the compiled code crashes when running it, with no sensible stack trace.
>>>
>>>
>>>
>>> Is there any kind of hard-coded size limitation in MCJIT / ELF Dynamic Linker / ELF codegen / number of EH states in a function?
>>>
>>>
>>>
>>> I did browse the code but could not find anything obvious.
>>>
>>>
>>>
>>> Yaron
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>