[LLVMdev] [MCJIT] Multiple GOT handling in RuntimeDyldELF

Mon Jan 19 11:51:51 PST 2015

Hi Keno,

I _think_ that the GOT support we currently have can be made to work if the memory manager provides the necessary help (more on that below), but I will readily admit that it is implemented in a fairly non-standard way that is likely to seem completely wrong on first inspection (and probably still seems at least slightly wrong on second inspection).  It may also have inherent limitations that can’t be overcome without a redesign, but if so I don’t know what those limitations might be.

It may be helpful to refer to the comments in my original GOT implementation patch (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184265.html) when trying to decipher the intent of the existing code. Unfortunately, I seem to have said quite a bit more there than I did in the actual code comments.

I’m pretty sure that the “multiple GOT” patch was intended to support the case where additional modules are loaded after finalizeLoad() has been called.  It looks like we were at some point trying to use a single GOT for all modules, but once it had been “finalized” another GOT had to be created for subsequent loads.  It’s been a while since I looked at this code, but I believe that we defer calculating the offsets for the GOT until a “finalize” is performed.  This is because the memory for loaded sections may be remapped before that time to handle remote (or out-of-process) execution.  It appears that we are also deferring allocation of the GOT section memory until this time.

With regard to the 2 GB+ offset problem, we’re dependent on the memory manager in that regard.  Even with a single object being loaded there is no guarantee that the memory allocated for the GOT section will be within 2 GB of the memory allocated for other sections unless the memory manager does something to make it so.  An interface was added sometime in the past year (I think) that optionally pre-calculates the amount of memory that will be needed for an object load so that the memory manager can allocate all of this memory as a single block.  I’m not sure this interface properly accounts for the possibility of GOT sections and I don’t know how it works with multiple modules.
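As a rough illustration of the range constraint described above, here is a minimal sketch of the check a loader would need before applying a 32-bit PC-relative GOT relocation. The function names are hypothetical and do not appear in RuntimeDyldELF, and the formula assumes an x86-64 style relocation whose value is the slot address plus the addend minus the fixup address.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: signed distance from the fixup location to the GOT
// slot, with the addend folded in. The arithmetic is done in uint64_t so
// that wraparound is well defined; the final conversion to int64_t relies
// on two's complement behaviour (guaranteed from C++20, universal in
// practice before that).
int64_t signedDelta(uint64_t SlotAddr, uint64_t FixupAddr, int64_t Addend) {
  uint64_t Raw = SlotAddr + static_cast<uint64_t>(Addend) - FixupAddr;
  return static_cast<int64_t>(Raw);
}

// Returns true if the displacement can be encoded in a signed 32-bit field,
// i.e. the slot is within roughly +/- 2 GB of the code that references it.
bool fitsInPCRel32(uint64_t SlotAddr, uint64_t FixupAddr, int64_t Addend) {
  int64_t Delta = signedDelta(SlotAddr, FixupAddr, Addend);
  return Delta >= std::numeric_limits<int32_t>::min() &&
         Delta <= std::numeric_limits<int32_t>::max();
}

// Produces the value that would be written into the instruction. The
// assertion fires when the memory manager has placed the GOT too far away.
int32_t encodePCRel32(uint64_t SlotAddr, uint64_t FixupAddr, int64_t Addend) {
  assert(fitsInPCRel32(SlotAddr, FixupAddr, Addend) &&
         "GOT slot is out of 32-bit PC-relative range");
  return static_cast<int32_t>(signedDelta(SlotAddr, FixupAddr, Addend));
}
```

Nothing in this check can make an out-of-range placement work; it only detects it. Keeping the sections close together is still up to the memory manager.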

The default memory manager attempts to use system address hints to allocate sections in the same region of the address space, but not all OSs support the flags we’d like to use and the address requests are never guaranteed to be respected.  FWIW, Address Sanitizer is very good at exposing issues of this sort.

I should also mention that there is some variation in how GOT-related issues are handled from architecture to architecture within RuntimeDyldELF.  When I implemented the GOT support, I intended for it to be capable of supporting any architecture, but there was some support for GOT-related relocations for non-x86 platforms that pre-dated my GOT implementation and I suspect those will continue to be used as long as they are working correctly.  For instance, several architectures extended the allocated size of code sections and use the extra space at the end of the section to create stubs for PC-relative function calls.
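The stub approach mentioned above amounts to over-allocating each code section. A minimal sketch of the size calculation follows, assuming a fixed per-stub size and alignment. The function name and all of its parameters are hypothetical and are not taken from RuntimeDyldELF.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: size to request from the memory manager for a code
// section that needs room for call stubs after its last instruction.
// Overflow of the intermediate products is ignored in this sketch.
uint64_t paddedSectionSize(uint64_t SectionSize, uint64_t NumStubs,
                           uint64_t StubSize, uint64_t StubAlign) {
  // The rounding trick below only works for power-of-two alignments.
  assert(StubAlign != 0 && (StubAlign & (StubAlign - 1)) == 0 &&
         "alignment must be a power of two");
  // Round the end of the section up so the first stub is aligned.
  uint64_t StubStart = (SectionSize + StubAlign - 1) & ~(StubAlign - 1);
  // Reserve one stub per call relocation that might need one.
  return StubStart + NumStubs * StubSize;
}
```

Because the stubs live in the same allocation as the code that calls them, they are always within branch range of that code, which is why this scheme does not depend on where the memory manager places other sections.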

Let me know if there’s anything more I can do to help you get things working.

-Andy

From: Keno Fischer [mailto:kfischer at college.harvard.edu]
Sent: Sunday, January 18, 2015 5:38 AM
To: LLVM Developers Mailing List; Lang Hames; Kaylor, Andrew; Thirumurthi, Ashok
Subject: [MCJIT] Multiple GOT handling in RuntimeDyldELF

Hello everyone,

As part of my quest to add TLS relocation support to MCJIT, I've been taking a closer look at the GOT implementation in RuntimeDyldELF and I believe that it is not valid as currently implemented. In particular, I am wondering about the multiple GOT handling support introduced in r192020. If I understand correctly, this can make code reuse a GOT entry that belongs to a different object file. This doesn't seem correct to me, as there is no guarantee that the loaded object files are allocated within 2 GB of each other in memory. What was the intended use case of this feature? Additionally, it seems that currently every access through the GOT gets its own entry, when identical relocations could be combined into one entry. The GOTEntries array is also never cleared, causing memory and performance problems when loading multiple object files (this is a bug and easily fixed, but makes me think this feature isn't particularly well tested). I'm planning to redesign the GOT mechanism, but I would like to understand the use case intended in r192020 first, to make sure I don't design myself into a corner.
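To make the idea of combining identical relocations concrete, here is a minimal sketch of a per-object GOT table that hands out one slot per distinct (symbol, addend) pair and can be cleared between object loads. The class name, its methods, and the 8-byte slot size are assumptions for illustration only; this is not the existing GOTEntries code and not a proposed patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed slot size for a 64-bit target.
constexpr uint64_t kGOTSlotSize = 8;

// Hypothetical per-object GOT builder.
class GOTBuilder {
public:
  using Key = std::pair<std::string, int64_t>;

  // Returns the byte offset of the slot for (Symbol, Addend). A new slot is
  // allocated only the first time a given pair is requested.
  uint64_t getOrCreateSlot(const std::string &Symbol, int64_t Addend) {
    assert(!Symbol.empty() && "GOT entries need a named target");
    Key K(Symbol, Addend);
    auto It = SlotIndex.find(K);
    if (It != SlotIndex.end())
      return It->second * kGOTSlotSize;
    uint64_t Index = Slots.size();
    Slots.push_back(K);
    SlotIndex.emplace(K, Index);
    return Index * kGOTSlotSize;
  }

  // Total size of the GOT section that would need to be allocated.
  uint64_t sizeInBytes() const { return Slots.size() * kGOTSlotSize; }

  // Drops all entries so the next object starts with an empty table.
  void reset() {
    Slots.clear();
    SlotIndex.clear();
  }

private:
  std::vector<Key> Slots;            // slot index -> target
  std::map<Key, uint64_t> SlotIndex; // target -> slot index
};
```

Keeping one such table per object, rather than sharing entries across objects, sidesteps the cross-object distance question raised above, at the cost of duplicating slots for symbols referenced from several objects.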

Thanks,
Keno