[llvm-commits] RuntimeDyLd new features

Kaylor, Andrew andrew.kaylor at intel.com
Thu Feb 23 14:08:20 PST 2012


These changes look good to me.

The test-global.ll test is failing because your code doesn't handle sections that have no bits in the object image and need zero-initialization (at least, that's why it fails on ELF).  I believe that the stubs.ll and test-common-symbols.ll tests are both failing because common symbols aren't being properly handled.

As I mentioned before, I think the common symbol issue requires a bit of restructuring to allow format specific handling of part of the common symbol processing.  I have a fix ready which addresses this for ELF and would provide most of the fix and the necessary interface to fix it on MachO.  Basically, my fix makes a list of common symbols, allocates memory for the common symbols and calls a virtual function to update the symbol address in the SymbolRef.  MachO would just need an implementation of the function to test a symbol to see if it is a common symbol, and an implementation of the function to update the address.
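
A minimal sketch of the split described above, assuming a generic pass that collects common symbols and assigns them storage, plus two format-specific virtual hooks. All type and function names here (Symbol, CommonSymbolHooks, layoutCommonSymbols) are hypothetical and are not the actual RuntimeDyld or SymbolRef interfaces; alignment is ignored for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol; this is not LLVM's SymbolRef.
struct Symbol {
  std::string Name;
  bool Common;      // true if the symbol lives in the common section
  uint64_t Size;    // bytes to reserve for a common symbol
  uint64_t Address; // filled in once storage has been assigned
};

// Format-specific hooks (ELF, MachO, ...). Names are illustrative only.
class CommonSymbolHooks {
public:
  virtual ~CommonSymbolHooks() = default;
  virtual bool isCommonSymbol(const Symbol &S) const = 0;
  virtual void updateSymbolAddress(Symbol &S, uint64_t Addr) const = 0;
};

// Trivial implementation used to exercise the generic code.
class SimpleHooks : public CommonSymbolHooks {
public:
  bool isCommonSymbol(const Symbol &S) const override { return S.Common; }
  void updateSymbolAddress(Symbol &S, uint64_t Addr) const override {
    S.Address = Addr;
  }
};

// Generic part: list the common symbols, reserve one contiguous block
// starting at BaseAddr, and hand each symbol its address via the hook.
// Returns the total number of bytes reserved.
uint64_t layoutCommonSymbols(std::vector<Symbol> &Symbols,
                             const CommonSymbolHooks &Hooks,
                             uint64_t BaseAddr) {
  std::vector<Symbol *> Commons;
  uint64_t Total = 0;
  for (Symbol &S : Symbols) {
    if (Hooks.isCommonSymbol(S)) {
      Commons.push_back(&S);
      Total += S.Size;
    }
  }
  uint64_t Offset = 0;
  for (Symbol *S : Commons) {
    Hooks.updateSymbolAddress(*S, BaseAddr + Offset);
    Offset += S->Size;
  }
  assert(Offset == Total);
  return Total;
}
```

With this shape, a new object format would only need to supply its own implementation of the two hooks.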

I also have a general fix ready for the nobits/zero-init problem.

Therefore, I think it would be OK to have this patch committed as is, with the understanding that these problems would be fixed shortly afterward.  Again, that is assuming that the changes are otherwise acceptable to Jim.

I would like to discuss the way that the MCJIT-specific tests are handled, but we can talk about that later.

-Andy

From: Danil Malyshev [mailto:dmalyshev at accesssoftek.com]
Sent: Wednesday, February 22, 2012 4:08 PM
To: Jim Grosbach; Kaylor, Andrew; llvm-commits at cs.uiuc.edu
Subject: RE: RuntimeDyLd new features

Hi Jim,

The MCJIT with the changed RuntimeDyLd passes most of the ExecutionEngine tests on Mac OS, except for 3 tests: stubs.ll, test-common-symbols.ll and test-global.ll.
I will work to ensure that these three tests also pass.
Please look at 01-RuntimeDyLd-02.patch from the previous letter. Can I commit it?



Regards,
Danil


________________________________
From: Kaylor, Andrew [mailto:andrew.kaylor at intel.com]
Sent: Saturday, February 18, 2012 5:41 AM
To: Danil Malyshev; llvm-commits at cs.uiuc.edu
Cc: Jim Grosbach
Subject: RE: RuntimeDyLd new features

Hi Danil,

I'd like to make clear that if Jim is OK with the impact of your patch on MachO JIT loading then I'd be happy for you to proceed with your patch, incorporating my comments below, and we will withdraw our previous uncommitted patch and I will merge the GDB JIT debugging integration code we have with your changes after they have been committed.

Thanks,
Andy


From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Kaylor, Andrew
Sent: Tuesday, February 14, 2012 4:53 PM
To: Danil Malyshev; llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] RuntimeDyLd new features

Hi Danil,

I've been working with your patch also.  Mostly, it looks pretty good.  I've been able to layer most of our added functionality on top of it, and I have GDB integration working and our integration tests passing.  There are a few issues I'd like to bring up, however.

The first thing I saw with your patch is that I had problems loading images that were generated with debug information included.  In particular, applying relocations to debug sections using the actual addresses of the debug sections causes problems.  This is really a small issue because, as you mentioned, debug sections don't need to be loaded.  I worked around this problem by adding a new member function to SectionRef to query whether or not the section is required for execution.  This is fairly trivial for ELF.

A related problem is that your code does not properly handle sections which are only represented by a header in the ELF image (that is, SHT_NOBITS sections) or sections which need to be zero-initialized.  I worked around these problems using the existing isBSS() member in SectionRef, but it would probably be better to split these two concepts.
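
A minimal sketch of the zero-initialization handling described above, assuming a simplified section descriptor. SectionInfo and loadSection are hypothetical names for illustration and are not part of SectionRef or RuntimeDyld.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified view of a section.
struct SectionInfo {
  const uint8_t *FileData; // nullptr when the image holds only a header
  size_t Size;             // size of the section in memory
  bool NoBits;             // true for header-only sections such as SHT_NOBITS
};

// Allocate memory for a section. Header-only sections get zero-filled
// storage; all other sections are copied from the object image.
std::vector<uint8_t> loadSection(const SectionInfo &S) {
  assert(S.NoBits || S.FileData != nullptr || S.Size == 0);
  std::vector<uint8_t> Mem(S.Size, 0); // zero-initialized by construction
  if (!S.NoBits && S.Size != 0)
    std::memcpy(Mem.data(), S.FileData, S.Size);
  return Mem;
}
```

The key point is that a header-only section must never be read from the image, since there are no bytes behind it.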

Similarly, your code does not handle symbols which are marked as relative to the SHN_COMMON section.  Fixing this is a bit more complicated.  I think we can probably build a table of these symbols in the RuntimeDyldImpl class by adding a new member to SectionRef, but we'll need deeper access to the format-specific classes to update the symbol addresses.  I'm working on an implementation of this for ELF.

The code which loads the sections is currently assuming pointer-sized alignment for all sections.  I ran into a case using floating point numbers where 16-byte alignment was required.  This is a fairly simple fix as the alignment value is accessible through SectionRef.
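
A minimal sketch of honoring a per-section alignment when placing sections, assuming power-of-two alignments. BumpAllocator, alignUp and placeSection are hypothetical helpers for illustration, not existing RuntimeDyld functions.

```cpp
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be a power of two).
uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Hypothetical bump allocator that tracks the next free offset.
struct BumpAllocator {
  uint64_t Cursor = 0;
};

// Place a section at the first offset that satisfies its own alignment,
// rather than assuming pointer-sized alignment for everything.
uint64_t placeSection(BumpAllocator &A, uint64_t Size, uint64_t Align) {
  uint64_t Start = alignUp(A.Cursor, Align);
  A.Cursor = Start + Size;
  return Start;
}
```

For example, a 4-byte section followed by a section that requires 16-byte alignment ends up at offsets 0 and 16, not 0 and 8.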

I think it would be worthwhile adding a function to SectionRef to query whether or not a section contains code rather than the current algorithmic method you are using.

In a few places you are using uintptr_t or uint8_t* in places that will restrict cross-architecture JITing.  The TargetAddr in the SectionEntry structure is an example that comes to mind.

The distinction between code and data for memory allocation will be important going forward.  I think it would be better to fix the existing bug than to disable the specific allocations.

In RuntimeDyld::loadObject, in the "parse and process relocations loop" "SectionID == 0" is used to identify sections that need to be looked up, but zero is a valid SectionID.  For that section, it goes into the lookup for every relocation.
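
A minimal sketch of one way to avoid the problem, assuming a dedicated sentinel value that can never be a real SectionID. InvalidSectionID, RelocCache and sectionFor are hypothetical names and do not reflect the patch itself.

```cpp
#include <cassert>
#include <limits>

// Sentinel that cannot collide with a real ID, unlike 0.
constexpr unsigned InvalidSectionID = std::numeric_limits<unsigned>::max();

// Hypothetical per-section cache used while walking relocations.
struct RelocCache {
  unsigned SectionID = InvalidSectionID;
  unsigned Lookups = 0; // counts how often the slow lookup path ran
};

// Return the cached SectionID, running the (simulated) lookup only the
// first time, even when the resolved ID happens to be 0.
unsigned sectionFor(RelocCache &C, unsigned ActualID) {
  assert(ActualID != InvalidSectionID);
  if (C.SectionID == InvalidSectionID) {
    ++C.Lookups;
    C.SectionID = ActualID;
  }
  return C.SectionID;
}
```

With 0 as the "unknown" marker, the section whose real ID is 0 would take the lookup path on every relocation; with a separate sentinel it is looked up once.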

I'm continuing to work on integrating our code, including some that hasn't yet been submitted to the list for review, with your changes.  I need to go over what I've done with the rest of my co-workers, but I wanted to give you a progress update now.

-Andy

From: Danil Malyshev [mailto:dmalyshev at accesssoftek.com]
Sent: Tuesday, February 14, 2012 10:30 AM
To: Kaylor, Andrew; llvm-commits at cs.uiuc.edu
Cc: Jim Grosbach
Subject: RE: RuntimeDyLd new features

Hi Andrew,

Thank you for the detailed explanation.
I have carefully studied your patch and DyldELFObject.
GDB support is very important and it's actually a big step for MCJIT. I like your patch; its loadObject() looks faster than mine. But I see one big conceptual problem.

RuntimeDyLd was developed as a linker that can prepare data for running on another platform. RuntimeDyldMachO largely implements this.
In RuntimeDyldELF you went the other way, and it is now very far from remote execution. Even if the object file were emitted with RTDyldMemoryManager, it would not be a good idea: (1) remote execution doesn't need many parts of the object file, such as debugging information, headers, the section table, etc.; (2) some sections are code and must have execute permission, while other sections are data without execute permission, so the best way is to use different methods to emit different types of sections.
If your patch is committed, we will have a RuntimeDyldMachO and a RuntimeDyldELF that follow different principles and produce different results. And most likely, ELF will never get a full implementation of remote execution.

I think other approaches would be better. For instance, two possible solutions:
1. Always emit the sections required for execution with RTDyldMemoryManager and resolve relocations in these sections. If the isDebugging flag is set, then in addition DyldELFObject::rebaseObject does its job, except that for the sections stored by the RTDyldMemoryManager it writes the correct address.
2. Check the debug flag first, before loading the object file. If it is set, just use your implementation of loadObject; otherwise, use the normal way with RTDyldMemoryManager.

Both alternatives have their shortcomings; perhaps you will find another solution. Also, I would be glad to hear what Jim thinks about it.
I am now trying to merge your patch with mine.


Regards,
Danil

________________________________
From: Kaylor, Andrew [mailto:andrew.kaylor at intel.com]
Sent: Thursday, February 09, 2012 4:56 AM
To: Danil Malyshev; llvm-commits at cs.uiuc.edu
Subject: RE: RuntimeDyLd new features

Hi Danil,

Thanks for your efforts in this area.  As Eli Bendersky mentioned, we have a patch out for review in this area also.  I'm hopeful that we can find a convergence between our code, your code and the code that Jim Grosbach has put in place.

Unfortunately, we have been submitting our code in small chunks for ease of review and to keep things stable, and it may not be obvious from what we've put out for review what our intentions were or our intended solutions to the problems that were left open.  I'd like to take this opportunity to discuss the direction we were heading and see how it might align with what you and Jim have done.

Let me first explain what we have done, and then I'll offer specific comments on your code and possible next steps.

We have had two primary goals: (1) to get MCJIT generated code to work correctly on Intel architecture and help pave the way for other implementations and (2) enable source-level debugging of JITed code with GDB.  This second goal seems to be dropping from view, but it places a few constraints on the eventual implementation.  We've had GDB integration working, BTW.

In order to get GDB to handle JITed code, we need to register an ELF object image through an interface GDB defines.  As you might expect, GDB has some peculiar expectations for what this ELF object image should look like.  In particular, we need to set a flag in the ELF header and update the sh_addr members in the section headers to reflect the address where the section contents reside in memory.
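
For reference, a sketch of the registration side of that interface. The declarations follow the "JIT Compilation Interface" section of the GDB manual; the registerObject helper is a hypothetical wrapper added for illustration, and the noinline attribute assumes GCC or Clang. The sketch does not show the ELF header flag or the sh_addr updates, which must already have been applied to the image being registered.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr; // start of the in-memory object image
  uint64_t symfile_size;    // size of that image in bytes
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; // a jit_actions_t value
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

// GDB puts a breakpoint in this function; it must not be inlined away.
__attribute__((noinline)) void __jit_debug_register_code() {}

// GDB looks this symbol up by name. The version field must be 1.
struct jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: link Entry at the head of the list and notify GDB.
void registerObject(jit_code_entry &Entry, const char *Image, uint64_t Size) {
  assert(Image != nullptr);
  Entry.symfile_addr = Image;
  Entry.symfile_size = Size;
  Entry.prev_entry = nullptr;
  Entry.next_entry = __jit_debug_descriptor.first_entry;
  if (Entry.next_entry)
    Entry.next_entry->prev_entry = &Entry;
  __jit_debug_descriptor.first_entry = &Entry;
  __jit_debug_descriptor.relevant_entry = &Entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```

The image passed in has to stay alive for as long as the entry is registered, since GDB reads it from the debuggee's memory.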

Our most recent patch (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/135997.html, not yet committed) begins by copying the entire ELF image emitted by the MC code generator into an executable buffer.  This was intended as a temporary step toward our eventual solution.  It enabled us to perform relocations in-place on the object and execute functions in place (thus eliminating an extra copy that was previously being done).  We were in the process of implementing a smarter section-based approach, but Jim Grosbach was implementing a similar approach in parallel and our submission ended up appearing out of step in this regard.

So that's our background.  Now, returning to your patch....

I like the idea of combining as much common code as possible into the RuntimeDyldImpl class.  I'm interested to hear from users of the MachO loader if your implementation has lost any of the specialization that they need.  I think it's a promising approach.  There are some ELF-specific details that we will need to have incorporated to re-enable GDB integration, but I expect that we'll be able to find a way to work that in with a few well-placed overloaded function calls.

I have some reservations about the use of the basic ObjectFile interface, which has some serious limitations.  We've been working toward exposing the ELFObjectFile template for use in the runtime loading process (as well as other unrelated uses).  It may be that this is something that can be generalized enough to fit with your approach.  My main concern in this regard is that we need to be able to update specific entries in the ELF image, as described above.

A related issue is that section loading can be refined with some ELF-specific details.  Some sections need to have memory allocated for their contents.  Other sections can be left in place in the originally generated image.  There is a good bit of unnecessary copying going on in the existing implementation, and I'm not clear to what extent your patch addresses that.  Before the object is loaded, it is copied into a new buffer and then the contents of each section are copied again as we go.  What I'd like is for the runtime loaders to use the buffer into which the object is originally generated and only make copies where it is strictly necessary.  This isn't necessarily something you need to do for your work to be acceptable, but I mention it as a likely next step.

Over the next few days I intend to apply your patch locally and try to merge our work into it.  I'll provide additional feedback as I get a better feel for what you've done.

-Andy

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Danil Malyshev
Sent: Tuesday, February 07, 2012 12:24 PM
To: llvm-commits at cs.uiuc.edu
Subject: [llvm-commits] RuntimeDyLd new features

Hello everyone,

Please review the RuntimeDyLd-01.patch.
This patch makes the following changes:

1. The main work is now done in RuntimeDyLdImpl, which uses the ObjectFile class. RuntimeDyLdMachO and RuntimeDyLdELF now only parse relocations and resolve them. This makes it easier to improve RuntimeDyLd. In addition, support for COFF can be easily added.

2. Added ARM relocations to RuntimeDyLdELF.

3. Added support for stub functions for ARM, allowing long branches.

4. Added support for external functions that are not loaded from the object files but can be loaded from external libraries. Now MCJIT can correctly execute code that calls printf, putc, etc.

5. Sections are emitted instead of functions, thanks to Jim Grosbach. MemoryManager.startFunctionBody() and MemoryManager.endFunctionBody() have been removed.

6. MCJITMemoryManager.allocateDataSection() and MCJITMemoryManager.allocateCodeSection() use JMM->allocateSpace() instead of JMM->allocateCodeSection() and JMM->allocateDataSection(), because I got an error, "Cannot allocate an allocated block!", with an object file that contains more than one code or data section.

7. Fixed the ELF::R_X86_64_PC32 relocation for the case when RealOffset is a negative value.

8. Added a new testing folder, ExecutionEngine/MCJIT, because MCJIT tests can only run on x86 and ARM, and they can be filtered with dg.exp.
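
To illustrate item 7, here is a minimal sketch of applying a 32-bit PC-relative relocation where the computed offset may be negative. It assumes the usual S + A - P formula from the x86-64 psABI; applyPC32 is a hypothetical helper and is not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Compute S + A - P and store it as a signed 32-bit little-endian value.
// Returns false, leaving Loc untouched, if the offset does not fit.
bool applyPC32(uint8_t *Loc, uint64_t SymAddr, int64_t Addend, uint64_t Place) {
  assert(Loc != nullptr);
  // Unsigned arithmetic wraps modulo 2^64; reinterpret the result as signed.
  int64_t RealOffset =
      static_cast<int64_t>(SymAddr + static_cast<uint64_t>(Addend) - Place);
  // Range-check as a signed value so negative offsets are accepted.
  if (RealOffset < INT32_MIN || RealOffset > INT32_MAX)
    return false;
  uint32_t Bits = static_cast<uint32_t>(RealOffset);
  for (int I = 0; I < 4; ++I)
    Loc[I] = static_cast<uint8_t>(Bits >> (8 * I));
  return true;
}
```

A backward reference, where the symbol sits below the relocated location, yields a negative offset; checking the range on an unsigned value would wrongly reject it.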

Tested in Ubuntu x86_64, Ubuntu armv7 and MacOS 64.

Thank you,
Danil