[PATCH] RuntimeDyldMachO should handle RIP-relative relocations	correctly, and the corresponding tests should have coverage for this
    Filip Pizlo 
    fpizlo at apple.com
       
    Tue May 14 12:18:41 PDT 2013
    
    
  
On May 14, 2013, at 11:20 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> Hi Filip,
> 
> Index: lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp
> ===================================================================
> --- lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp	(revision 181678)
> +++ lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp	(working copy)
> @@ -327,6 +327,8 @@
>     Value.SectionID = findOrEmitSection(Obj, Sec, true, ObjSectionToID);
>     uint64_t Addr;
>     Sec.getAddress(Addr);
> +    if (IsPCRel)
> +      Addr -= Offset + 4;
>     Value.Addend = Addend - Addr;
>   }
> 
> This works for current targets. But is 4 the correct value for "crazy future targets"?
Good question!  Short answer: I don't know, I'm just trying to match what other code in RuntimeDyldMachO was doing.
Long answer: The '4' is there because the other PC-relative code assumes that the addend will have an extra 4 added to it.  See the FIXME in RuntimeDyldMachO::resolveX86_64Relocation.  It might be good if I changed my patch to add that FIXME, or a reference to it, in RuntimeDyldMachO::processRelocationRef.
For local references, what happens with my patch, is that we add in the 4 when creating a RelocationEntry in RuntimeDyldMachO::processRelocationRef, and then RuntimeDyldMachO::resolveX86_64Relocation (and friends) will subtract it out.  But for non-local references, the 4 will be there already and as per the FIXME, it appears that we will subtract it out.
I would like to take some time to investigate this more and see what exactly the 4 is there for and whether we need to make that more general, but I'd rather start by getting this fix in, since:
1) It appears to make the code strictly more correct.  It preserves behavior for external references (PC-relative or not).  It preserves behavior for local not-PC-relative references.  And it fixes local PC-relative references: this is demonstrated by the attached test case, and by experimenting I did in WebKit.  Without this patch, local PC-relative references simply don't work - you either get a crash because the offset-from-PC that we scribble into the code ends up pointing into unmapped memory, or you get incorrect execution because because we end up referring to something that is definitely not the constant pool, or global variable, that we wanted.
2) It appears to be consistent with what the code was already trying to do.  The code that uses RelocationEntry clearly assumes that the 4 was added in.  That may be wrong, but it's not new to this patch; this patch just tries to make the creation of RelocationEntry use the same convention (i.e. off-by-4) as the code that consumes RelocationEntry.
I think that what's missing here is just test coverage in MCJITTest for all of the types of relocations that we might deal with.  I'd like to add that coverage, and as I do so, hopefully we'll get more clarity on whether the off-by-4 is needed.
Sound good?
-Filip
> 
> Evan
> 
> On May 12, 2013, at 6:04 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> 
>> The small code model on X86_64 involves using RIP-relative relocations, particularly for the constant pool.
>> 
>> The RuntimeDyldMachO doesn't understand these correctly.  In particular, it doesn't understand what offset immediate will be used in a RIP-relative reference to a local symbol as in:
>> 
>> 	.section	__TEXT,__literal8,8byte_literals
>> 	.align	3
>> LABEL:
>>    .quad <blah>
>> 	.section	__TEXT,__text,regular,pure_instructions
>> # …. bunch of things things
>> addsd LABEL(%rip), %xmm0
>> 
>> In MachO, the "LABEL(%rip)" will be transformed into an offset from the addsd instruction to where the compiler thought the __literal8 section will be placed (plus the offset of the immediate within the __literal8 section, but that offset is zero, in this example).  For example, the compiler may place the __literal8 section right after the __text section, in which case the offset immediate in the addsd instruction will correspond to the size of the __text section, minus the offset of the addsd instruction, and then plus whatever padding/alignment for the __literal8 section.
>> 
>> So, for example, if we then allocated the __literal8 section exactly after the __text section in the MCJIT, then we wouldn't want to change the offset in the addsd instruction.  More broadly, we would want to:
>> 
>> 1) subtract out the offset to the __literal8 section according to the MachO file (i.e. the offset from the referencing instruction - the addsd in this example - to the __literal8 section)
>> 2) add in the true offset to the __literal8 section after both __text and __literal8 have addresses in target memory.
>> 
>> Note that after (1) we will have an offset from the start of  __literal8 to the constant that the code is actually referencing.  Then after (2) we have the correct offset that the code should actually be using.
>> 
>> This allows for the __literal8 section to be placed anywhere we want, and also allows the compiler to convey specific offsets inside of constant pools.
>> 
>> This is different from other relocations; for example if we had used the large code model instead, then we would have a load immediate (for example "movabsq $LABEL, %rax"), where the $LABEL that is encoded in the MachO file will be an offset of the __literal8 section relative to the *start* of the module, rather than relative to the instruction that had the reference.
>> 
>> The RuntimeDyldMachO now only gets this part-way right.  It knows how to handle non-PC-relative relocations just fine.  For PC-relative ones, it knows how to do (2) above, but it doesn't understand how to do (1): it assumes that the code has an offset from the start of the module, rather than an offset from the referencing instruction.
>> 
>> The included patch fixes this bug.
>> 
>> I also include a separate patch that does some stuff to the tests, including adding coverage for small code model on x86_64.  It makes sense for a lot of JIT clients to use the small code model, and I believe it's good for LLVM to have test coverage for this.
>> 
>> <fixrip.patch><fixtests.patch>
>> 
>> -Filip
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130514/ae5a0102/attachment.html>
    
    
More information about the llvm-commits
mailing list