[llvm-dev] MASM & RIP-relative addressing

Eric Astor via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 21 14:43:30 PST 2020


To correct a minor copy/paste error in my previous message: ml64.exe actually outputs:

       0:       8b 05 00 00 00 00       mov     eax, dword ptr [rip]
                0000000000000002:  IMAGE_REL_AMD64_REL32        foo

In other words, the relocation info is the same... but the instruction uses
RIP-relative addressing, not absolute.
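
As a rough illustration of that difference, here is a small Python sketch that decodes the ModRM (and, where present, SIB) byte of the two encodings in this thread: the ml64.exe bytes above and the LLVM bytes quoted below. It is not taken from LLVM or MASM. The function name classify_disp32_operand is hypothetical, and the sketch assumes 64-bit mode, a one-byte opcode with no prefixes, and a ModRM mod field of 00.

```python
def classify_disp32_operand(code: bytes, opcode_len: int = 1):
    """Classify a disp32-only memory operand in 64-bit mode.

    Returns (kind, disp_offset), where disp_offset is the byte offset of the
    32-bit displacement field within the instruction.
    Assumes no prefixes and ModRM.mod == 00 (hypothetical helper, sketch only).
    """
    modrm = code[opcode_len]
    mod = modrm >> 6
    rm = modrm & 0b111
    if mod != 0b00:
        raise ValueError("sketch only handles ModRM.mod == 00")

    # mod=00, rm=101: in 64-bit mode this means [rip + disp32].
    if rm == 0b101:
        return "rip-relative", opcode_len + 1

    # mod=00, rm=100: a SIB byte follows the ModRM byte.
    if rm == 0b100:
        sib = code[opcode_len + 1]
        base = sib & 0b111
        index = (sib >> 3) & 0b111
        # base=101 with mod=00 means "no base, disp32 follows";
        # index=100 means "no index". Together: plain absolute [disp32].
        if base == 0b101 and index == 0b100:
            return "absolute disp32", opcode_len + 2

    raise ValueError("not a disp32-only operand form")


ml64_bytes = bytes.fromhex("8b0500000000")    # mov eax, dword ptr [rip]
llvm_bytes = bytes.fromhex("8b042500000000")  # mov eax, dword ptr [0]

print(classify_disp32_operand(ml64_bytes))
print(classify_disp32_operand(llvm_bytes))
```

The displacement offsets this reports (2 for the ModRM-only form, 3 for the ModRM+SIB form) line up with the relocation offsets shown for the REL32 listing above and the ADDR32 listing below.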

On Tue, Jan 21, 2020 at 5:41 PM Eric Astor <epastor at google.com> wrote:

> Apologies - I apparently remembered part of the issue incorrectly, so this
> ended up quite confusing. The problem comes when referencing labels in a
> different section of the binary. To clarify, if I assemble the code:
>
> .data
> foo BYTE 5
> .code
> mov eax, foo
>
> with Microsoft's ml64.exe, it emits an object file disassembling to:
>
>        0:       8b 05 00 00 00 00       mov     eax, dword ptr [rip]
>                 000000000000000b:  IMAGE_REL_AMD64_REL32        foo
>
> On the other hand, if I use my current local draft of llvm-ml, I get a
> different result. I actually get the same result as I do for llvm-mc, using
> the corresponding code:
>
> .data
> foo:
> .byte 5
> .text
> .intel_syntax
> mov eax, foo
>
> Either way, LLVM emits an object file with disassembly (and relocation) as
> follows:
>
>        0:       8b 04 25 00 00 00 00    mov     eax, dword ptr [0]
>                 0000000000000003:  IMAGE_REL_AMD64_ADDR32       foo
>
> To replicate the results from ml64.exe with LLVM, I instead need to use
>
> mov eax, [foo + rip]
>
> in place of mov eax, foo. At least when building with llvm-ml, we need to
> mimic ml64.exe's approach; a reference to a symbol in another section should
> use the relative addressing mode.
>
> My first attempt to fix this was very clumsy - when in MASM mode, I forced
> all expressions without a base register to presume RIP. Unfortunately, that
> breaks any attempt to use "jcc", since it turns label references into
> absolute memory references with a base register (and the "jcc" family
> doesn't accept absolute memory operands). Any suggestions for how I can fix
> the issue described here without breaking "jcc"?
>
> On Tue, Jan 21, 2020 at 3:43 PM Eli Friedman <efriedma at quicinc.com> wrote:
>
>> All immediate jump instructions on x86 (call/jmp/jcc) have a relative
>> offset operand.  The destination is, in some sense, “rip-relative”, but we
>> don’t represent it like that in LLVM.  If you look at the TableGen
>> descriptions, jumps use brtarget32, and calls use i32imm_pcrel.  In both
>> Microsoft and GNU assembly syntax, this is something like “call baz”.
>>
>>
>>
>> “call”/“jmp” also have a register/memory form, for indirect calls.  In
>> 64-bit, this allows rip-relative references, to call a function pointer
>> stored in a global variable.  In Microsoft assembly syntax, this is “call
>> QWORD PTR baz”. In GNU assembly syntax, this is “call *baz(%rip)”.
>>
>>
>>
>> For 64-bit x86, any reference to a global has to be a rip-relative
>> address (since all 64-bit programs are position-independent), but on 32-bit
>> x86, it’s also possible to refer to the address of a variable using
>> something like “add eax, OFFSET baz”.
>>
>>
>>
>> For globals which are explicitly labeled “PTR” or “OFFSET”, the correct
>> representation should be unambiguous, and it should be easy to print
>> appropriate error messages.  For other cases, I’m not sure what the
>> inference rules are.  It might vary depending on the opcode.
>>
>>
>>
>> -Eli
>>
>>
>>
>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Eric
>> Astor via llvm-dev
>> *Sent:* Monday, January 20, 2020 6:26 PM
>> *To:* LLVM-dev <llvm-dev at lists.llvm.org>
>> *Subject:* [EXT] [llvm-dev] MASM & RIP-relative addressing
>>
>>
>>
>> Hi all,
>>
>>
>>
>> Continuing work on llvm-ml (a MASM assembler)... and my latest obstacle
>> is in enabling MASM's convention that (unless specified) all memory
>> location references should be RIP-relative. Without it, we emit the wrong
>> instructions for "call", "jmp", etc., and anything we build fails at the
>> linking stage.
>>
>>
>>
>> My best attempt at this so far is a small patch to X86AsmParser.cpp -
>> just taking any Intel expression with no specified base register and
>> switching it to use RIP - and this works alright. There's at least one
>> exception: it breaks the "jcc" instructions, at least "jcc <label>". The
>> issue seems to be that the "jcc" family exclusively takes a relative
>> offset, never an absolute reference... so adding a base register causes the
>> operand not to match. ("jcc" is always RIP-relative anyway.)
>>
>>
>>
>> I'm not very familiar with the operand-matching logic, and am still
>> pretty new to LLVM as a whole. Are there more X86 instructions this will
>> interact badly with? Any thoughts on how this could be handled better?
>>
>>
>>
>> If this is mostly a valid approach, might there be a way to change the
>> operand type of "jcc" to accept offset(base) operands, as long as base ==
>> X86::RIP, then ignore the RIP bit?
>>
>>
>>
>> Thanks,
>>
>> - Eric
>>
>