[llvm-dev] Proposal: arbitrary relocations in constant global initializers

Eric Christopher via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 18 12:46:44 PDT 2016


To the right list this time.

On Tue, Oct 18, 2016 at 12:43 PM Eric Christopher <echristo at gmail.com>
wrote:

> Hi Peter,
>
> Coming back to this now.
>
>
> IFCC, the previous attempt to teach LLVM to emit jump tables, was removed
> for complicating how functions are emitted, in particular requiring a
> subtarget-specific instruction emitter available in subtarget-independent
> code. However, the form of a jump table entry is generally well known to
>
>
> In general I think we can handle the subtarget specific aspect in the same
> way that we handle module level inline assembly. Anything at that object
> file level needs to be generic enough for the STI we create there anyhow
> and should work for your needs in creating a jump table.
>
> How would you create your jump tables if you were able to generate code in
> this fashion?
>
> Alternately, (though I'm not a huge fan) we could create them using inline
> assembly as a workaround to get this aspect of your code moving forward.
>
> I would very much like to avoid doing things like encoding relocation
> entries into the IR - it seems to be the wrong level to handle that type of
> target specific information. I worry that it will create issues with the
> folk that are trying to move us to a level where we can delete the IR at
> code generation time as well. I've added Jim since I think his team is
> looking into that. We might want an MIR level ability to encode jump
> tables/constants.
>
> Thoughts?
>
> -eric
>
>
> whichever component of the compiler is creating the jump table (for
> example, it needs to know the size of each entry, and therefore the
> specific instructions used), and we can therefore simplify things greatly
> by not considering jump tables as consisting of instructions, but rather
> known strings of bytes in the .text section with a relocation pointing to
> the function address. For example, on x86:
>
> $ cat tc.ll
> declare void @foo()
>
> define void @bar() {
>   tail call void @foo()
>   ret void
> }
> $ ~/src/llvm-build-rel/bin/llc -filetype=obj -o - tc.ll -O3 \
>     | ~/src/llvm-build-rel/bin/llvm-objdump -d -r -
> <stdin>:        file format ELF64-x86-64
>
> Disassembly of section .text:
> bar:
>        0:       e9 00 00 00 00  jmp     0 <bar+5>
>                 0000000000000001:  R_X86_64_PC32        foo-4-P
>
>
>
> Or on ARM:
>
> $ ~/src/llvm-build-rel/bin/llc -filetype=obj -o - tc.ll -O3 \
>     -mtriple=armv7-unknown-linux \
>     | ~/src/llvm-build-rel/bin/llvm-objdump -d -r -
>
> <stdin>:        file format ELF32-arm-little
>
> Disassembly of section .text:
> bar:
>        0:       fe ff ff ea     b       #-8 <bar>
>                         00000000:  R_ARM_JUMP24 foo
>
>
> How can we represent such jump table entries in IR? One way that almost
> works on x86 is to attach a constant to a function using either prefix data
> or prologue data, or to place a GlobalVariable in the .text section using
> the section attribute. The constant would use ConstantExpr arithmetic to
> produce the required PC32 relocation:
>
> define void @bar() prefix <{ i8, i32, i8, i8, i8 }> <{
>   i8 -23,
>   i32 trunc (i64 add (i64 sub (i64 ptrtoint (void ()* @foo to i64),
>                                i64 ptrtoint (void ()* @bar to i64)),
>                       i64 3) to i32),
>   i8 -52, i8 -52, i8 -52 }> {
>   ...
> }
>
> However, this is awkward, and can’t be used to represent an ARM jump table
> entry. (It also isn’t quite right; PC32 can trigger the creation of a
> PLT entry, which doesn’t entirely match what the ConstantExpr arithmetic
> is doing.)
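
For anyone checking the arithmetic in the prefix-data constant above, here is a minimal Python sketch. It assumes the packed 8-byte constant sits immediately before @bar, so the jmp starts at bar - 8 and its rel32 is measured from bar - 3. The function name and the addresses in the usage note are made up for illustration.

```python
import struct


def x86_jump_entry(foo_addr: int, bar_addr: int) -> bytes:
    """Build the 8-byte entry: jmp rel32 followed by three int3 bytes."""
    # Assumption: prefix data ends exactly where @bar begins.
    entry_addr = bar_addr - 8
    # rel32 is relative to the end of the 5-byte jmp instruction.
    next_insn = entry_addr + 5
    disp = foo_addr - next_insn
    # This is the "foo - bar + 3" in the ConstantExpr.
    assert disp == foo_addr - bar_addr + 3
    # 0xe9 is i8 -23 (jmp rel32); 0xcc is i8 -52 (int3 padding).
    return struct.pack("<Bi", 0xE9, disp) + b"\xcc" * 3
```

With foo at 0x401000 and bar at 0x402000, the displacement is -4093, so the entry comes out as e9 03 f0 ff ff cc cc cc.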
>
> Design
>
> A relocation can be seen as having three inputs: the relocation type (on
> Mach-O this also includes a pcrel flag), the target, and the addend. So
> let’s define a relocation constant like this:
>
> iNN reloc relocation_type (ptr target, iNN addend)
>
> where iNN is some integer type, and ptr is some pointer type. For example,
> an ARM jump table entry might look like this:
>
> i32 reloc 0x1d (void ()* @foo, i32 0xeafffffe)  ; R_ARM_JUMP24 = 0x1d
>
> There is no error checking for this; if you use the wrong integer type for
> a particular relocation, things will break and you get to keep both pieces.
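
To make the ARM example concrete, here is a rough Python sketch of how a linker might resolve R_ARM_JUMP24 against the addend 0xeafffffe. It assumes the usual ARM branch encoding (low 24 bits are a signed word offset, PC reads as the instruction address plus 8) and ignores the Thumb bit and range checks. The function name is hypothetical.

```python
def apply_arm_jump24(insn: int, target: int, place: int) -> int:
    """Patch the imm24 field of an ARM B/BL instruction word."""
    imm24 = insn & 0x00FFFFFF
    # Sign-extend the 24-bit field.
    if imm24 & 0x800000:
        imm24 -= 1 << 24
    # The implicit addend is the encoded offset in bytes (-8 for 0xeafffffe),
    # which cancels the PC+8 bias of the branch.
    addend = imm24 << 2
    value = target + addend - place
    assert value % 4 == 0, "ARM branch targets are word aligned"
    # Keep the condition and opcode bits, replace the offset.
    return (insn & 0xFF000000) | ((value >> 2) & 0x00FFFFFF)
```

Resolving the entry against its own address gives back 0xeafffffe, the same `b` to itself that the objdump output shows before relocation.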
>
> At the asm level, we would add a single directive, ".reloc", whose syntax
> would look like this when targeting ELF and COFF:
>
> .reloc size relocation_type target addend
>
> or this when targeting Mach-O:
>
> .reloc size relocation_type pcrel target addend
>
> The code generator would emit this directive when emitting a reloc in a
> constant initializer. (Note that this means that reloc constants would only
> be supported with the integrated assembler.)
>
> For example, the ARM JUMP24 relocation would look like this:
>
> .reloc 4 0x1d foo 0xeafffffe
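
As a sketch of what the code generator side could look like, the Python helper below formats the proposed directive from the three relocation inputs. The helper name is hypothetical, and printing the Mach-O pcrel flag as 0 or 1 is my assumption; the proposal does not say how it would be spelled.

```python
from typing import Optional


def format_reloc_directive(size: int, reloc_type: int, target: str,
                           addend: int, pcrel: Optional[bool] = None) -> str:
    """Format the proposed .reloc directive (not an existing assembler API)."""
    fields = [str(size), hex(reloc_type)]
    # Mach-O variant only: the extra pcrel field (assumed to print as 0 or 1).
    if pcrel is not None:
        fields.append(str(int(pcrel)))
    fields += [target, hex(addend)]
    return ".reloc " + " ".join(fields)
```

Calling it with (4, 0x1d, "foo", 0xeafffffe) reproduces the ARM line quoted above.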
>
> We would need to add some mechanism for the assembler to evaluate
> relocations in case the symbol is locally defined and not exported. For
> that reason, we can start with a small set of supported "internal"
> relocations and expand as needed.
>
> What about constant propagation?
>
> We do not want reloc constants to appear in functions' IR, or to be
> propagated out of global initializers that use them. The simplest solution
> to this problem is to only allow reloc constants in constant initializers
> where we cannot/do not currently perform constant propagation, i.e.
> function prologue data, prefix data and constants with weak linkage. This
> could be enforced by the verifier. Later we can consider relaxing this
> constraint as needed.
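
The restriction described above could be modelled roughly as follows. This is a toy Python model, not LLVM verifier code: the enum, the function name and the choice to accept only the linkage string "weak" are all assumptions made for illustration.

```python
from enum import Enum


class Context(Enum):
    PROLOGUE_DATA = "prologue data"
    PREFIX_DATA = "prefix data"
    GLOBAL_INITIALIZER = "global initializer"
    FUNCTION_BODY = "function body"


def reloc_constant_allowed(context: Context, linkage: str = "") -> bool:
    """Return True where the proposal would permit a reloc constant."""
    # Prologue and prefix data are never constant-propagated.
    if context in (Context.PROLOGUE_DATA, Context.PREFIX_DATA):
        return True
    # Global initializers qualify only with weak linkage (assumed spelling).
    if context is Context.GLOBAL_INITIALIZER:
        return linkage == "weak"
    # Anything else, including function bodies, is rejected.
    return False
```

Which linkage kinds count as "weak" for this purpose would have to be pinned down in the real check.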
>
> Other uses
>
> Relocation constants could be used for other purposes by frontends. For
> example, a frontend may need to represent some other kind of
> custom/specific instruction sequence in IR, or to create arbitrary kinds
> of references between objects where that may be beneficial (for example,
> -fsanitize=function may use this facility to create GOTOFF relocations in
> function prologues to avoid creating dynamic relocations in the .text
> section to fix PR17633).
>
> Thanks,
> --
> Peter
>
> [1] http://www.pcc.me.uk/~peter/acad/usenix14.pdf
>
>