[PATCH] D34409: Use 64bit jump table with large code model on 64bit

Wed Jun 21 07:20:10 PDT 2017

joerg added a comment.

In https://reviews.llvm.org/D34409#786482, @yuyichao wrote:

> > If the jump table is writable
>
> FWIW, that also sounds like a hack and a (minor?) security issue.

It's no better or worse than the GOT.

>> It would be really better to work on the real fix and not add more hacks...
> 
> And what exactly do you mean by "the real fix". I believe this is a reasonable generic fallback before an optimization is implemented for a particular arch and given that this is exactly what GCC does on x86-64 I think this is (one of) the correct solution on x86-64 too.
>  On AArch64, GCC throws an error, which I think is much better than silently generating the wrong code....

There are three options given here so far:
(1) Use the plain block address. Requires replacing the default getSectionForJumpTable and getJumpTableEncoding. Change is localized to the affected architectures.
(2) Introduce a whole new generic 64bit label difference. Non-localized infrastructure change.
(3) Properly switch to function-relative 32bit labels. Change is localized to the affected architecture or at least support glue.

In terms of code overhead for the access, (1) is strictly the shortest: a plain indirect branch to an indexed memory location. (2) needs a pointer load + offset computation, (3) needs a pointer load + PC-relative offset computation. As such, (2) and (3) are often roughly equal in access cost. (1) and (2) require the same amount of memory for the jump table, making (2) not very attractive when relocations themselves are ephemeral. (3) saves significant amounts of space for any non-trivial jump table.
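
To make the size and access trade-off concrete, here is a rough C++ model of the three table encodings. It is only a sketch: code addresses are modelled as plain 64-bit integers, and all type and function names (AbsoluteTable, Diff64Table, Rel32Table, makeAbsolute, makeDiff64, makeRel32) are hypothetical, not LLVM APIs. The base label used for (2) is likewise an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical model: a code address is just a 64-bit integer.
using Addr = std::uint64_t;

// (1) Plain block addresses: the entry is the branch target itself.
struct AbsoluteTable {
  std::vector<Addr> entries;
  Addr target(std::size_t i) const { return entries[i]; }
  std::size_t bytes() const { return entries.size() * sizeof(Addr); }
};

// (2) 64-bit label differences: the entry is (target - base label).
struct Diff64Table {
  Addr base;  // assumed base label, e.g. the table's own address
  std::vector<std::int64_t> entries;
  Addr target(std::size_t i) const {
    return base + static_cast<Addr>(entries[i]);
  }
  std::size_t bytes() const { return entries.size() * sizeof(std::int64_t); }
};

// (3) Function-relative 32-bit labels: the entry is (target - function start).
struct Rel32Table {
  Addr functionBase;
  std::vector<std::int32_t> entries;
  Addr target(std::size_t i) const {
    // Sign-extend the 32-bit offset before adding it to the base.
    return functionBase + static_cast<Addr>(static_cast<std::int64_t>(entries[i]));
  }
  std::size_t bytes() const { return entries.size() * sizeof(std::int32_t); }
};

AbsoluteTable makeAbsolute(const std::vector<Addr> &targets) {
  return AbsoluteTable{targets};
}

Diff64Table makeDiff64(Addr base, const std::vector<Addr> &targets) {
  Diff64Table t{base, {}};
  for (Addr a : targets)
    t.entries.push_back(static_cast<std::int64_t>(a - base));
  return t;
}

Rel32Table makeRel32(Addr functionBase, const std::vector<Addr> &targets) {
  Rel32Table t{functionBase, {}};
  for (Addr a : targets) {
    std::int64_t off = static_cast<std::int64_t>(a - functionBase);
    // Only valid while every block is within +/-2GiB of the function start.
    assert(off >= std::numeric_limits<std::int32_t>::min() &&
           off <= std::numeric_limits<std::int32_t>::max());
    t.entries.push_back(static_cast<std::int32_t>(off));
  }
  return t;
}
```

In this model (1) resolves a target with a single indexed load, while (2) and (3) each add a base to the loaded entry. (1) and (2) both spend 8 bytes per entry and (3) spends 4, which is the space saving referred to above.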

Note that GCC is quite different as it often will not create a separate jump table section. That's also an option [(4)] supported by LLVM with some overrides and it will work for large code model at the expense of making more static data executable.

Based on all that, I do not consider the complexity of (2) justified at all for a short-term workaround of target-specific limitations. (1) and (4) are easier and create faster code. (3) is the preferred implementation for 64bit platforms as it minimizes the size of executable code and total binary size at the expense of a slightly more complex access sequence. The only reason why it isn't implemented for AArch64 and X86_64 yet is the necessary function-specific base address.

https://reviews.llvm.org/D34409