[PATCH] D69434: WIP: ExecutionEngine: add preliminary support for COFF ARM64

Tue Oct 29 14:43:01 PDT 2019

mstorsjo added inline comments.

================
Comment at: lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h:169
+      // therefore implicit (<< 12).
+      Addend = ((*p & 0x60000000) >> 29) | ((*p & 0x01FFFFE0) >> 3) << 12;
+      Addend = SignExtend64(Addend, 33);
----------------
kaadam wrote:
> mstorsjo wrote:
> > No, this is wrong - and this is one of the less obvious details.
> > 
> > If you have an addend stored in the instruction pointed to by `IMAGE_REL_ARM64_PAGEBASE_REL21`, the addend is expressed in bytes, not in 4096 byte pages.
> > 
> > Consider you have a symbol close to the end of a page, and you want to express an offset by a few bytes (less than a page), that makes the pointed to location in another page.
> > 
> > If the addend would express a number of pages (as this patch expects right now), the addend here would be zero, and you'd end up with this part of the instruction pair pointing at the wrong page.
> > 
> > Therefore, the immediate stored in the instruction before handling relocation is expressed as a number of bytes, even though it means a number of pages after the relocation is done and the instruction is executed.
> Thanks for explaining this. It seems I misunderstood this behavior. To be honest I looked at how MachOAArch64 does, how they decode the addend, I compared to aarch64 reference manual to see how it works, it seemed to me I can do the same thing as MachO does.
I guess it differs a bit between how the different object file formats encode the relocations and symbol offsets. (IIRC ELF and MachO can say that a relocation is relative to a symbol with offset, while COFF only points at a symbol, and any offset must be applied via the instruction immediates.)

================
Comment at: lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h:188
+      int ImplicitShift = 0;
+      if ((*p & 0x3B000000) == 0x39000000) { // << load / store
+        // For load / store instructions the size is encoded in bits 31:30.
----------------
kaadam wrote:
> mstorsjo wrote:
> > I guess this check could also be for the relocation type `IMAGE_REL_ARM64_PAGEOFFSET_12L`?
> It is just for IMAGE_REL_ARM64_PAGEOFFSET_12L, we need to determine the shift value only for load/store, since for instructions ADD/ADDS (immediate) we should use zero shift.
Yes, but the relocation names already imply this. The `L` suffixed relocation is used for loads/stores, and the `A` suffixed relocation is used for add instructions. For `IMAGE_REL_ARM64_PAGEOFFSET_12L` we do not need to check whether the instruction is a load/store, but we can go directly to reading out the implicit shift amount, and for `IMAGE_REL_ARM64_PAGEOFFSET_12A` we should not read any implicit shift amount at all.

See lld/COFF/Chunks.cpp, SectionChunk::applyRelARM64. For IMAGE_REL_ARM64_PAGEOFFSET_12L we call applyArm64Ldr which reads out the shift amount and then calls applyArm64Imm, while applyArm64Imm is called directly for IMAGE_REL_ARM64_PAGEOFFSET_12A.

In general, when a linker (either dynamic or static) resolves a relocation, it should seldom need to inspect the instruction it is applied on, even though it is needed here for reading out the implicit shift amount. In general, the relocation type just encodes a specific action that should be done on that memory location with very little extra logic.

================
Comment at: lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h:243
+      case COFF::IMAGE_REL_ARM64_ADDR64: {
+        or32le(Target + 12, ((Value + RE.Addend) & 0xFFFF) << 5);
+        or32le(Target + 8, ((Value + RE.Addend) & 0xFFFF0000) >> 11);
----------------
kaadam wrote:
> mstorsjo wrote:
> > This doesn't seem right. `IMAGE_REL_ARM64_ADDR64` is a plain 64 bit integer (just like `IMAGE_REL_ARM64_ADDR32NB`), not a series of instructions that should get immediates added.
> Yes, you're right. Currently I handle a long branch instruction with this relocation, so when I needed to create stub function (which generates movz/movk instruction), I  just set this relocation to detect this case when we have an external symbol. When I tested it with a small examples and Swithshader It worked because I haven't got this relocation type. Of course I should use another one, maybe an internal type.
> What do you think?
Hmm, as I'm not familiar with those bits that generate it, I don't know for sure. As you're already setting a small code model for aarch64/win, doesn't that already achieve this?

Otherwise some variant of adrp+add would normally be used for forming any arbitrary address, instead of a series of mov instructions, unless the value to be formed is a constant.

If necessary I guess one could consider using private internal relocation types, but I'd at least defer those bits to a later patch where it can be discussed properly on its own, and keep this first for the official relocation types.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69434/new/

https://reviews.llvm.org/D69434