[PATCH] D69434: WIP: ExecutionEngine: add preliminary support for COFF ARM64

Tue Nov 5 09:42:47 PST 2019

kaadam marked an inline comment as done.
kaadam added inline comments.

================
Comment at: lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFAArch64.h:169
+      // therefore implicit (<< 12).
+      Addend = ((*p & 0x60000000) >> 29) | ((*p & 0x01FFFFE0) >> 3) << 12;
+      Addend = SignExtend64(Addend, 33);
----------------
mstorsjo wrote:
> kaadam wrote:
> > mstorsjo wrote:
> > > kaadam wrote:
> > > > mstorsjo wrote:
> > > > > No, this is wrong - and this is one of the less obvious details.
> > > > > 
> > > > > If you have an addend stored in the instruction pointed to by `IMAGE_REL_ARM64_PAGEBASE_REL21`, the addend is expressed in bytes, not in 4096 byte pages.
> > > > > 
> > > > > Consider you have a symbol close to the end of a page, and you want to express an offset by a few bytes (less than a page), that makes the pointed to location in another page.
> > > > > 
> > > > > If the addend would express a number of pages (as this patch expects right now), the addend here would be zero, and you'd end up with this part of the instruction pair pointing at the wrong page.
> > > > > 
> > > > > Therefore, the immediate stored in the instruction before handling relocation is expressed as a number of bytes, even though it means a number of pages after the relocation is done and the instruction is executed.
> > > > Thanks for explaining this. It seems I misunderstood this behavior. To be honest I looked at how MachOAArch64 does, how they decode the addend, I compared to aarch64 reference manual to see how it works, it seemed to me I can do the same thing as MachO does.
> > > I guess it differs a bit between how the different object file formats encode the relocations and symbol offsets. (IIRC ELF and MachO can say that a relocation is relative to a symbol with offset, while COFF only points at a symbol, and any offset must be applied via the instruction immediates.)
> > Okay, I see now. 
> > By the way I don't necessary need to decode addend in processRelocationRef function, it could be encoded before the relocation applied, am I right? Which is better? Since the inst contains the offset in the immediate part, so it have to be considered (add the offset to 'Value') right before immediate is rewritten.
> > I update the change later today. 
> > 
> > 
> > 
> I presume you meant "it could be decoded before the relocation applied", not encoded?
> 
> Yes, you could do the decode+update+encode all in one step. When reviewing I noted that this followed such a two-step style, but I presumed this came from general RuntimeDyld design. In lld it's all done in one single function. I'm not familiar with RuntimeDyld to say if there's any specific needs here for it to be this way, but if other parts of RuntimeDyld does it this way (in particular, other COFF architectures) it might be good to match the style.
Yes, that's what I meant. Thanks for your suggestion. Yes, I follow that style, since RuntimeDyld design follows a two-step style, but as far as I see there is no any style convention, for example in COFF Thumb a few instruction's addend are decoded, most of them are not. I don't see any significance to be this way, but I would like to do a clean approach for this.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69434/new/

https://reviews.llvm.org/D69434