[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

Jacek Caban via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 17 03:50:29 PDT 2023


jacek added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64MCInstLower.cpp:51-54
+    // For ARM64EC, symbol lookup in the MSVC linker has limited awareness
+    // of ARM64EC mangling ("#"/"$$h"). So object files need to refer to both
+    // the mangled and unmangled names of ARM64EC symbols, even if they aren't
+    // actually used by any relocations. Emit the necessary references here.
----------------
efriedma wrote:
> jacek wrote:
> > I think that the mangled weak symbol should link to another kind of exit thunk, not to the unmangled symbol directly.
> > 
> > When an extern code symbol is resolved by an unmangled name, it means that we have an ARM64EC->X64 call and we need to use an exit thunk. The linker doesn't need special logic for that: on MSVC it seems to be achieved by pointing the antidependency symbol to yet another exit thunk. That other exit thunk loads the target exit thunk and the X64 (unmangled) symbol into x10 and x11 and uses __os_arm64x_dispatch_icall to call the emulator. I guess we may worry about that later, but in that case maybe it's better not to emit the mangled antidependency symbol at all for now?
> From what I recall, I wrote the code here primarily to deal with issues trying to take the address of a function; even if the function is defined in arm64ec code, the address is the unmangled symbol.  So there's some weirdness there involving symbol lookup even before there's any actual x64 code involved.
> 
> If you have a better idea of how external symbol references are supposed to be emitted, I'd appreciate a brief writeup, since all the information I have comes from trying to read dumpbin dumps of MSVC output.
Yes, when EC code takes the address of a function, it needs to use the unmangled symbol, because the mangled symbol may point to a thunk (when the implementation is in X64 or __declspec(hybrid_patchable) is used) and we still want the address of the real implementation. Here is an example of how MSVC sets it up:

$ cat test.c
extern int otherfunc(void);
int myfunc(void) { return otherfunc(); }
$ llvm-readobj --symbols test.obj
...
  Symbol {
    Name: #otherfunc$exit_thunk
    Value: 0
    Section: .wowthk$aa (4)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  }
...
  Symbol {
    Name: #otherfunc
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #otherfunc$exit_thunk (14)
      Search: AntiDependency (0x4)
    }
  }
  Symbol {
    Name: #myfunc
    Value: 0
    Section: .text$mn (3)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  }
  Symbol {
    Name: otherfunc
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #otherfunc (19)
      Search: AntiDependency (0x4)
    }
  }
  Symbol {
    Name: myfunc
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #myfunc (21)
      Search: AntiDependency (0x4)
    }
  }

In this example, myfunc links to #myfunc (and similarly otherfunc -> #otherfunc). That's enough because even if we pass a pointer to #myfunc to actual X64 code, which then tries to call that address, the emulator will have a chance to figure it out and use the entry thunk as needed.
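To make the naming in the dump concrete, here is a minimal sketch of how the two mangled names relate to a plain C function name. The helper names ec_mangled_name and ec_exit_thunk_name are hypothetical and exist only for illustration; the patterns are read off the dump above ("#myfunc", "#otherfunc$exit_thunk") and cover plain C names only, not the "$$h" form used for C++ names.

```c
#include <stdio.h>

/* Hypothetical helper: write the ARM64EC-mangled form of a plain C symbol,
   i.e. the "#" prefix seen in "#myfunc" in the dump.
   Returns 0 on success, -1 if the buffer is too small. */
static int ec_mangled_name(char *out, size_t cap, const char *name)
{
    int n = snprintf(out, cap, "#%s", name);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Hypothetical helper: write the exit thunk symbol name in the form seen
   in the dump ("#otherfunc$exit_thunk"). Same return convention. */
static int ec_exit_thunk_name(char *out, size_t cap, const char *name)
{
    int n = snprintf(out, cap, "#%s$exit_thunk", name);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

With these, "otherfunc" maps to "#otherfunc" for the mangled weak symbol and to "#otherfunc$exit_thunk" for the thunk it falls back to.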

However, #otherfunc points to #otherfunc$exit_thunk (which then references otherfunc and $iexit_thunk$cdecl$i8$v in its implementation). If otherfunc is implemented in ARM64EC, the #otherfunc symbol will be replaced by the definition from the object file implementing it. If it is not replaced, the symbol has been resolved to an X64 implementation, and the linked thunk will be used to call it.
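The fallback behaviour described above can be illustrated with a toy model. This is only a sketch of my reading of the weak-external behaviour, not how any real linker is implemented; struct sym, resolve and call_target are hypothetical names, and the model ignores everything except "defined" versus "fall back to the Linked symbol".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a COFF symbol: either defined by some object file, or an
   undefined weak external carrying a fallback ("Linked") symbol. */
struct sym {
    const char *name;
    int defined;               /* nonzero if some object file defines it */
    const struct sym *linked;  /* weak-external fallback, or NULL */
};

/* Follow weak-external fallbacks until a defined symbol is reached.
   Returns NULL if the chain ends without a definition. */
static const struct sym *resolve(const struct sym *s)
{
    /* Bound the walk so a malformed cyclic chain cannot loop forever. */
    for (int hops = 0; s != NULL && hops < 8; ++hops) {
        if (s->defined)
            return s;
        s = s->linked;
    }
    return NULL;
}

/* Name that a call to #otherfunc ends up at, depending on whether an
   ARM64EC object file defines #otherfunc. */
static const char *call_target(int arm64ec_impl_present)
{
    struct sym thunk = { "#otherfunc$exit_thunk", 1, NULL };
    struct sym mangled = { "#otherfunc", arm64ec_impl_present, &thunk };
    const struct sym *r = resolve(&mangled);
    assert(r != NULL); /* the thunk is always defined in this model */
    return r->name;
}
```

In this model, call_target(1) yields "#otherfunc" (an ARM64EC definition replaces the weak symbol), while call_target(0) yields "#otherfunc$exit_thunk" (no ARM64EC definition, so the call goes through the exit thunk to the X64 implementation).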

My information is based on experimentation with MSVC as well, so these are only my best guesses. I mostly experimented with it in the context of linker support, and I have those aspects of the linker working in my tree. I will work on a more comprehensive description of these things.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547


