[lld] [LLD][COFF] Add support for including native ARM64 objects in ARM64EC images (PR #137653)

Mon May 12 07:14:14 PDT 2025

cjacek wrote:

> However I'm not sure I understand the full context of this change. Is this input case something that we do expect that we'd ever want to do? (In my understanding of arm64ec/arm64x use cases, I don't see when we'd ever want to do this?) Doesn't this just cause a quite confusing situation where you can link in object files which do end up in the final image, but which are pretty much inactive (not referenced, not used at runtime at all)? As I don't see the real use case, I also feel a bit more hesitant about a change like this which is yet another small step towards a more complex internal linker state.

As I mentioned in the description, I think the usefulness of this feature is questionable, but it's how MSVC behaves. I encountered it a few times during testing, mainly because most linker-related aspects of EC are undocumented. I had to resort to throwing various test cases at MSVC to see how it behaves, and ideally, infer the intent behind that behavior.

The most recent case where I observed this was while investigating the undocumented `-arm64xsameaddress` option for issue #131712. One hypothesis was that it allows referencing native aarch64 code from EC code. I now know that’s not the case: `-arm64xsameaddress` is a no-op on EC targets. It only has meaning for hybrid images, where it replaces function symbols in both symbol tables with a single thunk that jumps to the actual implementation (so the function has the same address in both views). In that case, additional ARM64X relocations are emitted so that the thunk jumps to the EC variant in the EC view. On pure EC targets, such a thunk can also be emitted if you pass a native aarch64 object file, but it’s not particularly useful there.

Another situation where I encountered native objects was with import libraries. ARM64EC import libraries use native object files for import descriptors. This doesn’t affect LLD much since we synthesize them anyway, but MSVC appears to rely on them (see #84834), so its linker needs to support native object files.

I don’t have a strong real-world example that requires this, and in fact, using native objects with code is probably almost always a bad idea. One acceptable case might be resource object files. Hypothetically, someone could use an aarch64 build system to build ARM64EC binaries by passing something like `CFLAGS=-arm64EC`. That wouldn't affect `cvtres`, so its output would remain a native aarch64 object. MSVC handles that fine, but current LLD would reject it.
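To make the resource-object case concrete, here is a rough Python sketch of how an input's target can be told apart from its COFF header. The machine constants come from the PE/COFF specification; `coff_machine` and `accepted_in_ec_link` are hypothetical helpers written for illustration only, not LLD code, and the policy flag merely models "reject native objects" versus "accept them as MSVC does". It assumes a regular COFF object, whose first 16-bit field is `Machine`; bigobj and short import headers start differently and are not handled.

```python
import struct

# Machine values from the PE/COFF specification.
IMAGE_FILE_MACHINE_AMD64 = 0x8664    # x86_64
IMAGE_FILE_MACHINE_ARM64 = 0xAA64    # native aarch64
IMAGE_FILE_MACHINE_ARM64EC = 0xA641  # ARM64EC


def coff_machine(data: bytes) -> int:
    """Return the Machine field (first little-endian u16) of a regular COFF object."""
    if len(data) < 2:
        raise ValueError("input too short to contain a COFF header")
    return struct.unpack_from("<H", data, 0)[0]


def accepted_in_ec_link(machine: int, allow_native_arm64: bool) -> bool:
    """Hypothetical policy check for an ARM64EC link (illustration only).

    ARM64EC and x86_64 objects are always accepted. Native aarch64 objects
    are accepted only when allow_native_arm64 is set, which models the
    MSVC-compatible behavior discussed in this thread.
    """
    if machine in (IMAGE_FILE_MACHINE_ARM64EC, IMAGE_FILE_MACHINE_AMD64):
        return True
    return machine == IMAGE_FILE_MACHINE_ARM64 and allow_native_arm64


# A stand-in for the first bytes of a native aarch64 resource object.
res_obj = struct.pack("<H", IMAGE_FILE_MACHINE_ARM64)
machine = coff_machine(res_obj)
print(hex(machine), accepted_in_ec_link(machine, allow_native_arm64=False))
print(hex(machine), accepted_in_ec_link(machine, allow_native_arm64=True))
```

With the flag off, the native resource object is refused even though it contains no code; with it on, the same object is accepted alongside the EC inputs.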

That said, I don’t have a strong reason why we need to support this. Now that we have ARM64X support, it was easy to implement when I came across it again. My thought was that it might be a good time to finalize this part of the code while it’s still fresh. But if you don’t think it’s worthwhile, I will skip it.

> I'm also a little curious about how you managed to do this change, to check all changes (e.g. regarding delay imports), as we don't really exercise this new case in any of the existing tests? But I guess the fact that you always create two symbol tables even if we essentially never use one of them, causes us to force testing all codepaths in the existing tests anyway...

I grepped the source code for all symbol table references, adjusted them as needed, and reviewed their associated test cases where appropriate. Then I did the reverse: I went through all ARM64X test cases to identify ones where it made sense to add an ARM64EC variant.

Delay-load is covered by `arm64ec-delayimport.test`. If we didn’t skip the native part when it’s missing, the RVAs would shift due to the presence of null chunks. Also, without adjusting how `__delayLoadHelper2` is handled, LLD would either crash or fail to locate the helper, depending on whether we tried to pull it in.
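The RVA shift can be illustrated with a toy layout model. This is only a sketch: the chunk names and sizes below are made up for illustration, and `assign_rvas` is a hypothetical helper, not how LLD actually lays out its delay-import chunks. The point is just that any placeholder emitted for an empty native table moves everything placed after it.

```python
def assign_rvas(chunks, base_rva):
    """Place (name, size) chunks back to back and return a name -> RVA map."""
    rvas = {}
    rva = base_rva
    for name, size in chunks:
        rvas[name] = rva
        rva += size
    return rvas


# Hypothetical EC-only delay-import chunks (names and sizes are illustrative).
ec_chunks = [("ec_delay_dir", 32), ("ec_dir_terminator", 32), ("ec_addr_table", 16)]

# Same chunks, preceded by a null placeholder for a native table with no entries.
with_native_null = [("native_null", 32)] + ec_chunks

skipped = assign_rvas(ec_chunks, 0x1000)
not_skipped = assign_rvas(with_native_null, 0x1000)

for name, _ in ec_chunks:
    # Every EC chunk moves by the size of the placeholder when it is kept.
    print(name, hex(skipped[name]), hex(not_skipped[name]))
```

In this model every EC chunk ends up 32 bytes later when the empty native part is not skipped, which is the kind of difference that would show up as changed RVAs in an existing test's expected output.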

https://github.com/llvm/llvm-project/pull/137653