[lld] [LLD][COFF] Allow overriding EC alias symbols with lazy archive symbols (PR #113283)

Jacek Caban via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 22 10:24:25 PDT 2024


================
@@ -455,11 +455,34 @@ void ObjFile::initializeSymbols() {
     COFFSymbolRef coffSym = check(coffObj->getSymbol(i));
     bool prevailingComdat;
     if (coffSym.isUndefined()) {
-      symbols[i] = createUndefined(coffSym);
+      symbols[i] = createUndefined(coffSym, false);
     } else if (coffSym.isWeakExternal()) {
-      symbols[i] = createUndefined(coffSym);
-      weakAliases.emplace_back(symbols[i],
-                               coffSym.getAux<coff_aux_weak_external>());
+      auto aux = coffSym.getAux<coff_aux_weak_external>();
+      bool overrideLazy = true;
+
+      // On ARM64EC, external functions don't emit undefined symbols. Instead,
----------------
cjacek wrote:

Sure, I will adjust the commit, but here is a long version. Let’s walk through an example:
```
$ cat test.c
extern void func(void);
void caller(void) { func(); }
```

On a typical target, `func` would be an undefined symbol:
```
$ clang -c test.c -target aarch64-windows
$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  }   
...
```

However, that’s not the case on ARM64EC:
```
$ clang -c test.c -target arm64ec-windows
$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: #func$exit_thunk
    Value: 0
    Section: .wowthk$aa (7)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  } 
...
  Symbol {
    Name: #func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #func$exit_thunk (23)
      Search: AntiDependency (0x4)
    }
  } 
...
  Symbol {
    Name: func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #func (41)
      Search: AntiDependency (0x4)
    }
  }
```

The reason for this is that the compiler doesn't know whether the callee will be an x86_64 function (in which case its definition provides the `func` symbol) or an ARM64EC function (in which case its definition provides the `#func` symbol). This approach works seamlessly when `func` is defined in another object file.
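To make the alias chain concrete, here is a toy model of how `func` -> `#func` -> `#func$exit_thunk` gets short-circuited once a real definition shows up. This is only a sketch: `Sym`, `SymTab`, `addEcCallSite`, `define` and `resolve` are hypothetical names for illustration and are not LLD's data structures or API.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy model, not LLD's symbol table.
struct Sym {
  bool defined = false;  // true once some object provides a real definition
  std::string fallback;  // alias target used while undefined; empty if none
};

using SymTab = std::map<std::string, Sym>;

// Records what an ARM64EC object contributes for a call to `name`:
// a defined exit thunk plus the two anti-dependency aliases from the dump.
void addEcCallSite(SymTab &tab, const std::string &name) {
  const std::string mangled = "#" + name;
  const std::string thunk = mangled + "$exit_thunk";
  tab[thunk].defined = true;
  if (!tab[mangled].defined)
    tab[mangled].fallback = thunk;   // #func -> #func$exit_thunk
  if (!tab[name].defined)
    tab[name].fallback = mangled;    // func -> #func
}

// A real definition replaces any alias of the same name.
void define(SymTab &tab, const std::string &name) {
  Sym &s = tab[name];
  s.defined = true;
  s.fallback.clear();
}

// Follows fallbacks until a definition (or a dead end) is reached.
// The hop limit guards against cycles in this toy table.
std::string resolve(const SymTab &tab, std::string name) {
  for (std::size_t hops = 0; hops <= tab.size(); ++hops) {
    auto it = tab.find(name);
    if (it == tab.end() || it->second.defined || it->second.fallback.empty())
      return name;
    name = it->second.fallback;
  }
  return name;
}
```

In this model, after `addEcCallSite(tab, "func")` followed by `define(tab, "#func")` (an ARM64EC callee), `resolve(tab, "func")` yields `#func`. If `define(tab, "func")` is used instead (an x86_64 callee), `func` resolves to itself while `#func` still falls back to the exit thunk.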

However, the usual rule is that weak externals (including anti-dependency symbols) take precedence over archive symbols. This would prevent both `func` and `#func` from being resolved to archive symbols. ARM64EC changes these rules to handle this scenario.

Additionally, function definitions also emit an anti-dependency symbol. Using `caller` from the same example:
```
$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: #caller
    Value: 0
    Section: .text (4)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  } 
...
  Symbol {
    Name: caller
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #caller (8)
      Search: AntiDependency (0x4)
    }
  } 
...
```

In addition to the defined `#caller` symbol, `caller` is defined as an alias (hybrid patchable functions would behave differently here). This alias is part of the implementation and should take precedence over archive symbols.
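The two cases can be summarized in a small decision function. This is a sketch under assumptions, not the code from the patch: `WeakAlias`, `Winner` and `resolveAgainstLazy` are hypothetical names, and the criterion used to tell a definition's alias from a reference (the target is defined in the same object and is named `#` plus the alias name) is my simplification for illustration.

```cpp
#include <string>

// Hypothetical model of one weak external symbol from an object file.
struct WeakAlias {
  std::string name;            // e.g. "caller" or "func"
  std::string target;          // the "Linked" symbol from the aux record
  bool antiDependency;         // Search: AntiDependency
  bool targetDefinedInObject;  // is the target defined in the same object?
};

enum class Winner { Alias, ArchiveSymbol };

// Decides which symbol wins when a lazy archive symbol with the same name
// as the alias is already known.
Winner resolveAgainstLazy(const WeakAlias &a, bool isArm64EC) {
  // Usual rule: weak externals take precedence over archive symbols.
  if (!isArm64EC || !a.antiDependency)
    return Winner::Alias;

  // Assumed criterion: an alias from `name` to a locally defined `#name`
  // belongs to a function definition and keeps its precedence.
  const bool aliasOfOwnImplementation =
      a.targetDefinedInObject && a.target == "#" + a.name;

  // Anything else stands in for an undefined reference, so the archive
  // symbol is allowed to win.
  return aliasOfOwnImplementation ? Winner::Alias : Winner::ArchiveSymbol;
}
```

With the symbols from `test.o`, `caller` -> `#caller` keeps precedence, while `func` -> `#func` and `#func` -> `#func$exit_thunk` both yield to an archive symbol when targeting ARM64EC.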

Weak aliases are not included in the archive index on any target. Therefore, if I were to create an archive containing the above `test.o`, the index would include the defined `#caller` symbol but not the `caller` alias (this part is important for the other PR).
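As a rough illustration of that filtering, here is a simplified sketch. `ObjSymbol`, `StorageClass` and `archiveIndexNames` are hypothetical names, and the filter only captures the point made above (defined external symbols are indexed, weak externals are not); a real archive writer applies further rules.

```cpp
#include <string>
#include <vector>

// Hypothetical, reduced view of a COFF symbol record.
enum class StorageClass { External, WeakExternal };

struct ObjSymbol {
  std::string name;
  bool hasSection;  // false for IMAGE_SYM_UNDEFINED
  StorageClass storageClass;
};

// Returns the names that this simplified model would put in the index.
std::vector<std::string>
archiveIndexNames(const std::vector<ObjSymbol> &symbols) {
  std::vector<std::string> names;
  for (const ObjSymbol &sym : symbols) {
    if (sym.storageClass == StorageClass::WeakExternal)
      continue;  // weak aliases are never indexed
    if (!sym.hasSection)
      continue;  // plain undefined references are not definitions
    names.push_back(sym.name);
  }
  return names;
}
```

Feeding it the `#caller` / `caller` pair from the dump above gives an index containing `#caller` only.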

https://github.com/llvm/llvm-project/pull/113283

