[lld] [llvm] RFC: [LLD] [COFF] Fix linking MSVC generated implib header objects (PR #122811)

Martin Storsjö via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 16 04:52:19 PST 2025


mstorsjo wrote:

A couple more observations about the MSVC behaviour here. I used inputs like this:
```yaml
#--- main.yaml
--- !COFF
header:
  Machine:         IMAGE_FILE_MACHINE_AMD64
  Characteristics: [  ]
sections:            
  - Name:            '.itest$2'
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]     
    Alignment:       4
    SectionData:     '00000000000000000000'
    SizeOfRawData:   10
    Relocations:
      - VirtualAddress:  0
        SymbolName:      '.itest$6'
        Type:            IMAGE_REL_AMD64_ADDR32NB
  - Name:            '.itest$6'
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]     
    Alignment:       2
    SectionData:     01000000
    SizeOfRawData:   4
symbols:
  - Name:            '.itest$2'
    Value:           0
    SectionNumber:   1
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_SECTION
  - Name:            '.itest$6'
    Value:           3221225536
    SectionNumber:   2
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_SECTION
  - Name:            '.itest$4'
    Value:           3221225536
    SectionNumber:   0
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_SECTION
...
#--- sect-def.yaml
--- !COFF
header:
  Machine:         IMAGE_FILE_MACHINE_AMD64
  Characteristics: [  ]
sections:            
  - Name:            '.itest$4'
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
    Alignment:       2
    SectionData:     'ffff'
    SizeOfRawData:   2
symbols:
  - Name:            '.itest$4'
    Value:           0
    SectionNumber:   1
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_SECTION
...
```
Compared with earlier tests, I'm adding a second yaml object for the actual `.itest$4` section. I've changed the `.itest$6` symbol to `IMAGE_SYM_CLASS_SECTION`, whereas it usually seems to be `IMAGE_SYM_CLASS_STATIC` for `.idata$6` in the case of actual import libraries. I'm converting each yaml document to an object file and then passing both .obj files to MS link.exe.

- MS link.exe behaves weirdly with relocations against the section symbols here. Both `IMAGE_REL_AMD64_ADDR32NB` and `IMAGE_REL_AMD64_ADDR64` seem to behave in the same way. With this example as is, I'm getting the address of the start of the `.itest` section, not the start of the `.itest$6` chunk. I get the same result if I point the relocation at the `.itest$4` symbol. If I rename the `.itest` section to `.idata`, I get the correct address for the `.itest$6` chunk, but I get a plain zero (for ADDR32NB) and the image base (for ADDR64) if I point the relocation at `.itest$4`. It feels like something is really odd here; I would expect that there are relocations against section symbols in many everyday object files (plus in the import library headers), so these really do need to work correctly in general. I have no idea what is odd about this test case that makes link.exe misbehave...

- The `Value` of symbols with storage class `IMAGE_SYM_CLASS_SECTION` is not an offset; it is the `Characteristics` of the section. The magic value we're seeing, `3221225536` aka `0xC0000040`, is not STATUS_SECTION_TOO_BIG; it is `IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE`. If the field is zero, it is ignored, but if it is nonzero and doesn't match a section header (either a real section header in the same object file, or one in another object file), MS link.exe prints a warning like this: `warning LNK4078: multiple '.idata' sections found with different attributes (40000040)`
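As a quick sanity check of the arithmetic in the second observation, here is a minimal standalone sketch (not LLD code) that decodes such a `Value` into section characteristic flags. The bit values are the ones from the PE/COFF specification; the helper `describeCharacteristics` is a hypothetical name used only for illustration, and it only knows the three flags mentioned in this thread.

```cpp
#include <cstdint>
#include <string>

// Section characteristic bits, as defined by the PE/COFF specification.
constexpr uint32_t IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040;
constexpr uint32_t IMAGE_SCN_MEM_READ = 0x40000000;
constexpr uint32_t IMAGE_SCN_MEM_WRITE = 0x80000000;

// The decimal Value from the yaml above is exactly the OR of the three flags.
static_assert(3221225536u == 0xC0000040u, "decimal and hex forms agree");
static_assert((IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ |
               IMAGE_SCN_MEM_WRITE) == 0xC0000040u,
              "Value equals the section Characteristics");

// Hypothetical helper: render the flags set in `value` as text.
// Only the three flags discussed here are recognised.
std::string describeCharacteristics(uint32_t value) {
  std::string out;
  auto add = [&](uint32_t bit, const char *name) {
    if (value & bit) {
      if (!out.empty())
        out += " | ";
      out += name;
    }
  };
  add(IMAGE_SCN_CNT_INITIALIZED_DATA, "IMAGE_SCN_CNT_INITIALIZED_DATA");
  add(IMAGE_SCN_MEM_READ, "IMAGE_SCN_MEM_READ");
  add(IMAGE_SCN_MEM_WRITE, "IMAGE_SCN_MEM_WRITE");
  return out;
}
```

Under the same decoding, the `40000040` printed in the LNK4078 warning is `IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ`, i.e. the same flags without the write bit.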

For the former issue, I'm not sure if there's much of an action we need to take; it is very distracting when trying to figure out how things really work, but I'm not sure if resolving this is relevant for this case.
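To make the relocation results from the first observation easier to reason about, here is a small model of what the two relocation types store. This is only a sketch under assumptions: the function names are hypothetical, and the image base `0x140000000` and the RVAs are illustrative values, not numbers taken from the actual link.exe output.

```cpp
#include <cstdint>

// IMAGE_REL_AMD64_ADDR32NB stores the 32-bit RVA of the target
// (address relative to the image base).
constexpr uint32_t applyAddr32NB(uint32_t targetRVA) { return targetRVA; }

// IMAGE_REL_AMD64_ADDR64 stores the full 64-bit virtual address,
// i.e. image base plus the RVA of the target.
constexpr uint64_t applyAddr64(uint64_t imageBase, uint32_t targetRVA) {
  return imageBase + targetRVA;
}

// Illustrative values only (assumptions, not measured output).
constexpr uint64_t kImageBase = 0x140000000ULL; // assumed image base
constexpr uint32_t kSectionRVA = 0x3000;        // assumed start of the section
constexpr uint32_t kChunkRVA = 0x3010;          // assumed start of the $6 chunk

// Expected: the relocation resolves to the chunk itself.
static_assert(applyAddr32NB(kChunkRVA) == 0x3010, "expected chunk RVA");
// Observed with `.itest`: it resolves to the start of the section instead.
static_assert(applyAddr32NB(kSectionRVA) == 0x3000, "section start RVA");
// A target RVA of 0 gives a plain zero for ADDR32NB and the bare image
// base for ADDR64.
static_assert(applyAddr32NB(0) == 0, "plain zero");
static_assert(applyAddr64(kImageBase, 0) == kImageBase, "bare image base");
```

Seen this way, the zero / image base pair in the `.idata` case is what both relocation types would produce if the `.itest$4` symbol resolved to RVA 0, which would explain why the two types appear to misbehave in the same way.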

For the latter issue, we should at least ignore the `Value` field for symbols with storage class `IMAGE_SYM_CLASS_SECTION`, instead of treating it as an offset. (We might not need to actually try to apply it as section characteristics.) I'm not sure how to handle this though... When we create a `DefinedRegular` for these symbols, we don't really distinguish between storage classes such as `IMAGE_SYM_CLASS_SECTION` and `IMAGE_SYM_CLASS_STATIC`. And within `DefinedRegular`, we access them via `coff_symbol_generic`, which doesn't see the `StorageClass` field.

Should we add a `DefinedRegular` subclass which ignores the offset? That's maybe the most efficient way around it, without needing to litter `DefinedRegular` with a condition in e.g. the `getRVA()` method. (Alternatively, `DefinedRegular` would need a flag about whether to ignore the `Value` offset or not.)
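To illustrate the two options, here is a toy model, deliberately not using the real LLD classes. Everything prefixed `Toy` and the function `rvaWithFlag` are hypothetical stand-ins; the virtual method is there purely for brevity and is not meant to mirror how LLD actually dispatches on symbol kinds. The storage class numbers are the ones from the PE/COFF specification.

```cpp
#include <cstdint>

// Storage class values from the PE/COFF specification.
constexpr uint8_t IMAGE_SYM_CLASS_STATIC = 3;
constexpr uint8_t IMAGE_SYM_CLASS_SECTION = 104;

// Hypothetical stand-in for a section chunk placed at some RVA.
struct ToyChunk {
  uint32_t rva;
};

// Hypothetical stand-in for DefinedRegular: Value is treated as an
// offset into the chunk, regardless of the storage class.
class ToyDefinedRegular {
public:
  ToyDefinedRegular(const ToyChunk *c, uint32_t v) : chunk(c), value(v) {}
  virtual ~ToyDefinedRegular() = default;
  virtual uint32_t getRVA() const { return chunk->rva + value; }

protected:
  const ToyChunk *chunk;
  uint32_t value;
};

// Option A: a subclass for section symbols that ignores Value entirely.
class ToyDefinedSection : public ToyDefinedRegular {
public:
  using ToyDefinedRegular::ToyDefinedRegular;
  uint32_t getRVA() const override { return chunk->rva; }
};

// Option B: one code path with a condition derived from the storage class.
uint32_t rvaWithFlag(const ToyChunk &c, uint32_t value, uint8_t storageClass) {
  bool ignoreValue = storageClass == IMAGE_SYM_CLASS_SECTION;
  return c.rva + (ignoreValue ? 0 : value);
}
```

With a chunk at an assumed RVA of `0x3010` and a `Value` of `0xC0000040`, the plain `ToyDefinedRegular` yields `0xC0003050`, while both the subclass and the flag variant yield `0x3010`. The subclass keeps the extra condition out of the common path, at the cost of one more symbol kind to handle.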

https://github.com/llvm/llvm-project/pull/122811

