[llvm-bugs] [Bug 49672] New: lld generates empty .got on aarch64, satisfies _GLOBAL_OFFSET_TABLE_ ref with fixed-ABI slots

via llvm-bugs llvm-bugs at lists.llvm.org
Sun Mar 21 18:15:16 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=49672

            Bug ID: 49672
           Summary: lld generates empty .got on aarch64, satisfies
                    _GLOBAL_OFFSET_TABLE_ ref with fixed-ABI slots
           Product: lld
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: ELF
          Assignee: unassignedbugs at nondot.org
          Reporter: roland at hack.frob.com
                CC: haoweiwu1991 at gmail.com, llvm-bugs at lists.llvm.org,
                    maskray at google.com, phosek at chromium.org,
                    smithp352 at googlemail.com

The linker generates the `_GLOBAL_OFFSET_TABLE_` symbol to point at the
linker-generated `.got`/`.got.plt` sections.  The ABI expectation is that
`_GLOBAL_OFFSET_TABLE_[0]` holds the unrelocated link-time address of
`_DYNAMIC` (the `.dynamic` section).
For example, glibc's dynamic linker relies on this to compute its own load
bias using a standard technique.
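
As a rough illustration of that technique, the arithmetic can be sketched in C.
This is not glibc's code: `load_bias_from_got0` is a hypothetical helper, and it
takes plain integers so the calculation can be followed without a particular
linker.  In a real PIE the two arguments would be the PC-relative address of
`_DYNAMIC` and the word read from `_GLOBAL_OFFSET_TABLE_[0]`.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not taken from glibc or lld).
 *
 * runtime_dynamic: address of _DYNAMIC as seen by the running code
 *                  (e.g. obtained PC-relatively, so it includes the bias).
 * got0:            the word stored at _GLOBAL_OFFSET_TABLE_[0], which the
 *                  ABI expects to be the link-time address of _DYNAMIC.
 */
static uintptr_t load_bias_from_got0(uintptr_t runtime_dynamic, uintptr_t got0)
{
    /* runtime address = link-time address + load bias */
    return runtime_dynamic - got0;
}
```

If the linker leaves that slot out and the word read happens to be zero, the
function simply returns the runtime address of `_DYNAMIC`, which is not the
load bias.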

However, when lld emits an `ET_DYN` with no PLT entries and no other GOT
entries, it emits an empty `.got` section even if there is a reference to
`_GLOBAL_OFFSET_TABLE_`.  It resolves `_GLOBAL_OFFSET_TABLE_` to the address of
the empty .got section, so code reads a word from there expecting to get the
unrelocated `_DYNAMIC` address and instead gets whatever data word came after
the .got section in the link layout.

It so happens that glibc's ld.so always has some PLT entries, so this hasn't
bitten there.

I'm working around this using an input linker script:
```
SECTIONS {
  .got.bug.workaround : { QUAD(_DYNAMIC); }
} INSERT AFTER .got
```

BFD ld always emits a .got with the standard first entry.

Gold actually fails to link in this case because it doesn't define
`_GLOBAL_OFFSET_TABLE_` at all when it doesn't generate any PLT or GOT entries.
I think that's wrong since `_GLOBAL_OFFSET_TABLE_` references are a signal of
the ABI expectation of the linker-provided reserved slots.  But it's a more
principled stance than lld's behavior (even if it wasn't implemented on
purpose).  It at least makes clear that the reference won't work as intended by
giving a link-time failure, whereas lld just silently does something pretty
surprising for a `_GLOBAL_OFFSET_TABLE_` reference.

The simple reproducer is:
```
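.text
.globl _start
.hidden _DYNAMIC
.hidden _GLOBAL_OFFSET_TABLE_
_start:
adr x0, _GLOBAL_OFFSET_TABLE_   // x0 = runtime address of GOT[0]
adr x1, _DYNAMIC                // x1 = runtime address of _DYNAMIC
ldr x0, [x0]                    // x0 = GOT[0], expected link-time _DYNAMIC
sub x0, x1, x0                  // x0 = runtime - link-time = load bias
brk #1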
```

That can be assembled and linked with just `ld.lld -pie` or `ld.bfd -pie` for
aarch64 Linux or Fuchsia.
When executed, it should hit the breakpoint with x0 containing its runtime load
bias.
With lld, and without the input linker script above as a workaround, x0 holds
the wrong value: the word read from `_GLOBAL_OFFSET_TABLE_[0]` was not
supplied by the linker but is whatever data or padding happens to follow the
empty .got (likely zero).

I think it's sensible enough to "optimize out" the .got when it's wholly
unused, but a reference to `_GLOBAL_OFFSET_TABLE_` should be a signal that the
ABI-specified fixed fields are in fact used and must be emitted even if there
is no other GOT or PLT content.
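
To make the proposed rule concrete, here is a small C model of the decision.
It is only a sketch of the behavior suggested above, not lld's actual logic;
the function name and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the proposed rule (not lld source code).
 * The ABI-specified fixed GOT fields may be dropped only when nothing
 * uses the GOT at all, including a direct symbol reference.
 */
static bool must_emit_got_header(size_t got_entries, size_t plt_entries,
                                 bool got_symbol_referenced)
{
    return got_entries > 0 || plt_entries > 0 || got_symbol_referenced;
}
```

The aarch64 reproducer above corresponds to
`must_emit_got_header(0, 0, true)`: no GOT or PLT entries, but a reference to
`_GLOBAL_OFFSET_TABLE_`, so under this rule the fixed fields would still be
emitted.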

For comparison, consider the equivalent x86-64 case:
```
.text
.globl _start
.hidden _DYNAMIC
.hidden _GLOBAL_OFFSET_TABLE_
_start:
movq _GLOBAL_OFFSET_TABLE_(%rip),%rax
leaq _DYNAMIC(%rip),%rcx
sub %rcx,%rax
ud2
```

This is deceptively similar to the aarch64 case, but on x86 (and maybe on
others, though not on aarch64) the assembler treats the symbol name
`_GLOBAL_OFFSET_TABLE_` specially: it generates the special `R_X86_64_GOTPC32`
relocation instead of the normal `R_X86_64_PC32` it uses for any other
symbol (e.g. for `_DYNAMIC` above).  Since this reloc type specifically refers
to the GOT in the ABI, the linker takes the presence of any such reloc as a
signal that it must provide the ABI-required GOT slots.

However, the almost identical case:
```
.text
.globl _start
.hidden _DYNAMIC
.hidden GOT
_start:
movq GOT(%rip),%rax
leaq _DYNAMIC(%rip),%rcx
sub %rcx,%rax
ud2
```
comes out of the assembler differently because the symbol doesn't match the
magic name, so there's a plain PC32 reloc for `GOT` here.
Then combine this with an input linker script:
```
PROVIDE_HIDDEN(GOT = _GLOBAL_OFFSET_TABLE_);
```
and you'd think `ld.lld -pie got.o got.ld` would have semantics equivalent to
the aarch64 case, since the linker no longer sees the special reloc type to
match on.  However, lld still produces a nonempty .got.plt with the correct
`_DYNAMIC` address populated here.  I don't know why this case doesn't hit the
same logic as aarch64 and omit the contents.  (I think omitting them would be
wrong, but I'm confused about how the two cases differ in the lld
implementation today.)
