[all-commits] [llvm/llvm-project] a6b204: [lld][AArch64] Fix handling of SHT_REL relocation ...

Fri Jul 19 01:01:46 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a6b204b82745764e1460aa1dc26e69ff73195c60
      https://github.com/llvm/llvm-project/commit/a6b204b82745764e1460aa1dc26e69ff73195c60
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2024-07-19 (Fri, 19 Jul 2024)

  Changed paths:
    M lld/ELF/Arch/AArch64.cpp
    M lld/test/ELF/aarch64-reloc-implicit-addend.test

  Log Message:
  -----------
  [lld][AArch64] Fix handling of SHT_REL relocation addends. (#98291)

Normally, AArch64 ELF objects use the SHT_RELA type of relocation
section, with addends stored in each relocation. But some legacy AArch64
object producers still use SHT_REL in some situations, storing the
addend in the initial value of the data item or instruction immediate
field that the relocation will modify. LLD was mishandling relocations
of this type in multiple ways.

Firstly, many of the cases in the `getImplicitAddend` switch statement
were apparently based on a misunderstanding. The relocation types that
operate on instructions should be expecting to find an instruction of
the appropriate type, and should extract its immediate field. But many
of them were instead behaving as if they expected to find a raw 64-, 32-
or 16-bit value, and wanted to extract the right range of bits. For
example, the relocation for R_AARCH64_ADD_ABS_LO12_NC read a 16-bit word
and extracted its bottom 12 bits, presumably on the thinking that the
relocation writes the low 12 bits of the value it computes. But the
input addend for SHT_REL purposes occupies the immediate field of an
AArch64 ADD instruction, which meant it should have been reading a
32-bit AArch64 instruction encoding, and extracting bits 10-21 where the
immediate field lives. Worse, the R_AARCH64_MOVW_UABS_G2 relocation was
reading 64 bits from the input section, and since it's only relocating a
32-bit instruction, the second half of those bits would have been
completely unrelated!

Adding to that confusion, most of the values being read were first
sign-extended, and _then_ had a range of bits extracted, which doesn't
make much sense. They should have first extracted some bits from the
instruction encoding, and then sign-extended that 12-, 19-, or 21-bit
result (or whatever else) to a full 64-bit value.

Secondly, after the relocated value was computed, in most cases it was
being written into the target instruction field via a bitwise OR
operation. This meant that if the instruction field didn't initially
contain all zeroes, the wrong result would end up in it. That's not even
a 100% reliable strategy for SHT_RELA, which in some situations is used
for its repeatability (in the sense that applying the relocation twice
should cause the second answer to overwrite the first, so you can
relocate an image in advance to its most likely address, and then do it
again at load time if that turns out not to be available). But for
SHT_REL, when you expect nonzero immediate fields in normal use, it
couldn't possibly work. You could see the effect of this in the existing
test, which had a lot of FFFFFF in the expected output which there
wasn't any plausible justification for.

Finally, one relocation type was actually missing: there was no support
for R_AARCH64_ADR_PREL_LO21 at all.

So I've rewritten most of the cases in `getImplicitAddend`; replaced the
bitwise ORs with overwrites; and replaced the previous test with a much
more thorough one, obtained by writing an input assembly file with
explicitly specified relocations on instructions that also have
carefully selected immediate fields, and then doing some yaml2obj
seddery to turn the RELA relocation section into a REL one.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications