[lld] [lld][AArch64] Fix handling of SHT_REL relocation addends. (PR #98291)

Thu Jul 11 14:19:31 PDT 2024

================
@@ -239,30 +239,85 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_IRELATIVE:
   case R_AARCH64_TLS_TPREL64:
     return read64(buf);
+
+    // The following relocation types all point at instructions, and
+    // relocate an immediate field in the instruction.
+    //
+    // The general rule, from AAELF64 §5.7.2 "Addends and PC-bias",
+    // says: "If the relocation relocates an instruction the immediate
+    // field of the instruction is extracted, scaled as required by
+    // the instruction field encoding, and sign-extended to 64 bits".
+
+    // The R_AARCH64_MOVW family operates on wide MOV/MOVK/MOVZ
+    // instructions, which have a 16-bit immediate field with its low
+    // bit in bit 5 of the instruction encoding. When the immediate
+    // field is used as an implicit addend for REL-type relocations,
+    // it is treated as added to the low bits of the output value, not
+    // shifted depending on the relocation type.
+    //
+    // This allows REL relocations to express the requirement 'please
+    // add 12345 to this symbol value and give me the four 16-bit
+    // chunks of the result', by putting the same addend 12345 in all
+    // four instructions. Carries between the 16-bit chunks are
+    // handled correctly, because the whole 64-bit addition is done
+    // once per relocation.
   case R_AARCH64_MOVW_UABS_G0:
   case R_AARCH64_MOVW_UABS_G0_NC:
-    return getBits(SignExtend64<16>(read16(buf)), 0, 15);
+    return SignExtend64<16>(getBits(read32(buf), 5, 20));
   case R_AARCH64_MOVW_UABS_G1:
   case R_AARCH64_MOVW_UABS_G1_NC:
-    return getBits(SignExtend64<32>(read32(buf)), 16, 31);
+    return SignExtend64<16>(getBits(read32(buf), 5, 20));
   case R_AARCH64_MOVW_UABS_G2:
   case R_AARCH64_MOVW_UABS_G2_NC:
-    return getBits(read64(buf), 32, 47);
+    return SignExtend64<16>(getBits(read32(buf), 5, 20));
   case R_AARCH64_MOVW_UABS_G3:
-    return getBits(read64(buf), 48, 63);
+    return SignExtend64<16>(getBits(read32(buf), 5, 20));
----------------
smithp35 wrote:

The addend extraction code is identical for all of the MOV* relocations. I think it should be possible to do:
```
case R_AARCH64_MOVW_UABS_G0:
case R_AARCH64_MOVW_UABS_G0_NC:
case R_AARCH64_MOVW_UABS_G1:
case R_AARCH64_MOVW_UABS_G1_NC:
case R_AARCH64_MOVW_UABS_G2:
case R_AARCH64_MOVW_UABS_G2_NC:
case R_AARCH64_MOVW_UABS_G3:
  return SignExtend64<16>(getBits(read32(buf), 5, 20));
```

https://github.com/llvm/llvm-project/pull/98291