[lld] a6b204b - [lld][AArch64] Fix handling of SHT_REL relocation addends. (#98291)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 19 01:01:28 PDT 2024
Author: Simon Tatham
Date: 2024-07-19T09:01:25+01:00
New Revision: a6b204b82745764e1460aa1dc26e69ff73195c60
URL: https://github.com/llvm/llvm-project/commit/a6b204b82745764e1460aa1dc26e69ff73195c60
DIFF: https://github.com/llvm/llvm-project/commit/a6b204b82745764e1460aa1dc26e69ff73195c60.diff
LOG: [lld][AArch64] Fix handling of SHT_REL relocation addends. (#98291)
Normally, AArch64 ELF objects use the SHT_RELA type of relocation
section, with addends stored in each relocation. But some legacy AArch64
object producers still use SHT_REL in some situations, storing the
addend in the initial value of the data item or instruction immediate
field that the relocation will modify. LLD was mishandling relocations
of this type in multiple ways.
Firstly, many of the cases in the `getImplicitAddend` switch statement
were apparently based on a misunderstanding. The relocation types that
operate on instructions should be expecting to find an instruction of
the appropriate type, and should extract its immediate field. But many
of them were instead behaving as if they expected to find a raw 64-, 32-
or 16-bit value, and wanted to extract the right range of bits. For
example, the relocation for R_AARCH64_ADD_ABS_LO12_NC read a 16-bit word
and extracted its bottom 12 bits, presumably on the thinking that the
relocation writes the low 12 bits of the value it computes. But the
input addend for SHT_REL purposes occupies the immediate field of an
AArch64 ADD instruction, which meant it should have been reading a
32-bit AArch64 instruction encoding, and extracting bits 10-21 where the
immediate field lives. Worse, the R_AARCH64_MOVW_UABS_G2 relocation was
reading 64 bits from the input section, and since it's only relocating a
32-bit instruction, the second half of those bits would have been
completely unrelated!
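As a standalone illustration (not part of the patch, and using its own small helpers rather than lld's read32/getBits), the difference for R_AARCH64_ADD_ABS_LO12_NC looks roughly like this:

  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  static uint32_t read32le(const uint8_t *p) {
    uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v; // little-endian host assumed, which is fine for a sketch
  }

  static uint64_t bits(uint64_t v, unsigned lo, unsigned hi) {
    return (v >> lo) & ((uint64_t(1) << (hi - lo + 1)) - 1);
  }

  int main() {
    // add x0, x0, #0xabc -- encoding 0x912af000; imm12 sits in bits 10-21.
    const uint8_t insn[4] = {0x00, 0xf0, 0x2a, 0x91};

    // Old approach (roughly): treat the bytes as a small data word and take
    // its bottom 12 bits, as if the addend were stored in place.
    uint64_t wrong = bits(read32le(insn) & 0xffff, 0, 11);

    // Fixed approach: decode the 32-bit instruction and extract its
    // immediate field.
    uint64_t right = bits(read32le(insn), 10, 21);

    std::printf("wrong=%#llx right=%#llx\n",
                (unsigned long long)wrong, (unsigned long long)right);
    // Prints wrong=0 right=0xabc.
  }
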
Adding to that confusion, most of the values being read were first
sign-extended, and _then_ had a range of bits extracted, which doesn't
make much sense. They should have first extracted some bits from the
instruction encoding, and then sign-extended that 12-, 19-, or 21-bit
result (or whatever else) to a full 64-bit value.
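A similar standalone sketch of the ordering point, using a B.cond encoding with a negative offset (again with a local helper, not lld's SignExtend64):

  #include <cstdint>
  #include <cstdio>

  // Local helper for the sketch: sign-extend the low `width` bits of v.
  static int64_t signExtend(uint64_t v, unsigned width) {
    uint64_t m = uint64_t(1) << (width - 1);
    return int64_t((v ^ m) - m);
  }

  int main() {
    // b.eq .-4 -- encoding 0x54ffffe0; imm19 (bits 5-23) is -1 instruction.
    uint32_t insn = 0x54ffffe0;
    uint64_t imm19 = (insn >> 5) & 0x7ffff;

    // Extract, scale by 4, then sign-extend the resulting 21-bit byte offset.
    int64_t addend = signExtend(imm19 << 2, 21);
    std::printf("addend=%lld\n", (long long)addend); // prints addend=-4

    // Sign-extending the whole 32-bit word first (as the old code did) cannot
    // recover this: the sign bit of the offset is bit 23 of the word, not
    // bit 31.
  }
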
Secondly, after the relocated value was computed, in most cases it was
being written into the target instruction field via a bitwise OR
operation. This meant that if the instruction field didn't initially
contain all zeroes, the wrong result would end up in it. That's not even
a 100% reliable strategy for SHT_RELA, which in some situations is used
for its repeatability (in the sense that applying the relocation twice
should cause the second answer to overwrite the first, so you can
relocate an image in advance to its most likely address, and then do it
again at load time if that turns out not to be available). But for
SHT_REL, when you expect nonzero immediate fields in normal use, it
couldn't possibly work. You could see the effect of this in the existing
test, which had a lot of FFFFFF in the expected output for which there
was no plausible justification.
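A minimal sketch of the difference, mirroring the writeMaskedBits32le helper the patch introduces (the helpers here are local to the example, not lld's):

  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  static uint32_t load32(const uint8_t *p) {
    uint32_t v;
    std::memcpy(&v, p, 4);
    return v; // little-endian host assumed
  }
  static void store32(uint8_t *p, uint32_t v) { std::memcpy(p, &v, 4); }

  // Clear the field selected by `mask`, then install the new bits.
  static void writeMasked(uint8_t *p, uint32_t v, uint32_t mask) {
    store32(p, (load32(p) & ~mask) | (v & mask));
  }

  int main() {
    // add x0, x0, #0xabc -- the imm12 field still holds the SHT_REL addend.
    uint8_t insn[4] = {0x00, 0xf0, 0x2a, 0x91};
    uint32_t newImm = 0x123; // the value the relocation wants to write

    uint32_t ored = load32(insn) | (newImm << 10); // old behaviour: OR in
    writeMasked(insn, newImm << 10, 0xfffu << 10); // new behaviour: overwrite

    std::printf("or=%#x masked=%#x\n", ored, load32(insn));
    // Prints or=0x912efc00 masked=0x91048c00: only the masked write leaves
    // imm12 == 0x123; the OR merges the old 0xabc with the new 0x123.
  }
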
Finally, one relocation type was actually missing: there was no support
for R_AARCH64_ADR_PREL_LO21 at all.
So I've rewritten most of the cases in `getImplicitAddend`; replaced the
bitwise ORs with overwrites; and replaced the previous test with a much
more thorough one, obtained by writing an input assembly file with
explicitly specified relocations on instructions that also have
carefully selected immediate fields, and then doing some yaml2obj
seddery to turn the RELA relocation section into a REL one.
Added:
Modified:
lld/ELF/Arch/AArch64.cpp
lld/test/ELF/aarch64-reloc-implicit-addend.test
Removed:
################################################################################
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index cf5c2380690f1..0106349b2d277 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -239,30 +239,81 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
case R_AARCH64_IRELATIVE:
case R_AARCH64_TLS_TPREL64:
return read64(buf);
+
+ // The following relocation types all point at instructions, and
+ // relocate an immediate field in the instruction.
+ //
+ // The general rule, from AAELF64 §5.7.2 "Addends and PC-bias",
+ // says: "If the relocation relocates an instruction the immediate
+ // field of the instruction is extracted, scaled as required by
+ // the instruction field encoding, and sign-extended to 64 bits".
+
+ // The R_AARCH64_MOVW family operates on wide MOV/MOVK/MOVZ
+ // instructions, which have a 16-bit immediate field with its low
+ // bit in bit 5 of the instruction encoding. When the immediate
+ // field is used as an implicit addend for REL-type relocations,
+ // it is treated as added to the low bits of the output value, not
+ // shifted depending on the relocation type.
+ //
+ // This allows REL relocations to express the requirement 'please
+ // add 12345 to this symbol value and give me the four 16-bit
+ // chunks of the result', by putting the same addend 12345 in all
+ // four instructions. Carries between the 16-bit chunks are
+ // handled correctly, because the whole 64-bit addition is done
+ // once per relocation.
case R_AARCH64_MOVW_UABS_G0:
case R_AARCH64_MOVW_UABS_G0_NC:
- return getBits(SignExtend64<16>(read16(buf)), 0, 15);
case R_AARCH64_MOVW_UABS_G1:
case R_AARCH64_MOVW_UABS_G1_NC:
- return getBits(SignExtend64<32>(read32(buf)), 16, 31);
case R_AARCH64_MOVW_UABS_G2:
case R_AARCH64_MOVW_UABS_G2_NC:
- return getBits(read64(buf), 32, 47);
case R_AARCH64_MOVW_UABS_G3:
- return getBits(read64(buf), 48, 63);
+ return SignExtend64<16>(getBits(read32(buf), 5, 20));
+
+ // R_AARCH64_TSTBR14 points at a TBZ or TBNZ instruction, which
+ // has a 14-bit offset measured in instructions, i.e. shifted left
+ // by 2.
case R_AARCH64_TSTBR14:
- return getBits(SignExtend64<32>(read32(buf)), 2, 15);
+ return SignExtend64<16>(getBits(read32(buf), 5, 18) << 2);
+
+ // R_AARCH64_CONDBR19 operates on the ordinary B.cond instruction,
+ // which has a 19-bit offset measured in instructions.
+ //
+ // R_AARCH64_LD_PREL_LO19 operates on the LDR (literal)
+ // instruction, which also has a 19-bit offset, measured in 4-byte
+ // chunks. So the calculation is the same as for
+ // R_AARCH64_CONDBR19.
case R_AARCH64_CONDBR19:
case R_AARCH64_LD_PREL_LO19:
- return getBits(SignExtend64<32>(read32(buf)), 2, 20);
+ return SignExtend64<21>(getBits(read32(buf), 5, 23) << 2);
+
+ // R_AARCH64_ADD_ABS_LO12_NC operates on ADD (immediate). The
+ // immediate can optionally be shifted left by 12 bits, but this
+ // relocation is intended for the case where it is not.
case R_AARCH64_ADD_ABS_LO12_NC:
- return getBits(SignExtend64<16>(read16(buf)), 0, 11);
+ return SignExtend64<12>(getBits(read32(buf), 10, 21));
+
+ // R_AARCH64_ADR_PREL_LO21 operates on an ADR instruction, whose
+ // 21-bit immediate is split between two bits high up in the word
+ // (in fact the two _lowest_ order bits of the value) and 19 bits
+ // lower down.
+ //
+ // R_AARCH64_ADR_PREL_PG_HI21[_NC] operate on an ADRP instruction,
+ // which encodes the immediate in the same way, but will shift it
+ // left by 12 bits when the instruction executes. For the same
+ // reason as the MOVW family, we don't apply that left shift here.
+ case R_AARCH64_ADR_PREL_LO21:
case R_AARCH64_ADR_PREL_PG_HI21:
case R_AARCH64_ADR_PREL_PG_HI21_NC:
- return getBits(SignExtend64<32>(read32(buf)), 12, 32);
+ return SignExtend64<21>((getBits(read32(buf), 5, 23) << 2) |
+ getBits(read32(buf), 29, 30));
+
+ // R_AARCH64_{JUMP,CALL}26 operate on B and BL, which have a
+ // 26-bit offset measured in instructions.
case R_AARCH64_JUMP26:
case R_AARCH64_CALL26:
- return getBits(SignExtend64<32>(read32(buf)), 2, 27);
+ return SignExtend64<28>(getBits(read32(buf), 0, 25) << 2);
+
default:
internalLinkerError(getErrorLocation(buf),
"cannot read addend for relocation " + toString(type));
@@ -366,11 +417,13 @@ static void write32AArch64Addr(uint8_t *l, uint64_t imm) {
write32le(l, (read32le(l) & ~mask) | immLo | immHi);
}
-static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); }
+static void writeMaskedBits32le(uint8_t *p, int32_t v, uint32_t mask) {
+ write32le(p, (read32le(p) & ~mask) | v);
+}
// Update the immediate field in a AARCH64 ldr, str, and add instruction.
-static void or32AArch64Imm(uint8_t *l, uint64_t imm) {
- or32le(l, (imm & 0xFFF) << 10);
+static void write32Imm12(uint8_t *l, uint64_t imm) {
+ writeMaskedBits32le(l, (imm & 0xFFF) << 10, 0xFFF << 10);
}
// Update the immediate field in an AArch64 movk, movn or movz instruction
@@ -443,7 +496,7 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
write32(loc, val);
break;
case R_AARCH64_ADD_ABS_LO12_NC:
- or32AArch64Imm(loc, val);
+ write32Imm12(loc, val);
break;
case R_AARCH64_ADR_GOT_PAGE:
case R_AARCH64_ADR_PREL_PG_HI21:
@@ -470,28 +523,28 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
[[fallthrough]];
case R_AARCH64_CALL26:
checkInt(loc, val, 28, rel);
- or32le(loc, (val & 0x0FFFFFFC) >> 2);
+ writeMaskedBits32le(loc, (val & 0x0FFFFFFC) >> 2, 0x0FFFFFFC >> 2);
break;
case R_AARCH64_CONDBR19:
case R_AARCH64_LD_PREL_LO19:
case R_AARCH64_GOT_LD_PREL19:
checkAlignment(loc, val, 4, rel);
checkInt(loc, val, 21, rel);
- or32le(loc, (val & 0x1FFFFC) << 3);
+ writeMaskedBits32le(loc, (val & 0x1FFFFC) << 3, 0x1FFFFC << 3);
break;
case R_AARCH64_LDST8_ABS_LO12_NC:
case R_AARCH64_TLSLE_LDST8_TPREL_LO12_NC:
- or32AArch64Imm(loc, getBits(val, 0, 11));
+ write32Imm12(loc, getBits(val, 0, 11));
break;
case R_AARCH64_LDST16_ABS_LO12_NC:
case R_AARCH64_TLSLE_LDST16_TPREL_LO12_NC:
checkAlignment(loc, val, 2, rel);
- or32AArch64Imm(loc, getBits(val, 1, 11));
+ write32Imm12(loc, getBits(val, 1, 11));
break;
case R_AARCH64_LDST32_ABS_LO12_NC:
case R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC:
checkAlignment(loc, val, 4, rel);
- or32AArch64Imm(loc, getBits(val, 2, 11));
+ write32Imm12(loc, getBits(val, 2, 11));
break;
case R_AARCH64_LDST64_ABS_LO12_NC:
case R_AARCH64_LD64_GOT_LO12_NC:
@@ -499,37 +552,39 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
case R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC:
case R_AARCH64_TLSDESC_LD64_LO12:
checkAlignment(loc, val, 8, rel);
- or32AArch64Imm(loc, getBits(val, 3, 11));
+ write32Imm12(loc, getBits(val, 3, 11));
break;
case R_AARCH64_LDST128_ABS_LO12_NC:
case R_AARCH64_TLSLE_LDST128_TPREL_LO12_NC:
checkAlignment(loc, val, 16, rel);
- or32AArch64Imm(loc, getBits(val, 4, 11));
+ write32Imm12(loc, getBits(val, 4, 11));
break;
case R_AARCH64_LD64_GOTPAGE_LO15:
checkAlignment(loc, val, 8, rel);
- or32AArch64Imm(loc, getBits(val, 3, 14));
+ write32Imm12(loc, getBits(val, 3, 14));
break;
case R_AARCH64_MOVW_UABS_G0:
checkUInt(loc, val, 16, rel);
[[fallthrough]];
case R_AARCH64_MOVW_UABS_G0_NC:
- or32le(loc, (val & 0xFFFF) << 5);
+ writeMaskedBits32le(loc, (val & 0xFFFF) << 5, 0xFFFF << 5);
break;
case R_AARCH64_MOVW_UABS_G1:
checkUInt(loc, val, 32, rel);
[[fallthrough]];
case R_AARCH64_MOVW_UABS_G1_NC:
- or32le(loc, (val & 0xFFFF0000) >> 11);
+ writeMaskedBits32le(loc, (val & 0xFFFF0000) >> 11, 0xFFFF0000 >> 11);
break;
case R_AARCH64_MOVW_UABS_G2:
checkUInt(loc, val, 48, rel);
[[fallthrough]];
case R_AARCH64_MOVW_UABS_G2_NC:
- or32le(loc, (val & 0xFFFF00000000) >> 27);
+ writeMaskedBits32le(loc, (val & 0xFFFF00000000) >> 27,
+ 0xFFFF00000000 >> 27);
break;
case R_AARCH64_MOVW_UABS_G3:
- or32le(loc, (val & 0xFFFF000000000000) >> 43);
+ writeMaskedBits32le(loc, (val & 0xFFFF000000000000) >> 43,
+ 0xFFFF000000000000 >> 43);
break;
case R_AARCH64_MOVW_PREL_G0:
case R_AARCH64_MOVW_SABS_G0:
@@ -562,15 +617,15 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
break;
case R_AARCH64_TSTBR14:
checkInt(loc, val, 16, rel);
- or32le(loc, (val & 0xFFFC) << 3);
+ writeMaskedBits32le(loc, (val & 0xFFFC) << 3, 0xFFFC << 3);
break;
case R_AARCH64_TLSLE_ADD_TPREL_HI12:
checkUInt(loc, val, 24, rel);
- or32AArch64Imm(loc, val >> 12);
+ write32Imm12(loc, val >> 12);
break;
case R_AARCH64_TLSLE_ADD_TPREL_LO12_NC:
case R_AARCH64_TLSDESC_ADD_LO12:
- or32AArch64Imm(loc, val);
+ write32Imm12(loc, val);
break;
case R_AARCH64_TLSDESC:
// For R_AARCH64_TLSDESC the addend is stored in the second 64-bit word.
diff --git a/lld/test/ELF/aarch64-reloc-implicit-addend.test b/lld/test/ELF/aarch64-reloc-implicit-addend.test
index 804ed97a27371..23fc9209d2201 100644
--- a/lld/test/ELF/aarch64-reloc-implicit-addend.test
+++ b/lld/test/ELF/aarch64-reloc-implicit-addend.test
@@ -1,97 +1,316 @@
-## Test certain REL relocation types generated by legacy armasm.
-# RUN: yaml2obj %s -o %t.o
-# RUN: ld.lld %t.o -o %t
-# RUN: llvm-objdump -s %t | FileCheck %s
-
-# CHECK: Contents of section .abs:
-# CHECK-NEXT: [[#%x,]] 29002800 00002700 00000000 0000fcff ).(...'.........
-# CHECK-NEXT: [[#%x,]] ffffffff ffff ......
-# CHECK-NEXT: Contents of section .uabs:
-# CHECK-NEXT: [[#%x,]] 40ffffff 40ffffff 20ffffff 20ffffff @...@... ... ...
-# CHECK-NEXT: [[#%x,]] 00ffffff 00ffffff ........
-# CHECK-NEXT: Contents of section .prel:
-# CHECK-NEXT: [[#%x,]] 00ffffff fcfeffff f8feffff a0ffffff ................
-# CHECK-NEXT: [[#%x,]] 0010009f 0010009f ........
-# CHECK-NEXT: Contents of section .branch:
-# CHECK-NEXT: [[#%x,]] f0ffffff f0ffffff fdffffff fcffff14 ................
-
----
-!ELF
-FileHeader:
- Class: ELFCLASS64
- Data: ELFDATA2LSB
- Type: ET_REL
- Machine: EM_AARCH64
-Sections:
- - Name: .abs
- Type: SHT_PROGBITS
- Flags: [ SHF_ALLOC ]
- Content: fffffefffffffdfffffffffffffffcffffffffffffff
- - Name: .rel.abs
- Type: SHT_REL
- Link: .symtab
- Info: .abs
- Relocations:
- - {Offset: 0, Symbol: abs, Type: R_AARCH64_ABS16}
- - {Offset: 2, Symbol: abs, Type: R_AARCH64_ABS32}
- - {Offset: 6, Symbol: abs, Type: R_AARCH64_ABS64}
- - {Offset: 14, Symbol: abs, Type: R_AARCH64_ADD_ABS_LO12_NC}
-
- - Name: .uabs
- Type: SHT_PROGBITS
- Flags: [ SHF_ALLOC ]
- AddressAlign: 4
- Content: 00ffffff00ffffff00ffffff00ffffff00ffffff00ffffff
- - Name: .rel.uabs
- Type: SHT_REL
- Link: .symtab
- Info: .uabs
- Relocations:
- - {Offset: 0, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G0}
- - {Offset: 4, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G0_NC}
- - {Offset: 8, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G1}
- - {Offset: 12, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G1_NC}
- - {Offset: 16, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G2}
- - {Offset: 20, Symbol: abs, Type: R_AARCH64_MOVW_UABS_G2_NC}
-
- - Name: .prel
- Type: SHT_PROGBITS
- Flags: [ SHF_ALLOC ]
- AddressAlign: 4
- Content: 00ffffff00ffffff00ffffff00ffffff00ffffff00ffffff
- - Name: .rel.prel
- Type: SHT_REL
- Link: .symtab
- Info: .prel
- Relocations:
- - {Offset: 0, Symbol: .prel, Type: R_AARCH64_PREL64}
- - {Offset: 4, Symbol: .prel, Type: R_AARCH64_PREL32}
- - {Offset: 8, Symbol: .prel, Type: R_AARCH64_PREL16}
- - {Offset: 12, Symbol: .prel, Type: R_AARCH64_LD_PREL_LO19}
- - {Offset: 16, Symbol: .prel, Type: R_AARCH64_ADR_PREL_PG_HI21}
- - {Offset: 20, Symbol: .prel, Type: R_AARCH64_ADR_PREL_PG_HI21_NC}
-
- - Name: .branch
- Type: SHT_PROGBITS
- Flags: [ SHF_ALLOC ]
- AddressAlign: 4
- Content: f0fffffff0fffffff0fffffff0ffffff
- - Name: .rel.branch
- Type: SHT_REL
- Link: .symtab
- Info: .branch
- Relocations:
- - {Offset: 0, Symbol: .branch, Type: R_AARCH64_TSTBR14}
- - {Offset: 4, Symbol: .branch, Type: R_AARCH64_CONDBR19}
- - {Offset: 8, Symbol: .branch, Type: R_AARCH64_CALL26}
- - {Offset: 12, Symbol: .branch, Type: R_AARCH64_JUMP26}
-
-Symbols:
- - Name: .branch
- Section: .branch
- - Name: .prel
- Section: .prel
- - Name: abs
- Index: SHN_ABS
- Value: 42
- Binding: STB_GLOBAL
+REQUIRES: aarch64
+
+## Test handling of addends taken from the relocated word or instruction
+## in AArch64 relocation sections of type SHT_REL. These can be generated
+## by assemblers other than LLVM, in particular the legacy 'armasm'.
+##
+## llvm-mc will only generate SHT_RELA when targeting AArch64. So to make
+## an input file with SHT_REL, we assemble our test source file, then
+## round-trip via YAML and do some seddery to change the type of the
+## relocation section. Since all the relocations were made manually with
+## .reloc directives containing no addend, this succeeds.
+
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=aarch64 relocs.s -o rela.o
+# RUN: obj2yaml rela.o -o rela.yaml
+# RUN: sed "s/\.rela/\.rel/;s/SHT_RELA/SHT_REL/" rela.yaml > rel.yaml
+# RUN: yaml2obj rel.yaml -o rel.o
+# RUN: llvm-mc -filetype=obj -triple=aarch64 symbols.s -o symbols.o
+# RUN: ld.lld rel.o symbols.o -o a.out --section-start=.data=0x100000 --section-start=.text=0x200000
+# RUN: llvm-objdump -s a.out | FileCheck %s --check-prefix=DATA
+# RUN: llvm-objdump -d a.out | FileCheck %s --check-prefix=CODE
+
+#--- symbols.s
+
+// Source file containing the values of target symbols for the relocations. If
+// we don't keep these in their own file, then llvm-mc is clever enough to
+// resolve some of the relocations during assembly, even though they're written
+// as explicit .reloc directives. But we want the relocations to be present in
+// the object file, so that yaml2obj can change their type and we can test
+// lld's handling of the result. So we ensure that llvm-mc can't see both the
+// .reloc and the target symbol value at the same time.
+
+.globl abs16
+.globl abs32
+.globl abs64
+.globl big64
+.globl pcrel
+.globl data
+.globl branchtarget
+.globl calltarget
+
+.equ abs16, 0x9999
+.equ data, 0x100000
+.equ branchtarget, 0x200100
+.equ calltarget, 0x02000100
+.equ pcrel, 0x245678
+.equ abs32, 0x88888888
+.equ abs64, 0x7777777777777777
+.equ big64, 0x77ffffffffffff77
+
+#--- relocs.s
+
+// Source file containing the test instructions and their relocations, with the
+// FileCheck comments interleaved.
+
+// DATA: Contents of section .data:
+.data
+
+// First test absolute data relocations. For each one I show the expected
+// value in a comment, and then expect a line in llvm-objdump -s containing
+// all the values together.
+
+ // 0x7777777777777777 + 0x1234567887654321 = 0x89abcdeffedcba98
+ .reloc ., R_AARCH64_ABS64, abs64
+ .xword 0x1234567887654321
+
+ // 0x88888888 + 0x12344321 = 0x9abccba9
+ .reloc ., R_AARCH64_ABS32, abs32
+ .word 0x12344321
+
+ // 0x9999 + 0x1234 = 0xabcd
+ .reloc ., R_AARCH64_ABS16, abs16
+ .hword 0x1234
+
+ // DATA-NEXT: 100000 98badcfe efcdab89 a9cbbc9a cdab
+
+ .balign 16
+
+// Test relative data relocs, each subtracting the address of the relocated
+// word.
+
+ // 0x100000 + 0x1234567887654321 - 0x100010 = 0x1234567887654311
+ .reloc ., R_AARCH64_PREL64, data
+ .xword 0x1234567887654321
+
+ // 0x100000 + 0x12344321 - 0x100018 = 0x12344309
+ .reloc ., R_AARCH64_PREL32, data
+ .word 0x12344321
+
+ // 0x100000 + 0x1234 - 0x10001c = 0x1218
+ .reloc ., R_AARCH64_PREL16, data
+ .hword 0x1234
+
+ // DATA-NEXT: 100010 11436587 78563412 09433412 1812
+
+// CODE: 0000000000200000 <_start>:
+.text
+.globl _start
+_start:
+
+// Full set of 4 instructions loading the constant 'abs64' and adding 0x1234 to
+// it.
+
+// Expected constant is 0x7777777777777777 + 0x1234 = 0x77777777777789ab
+
+ .reloc ., R_AARCH64_MOVW_UABS_G0_NC, abs64
+ movz x0, #0x1234
+ // CODE-NEXT: 200000: d2913560 mov x0, #0x89ab
+ .reloc ., R_AARCH64_MOVW_UABS_G1_NC, abs64
+ movk x0, #0x1234, lsl #16
+ // CODE-NEXT: 200004: f2aeeee0 movk x0, #0x7777, lsl #16
+ .reloc ., R_AARCH64_MOVW_UABS_G2_NC, abs64
+ movk x0, #0x1234, lsl #32
+ // CODE-NEXT: 200008: f2ceeee0 movk x0, #0x7777, lsl #32
+ .reloc ., R_AARCH64_MOVW_UABS_G3, abs64
+ movk x0, #0x1234, lsl #48
+ // CODE-NEXT: 20000c: f2eeeee0 movk x0, #0x7777, lsl #48
+
+// The same, but this constant has ffff in the middle 32 bits, forcing carries
+// to be propagated.
+
+// Expected constant: 0x77ffffffffffff77 + 0x1234 = 0x78000000000011ab
+
+ .reloc ., R_AARCH64_MOVW_UABS_G0_NC, big64
+ movz x0, #0x1234
+ // CODE-NEXT: 200010: d2823560 mov x0, #0x11ab
+ .reloc ., R_AARCH64_MOVW_UABS_G1_NC, big64
+ movk x0, #0x1234, lsl #16
+ // CODE-NEXT: 200014: f2a00000 movk x0, #0x0, lsl #16
+ .reloc ., R_AARCH64_MOVW_UABS_G2_NC, big64
+ movk x0, #0x1234, lsl #32
+ // CODE-NEXT: 200018: f2c00000 movk x0, #0x0, lsl #32
+ .reloc ., R_AARCH64_MOVW_UABS_G3, big64
+ movk x0, #0x1234, lsl #48
+ // CODE-NEXT: 20001c: f2ef0000 movk x0, #0x7800, lsl #48
+
+// Demonstrate that offsets are treated as signed: this one is taken to be
+// -0x1234. (If it were +0xedcc then you'd be able to tell the difference by
+// the carry into the second halfword.)
+
+// Expected value: 0x7777777777777777 - 0x1234 = 0x7777777777776543
+
+ .reloc ., R_AARCH64_MOVW_UABS_G0_NC, abs64
+ movz x0, #0xedcc
+ // CODE-NEXT: 200020: d28ca860 mov x0, #0x6543
+ .reloc ., R_AARCH64_MOVW_UABS_G1_NC, abs64
+ movk x0, #0xedcc, lsl #16
+ // CODE-NEXT: 200024: f2aeeee0 movk x0, #0x7777, lsl #16
+ .reloc ., R_AARCH64_MOVW_UABS_G2_NC, abs64
+ movk x0, #0xedcc, lsl #32
+ // CODE-NEXT: 200028: f2ceeee0 movk x0, #0x7777, lsl #32
+ .reloc ., R_AARCH64_MOVW_UABS_G3, abs64
+ movk x0, #0xedcc, lsl #48
+ // CODE-NEXT: 20002c: f2eeeee0 movk x0, #0x7777, lsl #48
+
+// Check various bits of the ADR immediate, including in particular the low 2
+// bits, which are not contiguous with the rest in the encoding.
+//
+// These values are all 0x245678 + 2^n, except the last one, where the set bit
+// of the addend is the top bit, counting as negative, i.e. we expect the value
+// 0x254678 - 0x100000 = 0x145678.
+
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .+1
+ // CODE-NEXT: 200030: 3022b240 adr x0, 0x245679
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .+2
+ // CODE-NEXT: 200034: 5022b220 adr x0, 0x24567a
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .+4
+ // CODE-NEXT: 200038: 1022b220 adr x0, 0x24567c
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .+8
+ // CODE-NEXT: 20003c: 1022b220 adr x0, 0x245680
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .+1<<19
+ // CODE-NEXT: 200040: 1062b1c0 adr x0, 0x2c5678
+ .reloc ., R_AARCH64_ADR_PREL_LO21, pcrel
+ adr x0, .-1<<20
+ // CODE-NEXT: 200044: 10a2b1a0 adr x0, 0x145678
+
+// Now load the same set of values with ADRP+ADD. But because the real ADRP
+// instruction shifts its immediate, we must account for that.
+
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, 1<<12
+ // CODE-NEXT: 200048: b0000220 adrp x0, 0x245000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #1
+ // CODE-NEXT: 20004c: 9119e400 add x0, x0, #0x679
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, 2<<12
+ // CODE-NEXT: 200050: b0000220 adrp x0, 0x245000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #2
+ // CODE-NEXT: 200054: 9119e800 add x0, x0, #0x67a
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, 4<<12
+ // CODE-NEXT: 200058: b0000220 adrp x0, 0x245000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #4
+ // CODE-NEXT: 20005c: 9119f000 add x0, x0, #0x67c
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, 8<<12
+ // CODE-NEXT: 200060: b0000220 adrp x0, 0x245000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #8
+ // CODE-NEXT: 200064: 911a0000 add x0, x0, #0x680
+
+ // Starting here, the high bits won't fit in the ADD immediate, so that
+ // becomes 0, and only the ADRP immediate shows evidence of the addend.
+
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, 1<<(19+12)
+ // CODE-NEXT: 200068: b0000620 adrp x0, 0x2c5000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #0
+ // CODE-NEXT: 20006c: 9119e000 add x0, x0, #0x678
+
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, -1<<(20+12)
+ // CODE-NEXT: 200070: b0fffa20 adrp x0, 0x145000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #0
+ // CODE-NEXT: 200074: 9119e000 add x0, x0, #0x678
+
+ // Finally, an example with a full 21-bit addend.
+ // Expected value = 0x245678 + 0xfedcb - 0x100000 = 0x244443
+ .reloc ., R_AARCH64_ADR_PREL_PG_HI21, pcrel
+ adrp x0, (0xfedcb-0x100000)<<12
+ // CODE-NEXT: 200078: 90000220 adrp x0, 0x244000
+ .reloc ., R_AARCH64_ADD_ABS_LO12_NC, pcrel
+ add x0, x0, #0xdcb
+ // CODE-NEXT: 20007c: 91110c00 add x0, x0, #0x443
+
+// PC-relative loads, in which the 19-bit offset is shifted. The offsets are
+// the same as the ADRs above, except for the first two, which can't be
+// expressed by pc-relative LDR with an offset shifted left 2.
+//
+// (The input syntax is confusing here. I'd normally expect to write this as
+// `ldr x0, [pc, #offset]`, but LLVM writes just `#offset`.)
+
+ .reloc ., R_AARCH64_LD_PREL_LO19, pcrel
+ ldr w0, #4
+ // CODE-NEXT: 200080: 1822afe0 ldr w0, 0x24567c
+ .reloc ., R_AARCH64_LD_PREL_LO19, pcrel
+ ldr w0, #8
+ // CODE-NEXT: 200084: 1822afe0 ldr w0, 0x245680
+ .reloc ., R_AARCH64_LD_PREL_LO19, pcrel
+ ldr w0, #1<<19
+ // CODE-NEXT: 200088: 1862af80 ldr w0, 0x2c5678
+ .reloc ., R_AARCH64_LD_PREL_LO19, pcrel
+ ldr w0, #-1<<20
+ // CODE-NEXT: 20008c: 18a2af60 ldr w0, 0x145678
+
+
+// For these, the branch target is 0x200100 plus powers of 2, except the offset
+// 2^15, which is negative, because the addend is treated as signed.
+
+ .reloc ., R_AARCH64_TSTBR14, branchtarget
+ tbnz x1, #63, #4
+ // CODE-NEXT: 200090: b7f803a1 tbnz x1, #0x3f, 0x200104
+ .reloc ., R_AARCH64_TSTBR14, branchtarget
+ tbnz x1, #62, #8
+ // CODE-NEXT: 200094: b7f003a1 tbnz x1, #0x3e, 0x200108
+ .reloc ., R_AARCH64_TSTBR14, branchtarget
+ tbnz x1, #61, #1<<14
+ // CODE-NEXT: 200098: b7ea0341 tbnz x1, #0x3d, 0x204100
+ .reloc ., R_AARCH64_TSTBR14, branchtarget
+ tbnz x1, #60, #-1<<15
+ // CODE-NEXT: 20009c: b7e40321 tbnz x1, #0x3c, 0x1f8100
+
+// CONDBR19 is used for both cbz/cbnz and B.cond, so test both at once. Base
+// offset is the same again (from 0x200100), but this time, offsets can go up
+// to 2^20.
+
+ .reloc ., R_AARCH64_CONDBR19, branchtarget
+ cbnz x2, #4
+ // CODE-NEXT: 2000a0: b5000322 cbnz x2, 0x200104
+ .reloc ., R_AARCH64_CONDBR19, branchtarget
+ b.eq #8
+ // CODE-NEXT: 2000a4: 54000320 b.eq 0x200108
+ .reloc ., R_AARCH64_CONDBR19, branchtarget
+ cbz x2, #1<<19
+ // CODE-NEXT: 2000a8: b44002c2 cbz x2, 0x280100
+ .reloc ., R_AARCH64_CONDBR19, branchtarget
+ b.vs #-1<<20
+ // CODE-NEXT: 2000ac: 548002a6 b.vs 0x100100
+
+// And for BL and B, the offsets go up to 2^25.
+
+ .reloc ., R_AARCH64_CALL26, calltarget
+ bl #4
+ // CODE-NEXT: 2000b0: 94780015 bl 0x2000104
+ .reloc ., R_AARCH64_CALL26, calltarget
+ bl #8
+ // CODE-NEXT: 2000b4: 94780015 bl 0x2000108
+ .reloc ., R_AARCH64_CALL26, calltarget
+ bl #1<<24
+ // CODE-NEXT: 2000b8: 94b80012 bl 0x3000100
+ .reloc ., R_AARCH64_CALL26, calltarget
+ bl #-1<<25
+ // CODE-NEXT: 2000bc: 97f80011 bl 0x100
+
+ .reloc ., R_AARCH64_JUMP26, calltarget
+ b #4
+ // CODE-NEXT: 2000c0: 14780011 b 0x2000104
+ .reloc ., R_AARCH64_JUMP26, calltarget
+ b #8
+ // CODE-NEXT: 2000c4: 14780011 b 0x2000108
+ .reloc ., R_AARCH64_JUMP26, calltarget
+ b #1<<24
+ // CODE-NEXT: 2000c8: 14b8000e b 0x3000100
+ .reloc ., R_AARCH64_JUMP26, calltarget
+ b #-1<<25
+ // CODE-NEXT: 2000cc: 17f8000d b 0x100