[PATCH] D103001: [ELF][RISCV] Resolve branch relocations referencing undefined weak to current location if not using PLT

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun May 23 20:55:04 PDT 2021


MaskRay created this revision.
MaskRay added reviewers: bsdjhb, jrtc27, PkmX.
Herald added subscribers: vkmr, frasercrmck, evandro, luismarques, apazos, sameer.abuasal, pengfei, s.egerton, Jim, benna, psnobl, jocewei, the_o, brucehoult, MartinMosbeck, rogfer01, atanasyan, edward-jones, zzheng, shiva0217, kito-cheng, niosHD, sabuasal, simoncook, johnrusso, rbar, asb, kristof.beyls, arichardson, sdardis, emaste.
MaskRay requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

In a -no-pie link we optimize R_PLT_PC to R_PC. Currently we resolve a branch
relocation to the link-time zero address. However, this choice tends to cause
relocation overflows on RISC architectures, whose branch instructions have a
limited PC-relative range.

- aarch64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the next instruction
- mips: GNU ld: jump to the start of the text segment (?); ld.lld: branch to zero
- ppc32: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
- ppc64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
- riscv: GNU ld: branch to the absolute zero address (with instruction rewriting)
- i386/x86_64: GNU ld/ld.lld: branch to the link-time zero address

I think that resolving to the same location is a good choice. Executing such
an instruction is clearly undefined behavior. Resolving to the same location
can cause an infinite loop (making the user aware of the issue) while ensuring
that the relocation never overflows.
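
To make the overflow argument concrete, here is a rough standalone sketch (not
the lld code). The `Reloc` enum, `displacementBits`, `fitsSigned`,
`undefWeakBase` and `displacement` are hypothetical names made up for this
illustration, and the place 0x200000 is an arbitrary example address. The bit
widths are the usual B-type/J-type immediates; the 32-bit figure for the
auipc+jalr pair is an approximation. It assumes a two's complement target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for R_RISCV_BRANCH, R_RISCV_JAL, R_RISCV_CALL,
// R_RISCV_CALL_PLT and an absolute relocation. Not the real ELF values.
enum class Reloc { Branch, Jal, Call, CallPlt, Abs64 };

// Width in bits of the signed PC-relative displacement each form can encode.
constexpr int displacementBits(Reloc r) {
  switch (r) {
  case Reloc::Branch:
    return 13; // B-type immediate: about +-4 KiB
  case Reloc::Jal:
    return 21; // J-type immediate: about +-1 MiB
  case Reloc::Call:
  case Reloc::CallPlt:
    return 32; // auipc+jalr pair: about +-2 GiB (approximation)
  default:
    return 64; // not a branch; no range limit modelled here
  }
}

// True if v is representable as a signed integer of the given width.
constexpr bool fitsSigned(int64_t v, int bits) {
  if (bits >= 64)
    return true;
  const int64_t lim = int64_t{1} << (bits - 1);
  return v >= -lim && v < lim;
}

// Mirrors the idea of the patch: branch-like relocations against an
// undefined weak symbol use the place p as the base, everything else uses 0.
constexpr uint64_t undefWeakBase(Reloc r, uint64_t p) {
  switch (r) {
  case Reloc::Branch:
  case Reloc::Jal:
  case Reloc::Call:
  case Reloc::CallPlt:
    return p;
  default:
    return 0;
  }
}

// Displacement the instruction at place p would have to encode. With
// resolveToPlace == false this models the old "branch to zero" behavior.
constexpr int64_t displacement(Reloc r, uint64_t p, int64_t a,
                               bool resolveToPlace) {
  const uint64_t base = resolveToPlace ? undefWeakBase(r, p) : 0;
  const uint64_t dest = base + static_cast<uint64_t>(a);
  return static_cast<int64_t>(dest - p);
}

// Old behavior: a jal placed at 0x200000 cannot reach address 0.
static_assert(!fitsSigned(displacement(Reloc::Jal, 0x200000, 0, false),
                          displacementBits(Reloc::Jal)),
              "branch to zero overflows the J-type immediate");
// New behavior: the displacement is just the addend, so it always fits.
static_assert(fitsSigned(displacement(Reloc::Jal, 0x200000, 0, true),
                         displacementBits(Reloc::Jal)),
              "branch to the place never overflows");
```

With a zero addend the displacement is 0 wherever the section ends up, which
is why the test can match `j 0x[[#ADDR]]` against the instruction's own
address.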


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D103001

Files:
  lld/ELF/InputSection.cpp
  lld/test/ELF/riscv-undefined-weak.s


Index: lld/test/ELF/riscv-undefined-weak.s
===================================================================
--- lld/test/ELF/riscv-undefined-weak.s
+++ lld/test/ELF/riscv-undefined-weak.s
@@ -50,19 +50,26 @@
 # RELOC-NEXT: 0x20 R_RISCV_JAL target 0x0
 
 # PC-LABEL:    <branch>:
-# PC-NEXT:     auipc ra, 1048559
-# PC-NEXT:     jalr -368(ra)
-# PC-NEXT:     j 0x0
+# PC-NEXT:     auipc ra, 0
+# PC-NEXT:     jalr ra
+# PC-NEXT:     [[#%x,ADDR:]]:
+# PC-SAME:                    j 0x[[#ADDR]]
+# PC-NEXT:     [[#%x,ADDR:]]:
+# PC-SAME:                    beqz zero, 0x[[#ADDR]]
 
 ## If .dynsym exists, an undefined weak symbol is preemptible.
 ## We create a PLT entry and redirect the reference to it.
 # PLT-LABEL:   <branch>:
 # PLT-NEXT:    auipc ra, 0
 # PLT-NEXT:    jalr 56(ra)
-# PLT-NEXT:    j 0x0
+# PLT-NEXT:    [[#%x,ADDR:]]:
+# PLT-SAME:                   j 0x[[#ADDR]]
+# PLT-NEXT:    [[#%x,ADDR:]]:
+# PLT-SAME:                   beqz zero, 0x[[#ADDR]]
 branch:
   call target
   jal x0, target
+  beq x0, x0, target
 
 ## Absolute relocations are resolved to 0.
 # RELOC:      0x0 R_RISCV_64 target 0x3
Index: lld/ELF/InputSection.cpp
===================================================================
--- lld/ELF/InputSection.cpp
+++ lld/ELF/InputSection.cpp
@@ -569,6 +569,18 @@
   llvm_unreachable("AArch64 pc-relative relocation expected\n");
 }
 
+static uint64_t getRISCVUndefinedRelativeWeakVA(uint64_t type, uint64_t p) {
+  switch (type) {
+  case R_RISCV_BRANCH:
+  case R_RISCV_JAL:
+  case R_RISCV_CALL:
+  case R_RISCV_CALL_PLT:
+    return p;
+  default:
+    return 0;
+  }
+}
+
 // ARM SBREL relocations are of the form S + A - B where B is the static base
 // The ARM ABI defines base to be "addressing origin of the output segment
 // defining the symbol S". We defined the "addressing origin"/static base to be
@@ -765,14 +777,18 @@
       // Some PC relative ARM (Thumb) relocations align down the place.
       p = p & 0xfffffffc;
     if (sym.isUndefWeak()) {
-      // On ARM and AArch64 a branch to an undefined weak resolves to the
-      // next instruction, otherwise the place.
+      // On ARM and AArch64 a branch to an undefined weak resolves to the next
+      // instruction, otherwise the place. On RISCV, resolve an undefined weak
+      // to the same instruction to cause an infinite loop (making the user
+      // aware of the issue) while ensuring no overflow.
       if (config->emachine == EM_ARM)
         dest = getARMUndefinedRelativeWeakVA(type, a, p);
       else if (config->emachine == EM_AARCH64)
         dest = getAArch64UndefinedRelativeWeakVA(type, a, p);
       else if (config->emachine == EM_PPC)
         dest = p;
+      else if (config->emachine == EM_RISCV)
+        dest = getRISCVUndefinedRelativeWeakVA(type, p) + a;
       else
         dest = sym.getVA(a);
     } else {



