[lld] c03b630 - [ELF][RISCV] Resolve branch relocations referencing undefined weak to current location if not using PLT

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 10 13:25:21 PDT 2021


Author: Fangrui Song
Date: 2021-06-10T13:25:16-07:00
New Revision: c03b6305d8419fda84a67f4fe357b69a86e4b54f

URL: https://github.com/llvm/llvm-project/commit/c03b6305d8419fda84a67f4fe357b69a86e4b54f
DIFF: https://github.com/llvm/llvm-project/commit/c03b6305d8419fda84a67f4fe357b69a86e4b54f.diff

LOG: [ELF][RISCV] Resolve branch relocations referencing undefined weak to current location if not using PLT

In a -no-pie link we optimize R_PLT_PC to R_PC. Currently we resolve a branch
relocation referencing an undefined weak symbol to the link-time zero address.
However, this choice tends to cause relocation overflows on RISC architectures.

* aarch64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the next instruction
* mips: GNU ld: branch to the start of the text segment (?); ld.lld: branch to zero
* ppc32: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* ppc64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* riscv: GNU ld: branch to the absolute zero address (with instruction rewriting)
* i386/x86_64: GNU ld/ld.lld: branch to the link-time zero address

I think that resolving to the same location is a good choice. Executing the
instruction is clearly undefined behavior. Resolving to the same location
causes an infinite loop (making the user aware of the issue) while ensuring
that the relocation cannot overflow.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D103001

Added: 
    

Modified: 
    lld/ELF/InputSection.cpp
    lld/test/ELF/riscv-undefined-weak.s

Removed: 
    


################################################################################
diff  --git a/lld/ELF/InputSection.cpp b/lld/ELF/InputSection.cpp
index 143a183e35cd3..67e0e4126ece6 100644
--- a/lld/ELF/InputSection.cpp
+++ b/lld/ELF/InputSection.cpp
@@ -569,6 +569,20 @@ static uint64_t getAArch64UndefinedRelativeWeakVA(uint64_t type, uint64_t a,
   llvm_unreachable("AArch64 pc-relative relocation expected\n");
 }
 
+static uint64_t getRISCVUndefinedRelativeWeakVA(uint64_t type, uint64_t p) {
+  switch (type) {
+  case R_RISCV_BRANCH:
+  case R_RISCV_JAL:
+  case R_RISCV_CALL:
+  case R_RISCV_CALL_PLT:
+  case R_RISCV_RVC_BRANCH:
+  case R_RISCV_RVC_JUMP:
+    return p;
+  default:
+    return 0;
+  }
+}
+
 // ARM SBREL relocations are of the form S + A - B where B is the static base
 // The ARM ABI defines base to be "addressing origin of the output segment
 // defining the symbol S". We defined the "addressing origin"/static base to be
@@ -765,14 +779,18 @@ uint64_t InputSectionBase::getRelocTargetVA(const InputFile *file, RelType type,
       // Some PC relative ARM (Thumb) relocations align down the place.
       p = p & 0xfffffffc;
     if (sym.isUndefWeak()) {
-      // On ARM and AArch64 a branch to an undefined weak resolves to the
-      // next instruction, otherwise the place.
+      // On ARM and AArch64 a branch to an undefined weak resolves to the next
+      // instruction, otherwise the place. On RISCV, resolve an undefined weak
+      // to the same instruction to cause an infinite loop (making the user
+      // aware of the issue) while ensuring no overflow.
       if (config->emachine == EM_ARM)
         dest = getARMUndefinedRelativeWeakVA(type, a, p);
       else if (config->emachine == EM_AARCH64)
         dest = getAArch64UndefinedRelativeWeakVA(type, a, p);
       else if (config->emachine == EM_PPC)
         dest = p;
+      else if (config->emachine == EM_RISCV)
+        dest = getRISCVUndefinedRelativeWeakVA(type, p) + a;
       else
         dest = sym.getVA(a);
     } else {

diff  --git a/lld/test/ELF/riscv-undefined-weak.s b/lld/test/ELF/riscv-undefined-weak.s
index caa5637d18f2c..d0996ab36f289 100644
--- a/lld/test/ELF/riscv-undefined-weak.s
+++ b/lld/test/ELF/riscv-undefined-weak.s
@@ -48,21 +48,29 @@ relative:
 ## Treat them as PC relative relocations.
 # RELOC:      0x18 R_RISCV_CALL target 0x0
 # RELOC-NEXT: 0x20 R_RISCV_JAL target 0x0
+# RELOC-NEXT: 0x24 R_RISCV_BRANCH target 0x0
 
 # PC-LABEL:    <branch>:
-# PC-NEXT:     auipc ra, 1048559
-# PC-NEXT:     jalr -368(ra)
-# PC-NEXT:     j 0x0
+# PC-NEXT:     auipc ra, 0
+# PC-NEXT:     jalr ra
+# PC-NEXT:     [[#%x,ADDR:]]:
+# PC-SAME:                    j 0x[[#ADDR]]
+# PC-NEXT:     [[#%x,ADDR:]]:
+# PC-SAME:                    beqz zero, 0x[[#ADDR]]
 
 ## If .dynsym exists, an undefined weak symbol is preemptible.
 ## We create a PLT entry and redirect the reference to it.
 # PLT-LABEL:   <branch>:
 # PLT-NEXT:    auipc ra, 0
 # PLT-NEXT:    jalr 56(ra)
-# PLT-NEXT:    j 0x0
+# PLT-NEXT:    [[#%x,ADDR:]]:
+# PLT-SAME:                   j 0x[[#ADDR]]
+# PLT-NEXT:    [[#%x,ADDR:]]:
+# PLT-SAME:                   beqz zero, 0x[[#ADDR]]
 branch:
   call target
   jal x0, target
+  beq x0, x0, target
 
 ## Absolute relocations are resolved to 0.
 # RELOC:      0x0 R_RISCV_64 target 0x3


        

