[llvm] 96d40df - MCExpr::evaluateAsRelocatableImpl : allow evaluation of non-VK_None MCSymbolRefExpr when MCAsmLayout is available

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 18 13:52:39 PST 2020


Author: Fangrui Song
Date: 2020-11-18T13:52:33-08:00
New Revision: 96d40df71ecee07c69aea512f6c04fc4fbe6acfb

URL: https://github.com/llvm/llvm-project/commit/96d40df71ecee07c69aea512f6c04fc4fbe6acfb
DIFF: https://github.com/llvm/llvm-project/commit/96d40df71ecee07c69aea512f6c04fc4fbe6acfb.diff

LOG: MCExpr::evaluateAsRelocatableImpl : allow evaluation of non-VK_None MCSymbolRefExpr when MCAsmLayout is available

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4acf8c78e659833be8be047ba2f8561386a11d4b
(1994) introduced this behavior:
if a fixup symbol is equated to an expression with an undefined symbol, convert
the fixup to be against the target symbol. glibc relies on this behavior to perform
assembly level indirection

```
asm("memcpy = __GI_memcpy"); // from sysdeps/generic/symbol-hacks.h

...
  // call memcpy at PLT
  // The relocation references __GI_memcpy in GNU as, but memcpy in MC (without the patch)
  memcpy (...);
```

(1) It complements `extern __typeof(memcpy) memcpy asm("__GI_memcpy");` The frontend asm label does not redirect synthesized memcpy in the middle-end. (See D88712 for details)
(2) `asm("memcpy = __GI_memcpy");` is in every translation unit, but the memcpy declaration may not be visible in the translation unit where memcpy is synthesized.

MC already redirects `memcpy = __GI_memcpy; call memcpy` but not `memcpy = __GI_memcpy; call memcpy at plt`.
This patch fixes the latter by allowing MCExpr::evaluateAsRelocatableImpl to
evaluate a non-VK_None MCSymbolRefExpr, which is only done after the layout is available.

GNU as allows `memcpy = __GI_memcpy+1; call memcpy at PLT` which seems nonsensical, so we don't allow it.

`MC/PowerPC/pr38945.s` `NUMBER = 0x6ffffff9; cmpwi 8,NUMBER at l` requires the
`symbol at l` form in AsmMatcher, so evaluation needs to be deferred. This is the
place whether future simplification may be possible.

Note, if we suppress the VM_None evaluation when MCAsmLayout is nullptr, we may
lose the `invalid reassignment of non-absolute variable` diagnostic
(`ARM/thumb_set-diagnostics.s` and `MC/AsmParser/variables-invalid.s`).
We know that this diagnostic is troublesome in some cases
(https://github.com/ClangBuiltLinux/linux/issues/1008), so we can consider
making simplification in the future.

Reviewed By: jyknight

Differential Revision: https://reviews.llvm.org/D88625

Added: 
    llvm/test/MC/ELF/relocation-alias.s

Modified: 
    llvm/lib/MC/MCExpr.cpp
    llvm/test/MC/ARM/ehabi-personality-abs.s
    llvm/test/MC/ELF/relocation.s

Removed: 
    


################################################################################
diff  --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index 5326da653999..6f1ac656cec0 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -802,13 +802,30 @@ bool MCExpr::evaluateAsRelocatableImpl(MCValue &Res, const MCAssembler *Asm,
   case SymbolRef: {
     const MCSymbolRefExpr *SRE = cast<MCSymbolRefExpr>(this);
     const MCSymbol &Sym = SRE->getSymbol();
+    const auto Kind = SRE->getKind();
 
     // Evaluate recursively if this is a variable.
-    if (Sym.isVariable() && SRE->getKind() == MCSymbolRefExpr::VK_None &&
+    if (Sym.isVariable() && (Kind == MCSymbolRefExpr::VK_None || Layout) &&
         canExpand(Sym, InSet)) {
       bool IsMachO = SRE->hasSubsectionsViaSymbols();
       if (Sym.getVariableValue()->evaluateAsRelocatableImpl(
               Res, Asm, Layout, Fixup, Addrs, InSet || IsMachO)) {
+        if (Kind != MCSymbolRefExpr::VK_None) {
+          if (Res.isAbsolute()) {
+            Res = MCValue::get(SRE, nullptr, 0);
+            return true;
+          }
+          // If the reference has a variant kind, we can only handle expressions
+          // which evaluate exactly to a single unadorned symbol. Attach the
+          // original VariantKind to SymA of the result.
+          if (Res.getRefKind() != MCSymbolRefExpr::VK_None || !Res.getSymA() ||
+              Res.getSymB() || Res.getConstant())
+            return false;
+          Res =
+              MCValue::get(MCSymbolRefExpr::create(&Res.getSymA()->getSymbol(),
+                                                   Kind, Asm->getContext()),
+                           Res.getSymB(), Res.getConstant(), Res.getRefKind());
+        }
         if (!IsMachO)
           return true;
 

diff  --git a/llvm/test/MC/ARM/ehabi-personality-abs.s b/llvm/test/MC/ARM/ehabi-personality-abs.s
index 051b837c698a..f1298f8f5eef 100644
--- a/llvm/test/MC/ARM/ehabi-personality-abs.s
+++ b/llvm/test/MC/ARM/ehabi-personality-abs.s
@@ -9,5 +9,6 @@ f:
 	bx lr
 	.fnend
 
+@@ Regression test: MC does not crash due to the absolute __aeabi_unwind_cpp_pr0.
+@@ GNU as and MC currently emit a R_ARM_NONE for this invalid usage.
 @ CHECK: R_ARM_NONE __aeabi_unwind_cpp_pr0
-

diff  --git a/llvm/test/MC/ELF/relocation-alias.s b/llvm/test/MC/ELF/relocation-alias.s
new file mode 100644
index 000000000000..d8392a9a4100
--- /dev/null
+++ b/llvm/test/MC/ELF/relocation-alias.s
@@ -0,0 +1,55 @@
+# RUN: llvm-mc -filetype=obj -triple x86_64 %s -o %t
+# RUN: llvm-objdump -dr %t | FileCheck %s
+# RUN: llvm-readelf -s %t | FileCheck %s --check-prefix=SYM
+
+# RUN: not llvm-mc -filetype=obj -triple x86_64 --defsym ERR=1 %s 2>&1 | FileCheck %s --check-prefix=ERR
+
+## If a fixup symbol is equated to an undefined symbol, convert the fixup
+## to be against the target symbol, even if there is a variant (@PLT).
+# CHECK:      callq {{.*}}
+# CHECK-NEXT:   R_X86_64_PLT32  __GI_memcpy-0x4
+# CHECK:      movabsq $0, %rax
+# CHECK-NEXT:   R_X86_64_64  __GI_memcpy+0x2
+memcpy = __GI_memcpy
+call memcpy at PLT
+movabsq $memcpy+2, %rax
+
+# CHECK:      movq (%rip), %rax
+# CHECK-NEXT:   R_X86_64_REX_GOTPCRELX  abs-0x4
+movq abs at GOTPCREL(%rip), %rax
+abs = 42
+
+# CHECK:      movabsq $0, %rbx
+# CHECK-NEXT:   R_X86_64_64  data_alias
+.globl data_alias
+.set data_alias, data
+movabsq $data_alias, %rbx
+
+## A local alias to a defined symbol still references a section symbol.
+# CHECK:      movabsq $0, %rbx
+# CHECK-NEXT:   R_X86_64_64  .data+0x1
+.set data_alias_l, data
+movabsq $data_alias_l, %rbx
+
+.data
+.byte 0
+.globl data
+data:
+
+.ifdef ERR
+.text
+## Note, GNU as emits a relocation for this erroneous fixup.
+# ERR: {{.*}}.s:[[#@LINE+2]]:1: error: expected relocatable expression
+memcpy_plus_1 = __GI_memcpy+1
+call memcpy_plus_1 at PLT
+.endif
+
+## Redirected symbols do not have a symbol table entry.
+# SYM:      NOTYPE  LOCAL  DEFAULT UND
+# SYM-NEXT: NOTYPE  LOCAL  DEFAULT ABS abs
+# SYM-NEXT: NOTYPE  LOCAL  DEFAULT   4 data_alias_l
+# SYM-NEXT: SECTION LOCAL  DEFAULT   4 .data
+# SYM-NEXT: NOTYPE  GLOBAL DEFAULT UND __GI_memcpy
+# SYM-NEXT: NOTYPE  GLOBAL DEFAULT   4 data
+# SYM-NEXT: NOTYPE  GLOBAL DEFAULT   4 data_alias
+# SYM-NOT:  {{.}}

diff  --git a/llvm/test/MC/ELF/relocation.s b/llvm/test/MC/ELF/relocation.s
index d55e8acffc1d..8802330c90b7 100644
--- a/llvm/test/MC/ELF/relocation.s
+++ b/llvm/test/MC/ELF/relocation.s
@@ -32,10 +32,7 @@ bar:
         movabsq	$baz at TPOFF, %rax
 	.word   foo-bar
 	.byte   foo-bar
-
-        # this should probably be an error...
-	zed = foo +2
-	call zed at PLT
+	call foo
 
         leaq    -1+foo(%rip), %r11
 
@@ -94,7 +91,7 @@ weak_sym:
 // CHECK-NEXT:       0x85 R_X86_64_TPOFF64 baz 0x0
 // CHECK-NEXT:       0x8D R_X86_64_PC16 foo 0x8D
 // CHECK-NEXT:       0x8F R_X86_64_PC8 foo 0x8F
-// CHECK-NEXT:       0x91 R_X86_64_PLT32 zed 0xFFFFFFFFFFFFFFFC
+// CHECK-NEXT:       0x91 R_X86_64_PLT32 foo 0xFFFFFFFFFFFFFFFC
 // CHECK-NEXT:       0x98 R_X86_64_PC32 foo 0xFFFFFFFFFFFFFFFB
 // CHECK-NEXT:       0x9F R_X86_64_GOTPC32 _GLOBAL_OFFSET_TABLE_ 0x3
 // CHECK-NEXT:       0xA6 R_X86_64_GOTPC32 _GLOBAL_OFFSET_TABLE_ 0xFFFFFFFFFFFFFFFC


        


More information about the llvm-commits mailing list