[lld] r315548 - Rewrite comment.
Rui Ueyama via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 11 19:09:11 PDT 2017
Author: ruiu
Date: Wed Oct 11 19:09:11 2017
New Revision: 315548
URL: http://llvm.org/viewvc/llvm-project?rev=315548&view=rev
Log:
Rewrite comment.
Modified:
lld/trunk/ELF/Arch/X86.cpp
Modified: lld/trunk/ELF/Arch/X86.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/X86.cpp?rev=315548&r1=315547&r2=315548&view=diff
==============================================================================
--- lld/trunk/ELF/Arch/X86.cpp (original)
+++ lld/trunk/ELF/Arch/X86.cpp Wed Oct 11 19:09:11 2017
@@ -87,17 +87,41 @@ RelExpr X86::getRelExpr(RelType Type, co
return R_GOT;
case R_386_GOT32:
case R_386_GOT32X:
- // These relocations can be calculated in two different ways.
- // Usual calculation is G + A - GOT what means an offset in GOT table
- // (R_GOT_FROM_END). When instruction pointed by relocation has no base
- // register, then relocations can be used when PIC code is disabled. In that
- // case calculation is G + A, it resolves to an address of entry in GOT
- // (R_GOT) and not an offset.
- //
- // To check that instruction has no base register we scan ModR/M byte.
- // See "Table 2-2. 32-Bit Addressing Forms with the ModR/M Byte"
- // (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/
- // 64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf)
+ // These relocations are arguably mis-designed because their calculations
+ // depend on the instructions they are applied to. This is bad because we
+ // usually don't care about whether the target section contains valid
+ // machine instructions or not. But this is part of the documented ABI, so
+ // we had to implement as the standard requires.
+ //
+ // x86 does not support PC-relative data access. Therefore, in order to
+ // access GOT contents, a GOT address needs to be known at link-time
+ // (which means non-PIC) or compilers have to emit code to get a GOT
+ // address at runtime (which means code is position-independent but
+ // compilers need to emit extra code for each GOT access.) This decision
+ // is made at compile-time. In the latter case, compilers emit code to
+ // load an GOT address to a register, which is usually %ebx.
+ //
+ // So, there are two ways to refer to symbol foo's GOT entry: foo at GOT or
+ // foo at GOT(%reg).
+ //
+ // foo at GOT is not usable in PIC. If we are creating a PIC output and if we
+ // find such relocation, we should report an error. foo at GOT is resolved to
+ // an *absolute* address of foo's GOT entry, because both GOT address and
+ // foo's offset are known. In other words, it's G + A.
+ //
+ // foo at GOT(%reg) needs to be resolved to a *relative* offset from a GOT to
+ // foo's GOT entry in the table, because GOT address is not known but foo's
+ // offset in the table is known. It's G + A - GOT.
+ //
+ // It's unfortunate that compilers emit the same relocation for these
+ // different use cases. In order to distinguish them, we have to read a
+ // machine instruction.
+ //
+ // The following code implements it. We assume that Loc[0] is the first
+ // byte of a displacement or an immediate field of a valid machine
+ // instruction. That means a ModRM byte is at Loc[-1]. By taking a look at
+ // the byte, we can determine whether the instruction is register-relative
+ // (i.e. it was generated for foo at GOT(%reg)) or absolute (i.e. foo at GOT).
if ((Loc[-1] & 0xc7) != 0x5)
return R_GOT_FROM_END;
if (Config->Pic)
More information about the llvm-commits
mailing list