[llvm-dev] Do I need to modify the AddrLoc of LLD for ARC target?

Leslie Zhai via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 18 20:28:57 PDT 2017


Hi Peter,

Thanks for your kind response!


On 2017-09-18 20:44, Peter Smith wrote:
> Hello Leslie,
>
> I'm not sure what to say, as I don't know precisely what your
> question is. If my answer is not precise enough, could you please add
> some explicit questions? From what I can see in the output, here are
> some comments.
>
> From your ARC map files, it looks like both linkers have given the
> .text output section the correct base address for its alignment
> restrictions: the alignment requirement of .text from
> lib_a-memset-bs.o is 4, therefore the alignment requirement of the
> OutputSection .text should be 4:
> LLD:
> Address  Size     Align Out     In      Symbol
> 00000000 00000080     4 .text
> 00000000 00000004     1         basic-arc.o:(.text)
> 00000000 00000000     0                 main
> 00000004 0000007c     4         ... (lib_a-memset-bs.o):(.text)
>
> LD
> .text           0x0000000000000000       0x80
>   *(.text .stub .text.* .gnu.linkonce.t.*)
>   .text          0x0000000000000000        0x4 basic-arc.o
>   .text          0x0000000000000004       0x7c ... libc.a(lib_a-memset-bs.o)
>                  0x0000000000000004                memset
>                  0x0000000000000060                __strncpy_bzero

Reloc type=R_ARC_S25W_PCREL, should_relocate = true
   offset = 0x0, addend = 0x0
  Symbol:
   value = 0x00000000
  Symbol Section:
   section name = .text, output_offset 0x00000006, output_section->vma = 0x00000006
   file: lib_a-memset-bs.o
  Input_section:
   section name = .text, output_offset 0x00000000, output_section->vma = 0x00000006
   changed_address = 0x00000006
   file: basic-arc.o
RELOC_TYPE = ARC_S25W_PCREL
FORMULA = ( ME ( ( ( ( S + A ) - P ) >> 2 ) ) )
S = 0xc
A = 0
L = c
symbol_section->vma = 0xc
symbol_section->vma = 0x6
PCL = 0x4
P = 0x4
G = 0
SDA_OFFSET = 0x2188
SDA_SET = 1
GOT_OFFSET = 0
relocation = 0x000002
before = 0x000802
data   = 00000002 (2) (2)
after  = 0x0000080a

Next I need to investigate how LD calculates
reloc_data.input_section->output_section->vma; it might differ from
LLD's value even when the alignment is the same:
https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L485
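
As a sanity check on the dump above, here is a minimal C++ sketch that
reproduces P and the relocation value from the arc-ld expression quoted
later in this thread. The helper names arcPlace and arcS25WPcrel are made
up for illustration, the bit size 25 is only an assumed stand-in for
"less than 32", and the middle-endian (ME) step is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the arc-ld expression quoted in this thread:
//   P = (vma + output_offset + (reloc_offset - (bitsize >= 32 ? 4 : 0))) & ~0x3
constexpr uint64_t arcPlace(uint64_t Vma, uint64_t OutputOffset,
                            uint64_t RelocOffset, unsigned BitSize) {
  return (Vma + OutputOffset + (RelocOffset - (BitSize >= 32 ? 4 : 0))) &
         ~uint64_t(3);
}

// Hypothetical helper for FORMULA = ME((S + A - P) >> 2), without the ME step.
constexpr int64_t arcS25WPcrel(uint64_t S, int64_t A, uint64_t P) {
  return static_cast<int64_t>(S + A - P) >> 2;
}

// Values taken from the dump: vma = 0x6, both offsets 0, S = 0xc, A = 0.
static_assert(arcPlace(0x6, 0, 0, 25) == 0x4, "P is rounded down to 0x4");
static_assert(arcS25WPcrel(0xc, 0, 0x4) == 0x2, "relocation = 0x2");
```

With vma = 0x6 the rounding in arcPlace is what turns 6 into P = 0x4, so
the output section address feeds directly into the relocation value.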


>
> I'm not entirely sure where the Arm example has come from, but it does
> show an interesting difference. It looks like the linkers are
> handling the -Ttext=<address> option slightly differently when the
> <address> of the OutputSection is not 0 modulo the OutputSection
> alignment.
>
> From the map file we can see that LLD is aligning the OutputSection up
> to the next 4-byte boundary, whereas GNU ld is placing the OutputSection
> at the requested address but adding padding before the .text section
> to make sure that the InputSection is aligned in the final executable.
>
> LLD
> Address  Size     Align Out     In      Symbol
> 00011008 00000018     4 .text
> 00011008 00000018     4         arm-thumb-undefined-weak.o:(.text)
> 00011008 00000000     0                 $t.0
> 00011008 00000000     0                 _start
>
> LD
> .text           0x0000000000011006       0x1a
> ...
>   *fill*         0x0000000000011006        0x2
>   .text          0x0000000000011008       0x18 arm-thumb-undefined-weak.o
>
> The *fill* is visible as a nop in the disassembly for the LD produced image.
>
> Strictly speaking, I think LD is producing a file that doesn't
> conform to ELF here, as the sh_addr of the .text OutputSection is not 0
> modulo sh_addralign (4). In practice it probably wouldn't make much
> difference. My preference is for LLD's behaviour here.
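
The two placements in the map files can be modelled with a small C++
sketch. It assumes nothing about either linker's real code: Placement,
placeLikeLld and placeLikeLd are hypothetical names, and only the
addresses (-Ttext=11006, alignment 4) come from the map files above.

```cpp
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be a power of two).
constexpr uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Hypothetical summary of where an output section and its first input
// section end up.
struct Placement {
  uint64_t SectionAddr; // address of the OutputSection
  uint64_t Fill;        // padding bytes before the first InputSection
  uint64_t InputAddr;   // address of the first InputSection
};

// LLD-like behaviour seen in the map file: move the OutputSection itself up.
constexpr Placement placeLikeLld(uint64_t Requested, uint64_t Align) {
  return {alignUp(Requested, Align), 0, alignUp(Requested, Align)};
}

// GNU ld-like behaviour seen in the map file: keep the requested address
// and pad in front of the InputSection instead.
constexpr Placement placeLikeLd(uint64_t Requested, uint64_t Align) {
  return {Requested, alignUp(Requested, Align) - Requested,
          alignUp(Requested, Align)};
}

static_assert(placeLikeLld(0x11006, 4).SectionAddr == 0x11008, "LLD map");
static_assert(placeLikeLd(0x11006, 4).SectionAddr == 0x11006, "LD map");
static_assert(placeLikeLd(0x11006, 4).Fill == 0x2, "LD *fill*");
```

In both models the InputSection lands at 0x11008; only the OutputSection
address and the 2-byte fill differ.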

It might be an issue with the arm-linux-gnu toolchain:

$ arm-linux-gnu-gcc -o arm-thumb-undefined-weak-ld.o -c arm-thumb-undefined-weak.s

arm-thumb-undefined-weak.s: Assembler messages:
arm-thumb-undefined-weak.s:18: Error: width suffixes are invalid in ARM mode -- `beq.w target'
arm-thumb-undefined-weak.s:20: Error: width suffixes are invalid in ARM mode -- `b.w target'

So arm-linux-gnu-ld might have relocated R_ARM_THM_CALL incorrectly for
arm-thumb-undefined-weak-lld.o generated by llvm-mc.

>
> Peter
>
>
>
> On 18 September 2017 at 03:28, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:
>> Hi Peter,
>>
>> Map file about LD for ARC target
>> https://drive.google.com/open?id=0ByE8c-y74l_uRWpQdUh2c0VXZ1k
>>
>> LLD for ARC https://drive.google.com/open?id=0ByE8c-y74l_ueGVuYkR0a3RSWjQ
>>
>>
>> arm-thumb-undefined-weak.s
>> https://github.com/llvm-mirror/lld/blob/master/test/ELF/arm-thumb-undefined-weak.s
>>
>> $ llvm/build/bin/llvm-mc -filetype=obj -triple=thumbv7a-none-linux-gnueabi
>> arm-thumb-undefined-weak.s -o arm-thumb-undefined-weak.o
>> $ llvm/build/bin/ld.lld -o arm-thumb-undefined-weak-lld
>> arm-thumb-undefined-weak.o -Ttext=11006
>> $ arm-linux-gnu-ld -o arm-thumb-undefined-weak-ld arm-thumb-undefined-weak.o
>> -Ttext=11006
>>
>> $ arm-linux-gnu-readelf -r arm-thumb-undefined-weak.o
>>
>> Relocation section '.rel.text' at offset 0x8c contains 6 entries:
>>   Offset     Info    Type            Sym.Value  Sym. Name
>> 00000000  00000333 R_ARM_THM_JUMP19  00000000   target
>> 00000004  0000031e R_ARM_THM_JUMP24  00000000   target
>> 00000008  0000030a R_ARM_THM_CALL    00000000   target
>> 0000000c  0000030a R_ARM_THM_CALL    00000000   target
>> 00000010  00000332 R_ARM_THM_MOVT_PR 00000000   target
>> 00000014  00000331 R_ARM_THM_MOVW_PR 00000000   target
>>
>>
>> DEBUG: lld: R_ARM_THM_JUMP19 TargetVA: 0 A: -4 P: 69640 Align: 4 VMA: 69640
>> Output Offset: 0 Reloc Offset: 0
>> DEBUG: lld: R_ARM_THM_JUMP24 TargetVA: 0 A: -4 P: 69644 Align: 4 VMA: 69640
>> Output Offset: 0 Reloc Offset: 4
>> DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69648 Align: 4 VMA: 69640
>> Output Offset: 0 Reloc Offset: 8
>> DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69652 Align: 4 VMA: 69640
>> Output Offset: 0 Reloc Offset: 12
>> DEBUG: lld: R_ARM_THM_MOVT_PREL TargetVA: 0 A: 0 P: 69656 Align: 4 VMA:
>> 69640 Output Offset: 0 Reloc Offset: 16
>> DEBUG: lld: R_ARM_THM_MOVW_PREL_NC TargetVA: 0 A: 0 P: 69660 Align: 4 VMA:
>> 69640 Output Offset: 0 Reloc Offset: 20
>>
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP19: VMA: 69638 Output Offset: 2 Reloc
>> Offset: 0
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP24: VMA: 69638 Output Offset: 2 Reloc
>> Offset: 4
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc
>> Offset: 8
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc
>> Offset: 12
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVT_PREL: VMA: 69638 Output Offset: 2
>> Reloc Offset: 16
>> DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVW_PREL_NC: VMA: 69638 Output Offset: 2
>> Reloc Offset: 20
>>
>>
>> $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d
>> arm-thumb-undefined-weak-lld
>>
>> arm-thumb-undefined-weak-lld:   file format ELF32-arm-little
>>
>> Disassembly of section .text:
>> _start:
>>     11008:       00 f0 00 80     beq.w   #0 <_start+0x4>
>>     1100c:       00 f0 00 b8     b.w     #0 <_start+0x8>
>>     11010:       00 f0 00 f8     bl      #0
>>     11014:       00 f0 00 f8     bl      #0
>>     11018:       c0 f2 00 00     movt    r0, #0
>>     1101c:       40 f2 00 00     movw    r0, #0

My question: why is LD's relocation result different from LLD's? Thanks
for your explanation :)

>>
>>
>> $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d
>> arm-thumb-undefined-weak-ld
>>
>> arm-thumb-undefined-weak-ld:    file format ELF32-arm-little
>>
>> Disassembly of section .text:
>> .text:
>>     11006:       00 00   movs    r0, r0
>>
>> _start:
>>     11008:       2e f4 fa af     beq.w   #-69644
>>     1100c:       00 e0   b       #0 <_start+0x8>
>>     1100e:       00 bf   nop
>>     11010:       00 e0   b       #0 <_start+0xC>
>>     11012:       00 bf   nop
>>     11014:       00 e0   b       #0 <_start+0x10>
>>     11016:       00 bf   nop
>>     11018:       cf f6 fe 70     movt    r0, #65534
>>     1101c:       4e f6 e4 70     movw    r0, #61412
>>
>>
>>
>>
>> On 2017-09-15 20:49, Peter Smith wrote:
>>> Just a thought I had about the calculation of P. I think that
>>> following the ld approach too closely may be a mistake.
>>>
>>> I'm speculating that the reason for this change in the value of P is
>>> similar to the situation in Arm for a Thumb BLX immediate instruction
>>> (Branch Link and Exchange with the immediate an offset from the PC).
>>> When calculating the target address the immediate is added to
>>> Align(PC, 4) where Align rounds down to nearest 4-byte boundary. The
>>> linker needs to account for this when resolving the relocation
>>> R_ARM_THM_CALL.
>>>
>>> To handle the alignment difference for this one special case in lld I
>>> accounted for the alignment difference in relocateOne. You may be able
>>> to use a similar method for Arc rather than writing modifyARCAddrLoc.
>>> Again, I know nothing about Arc, so you'll need to look at the
>>> Architecture reference manual to understand how the instruction that
>>> the relocation applies to works.
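
The Thumb BLX case described above can be sketched in C++ as follows.
This is an illustration only, not LLD's relocateOne: the function names
and the addresses are made up, and it assumes the usual Thumb convention
that the PC reads as the instruction address plus 4.

```cpp
#include <cassert>
#include <cstdint>

// Align(X, 4) as described above: round down to a 4-byte boundary.
constexpr uint32_t alignDown4(uint32_t X) { return X & ~uint32_t(3); }

// Hypothetical: immediate a Thumb BLX at InsnAddr needs to reach Target.
// Assumes PC reads as InsnAddr + 4 in Thumb state.
constexpr int32_t blxImmediate(uint32_t Target, uint32_t InsnAddr) {
  return static_cast<int32_t>(Target - alignDown4(InsnAddr + 4));
}

// Hypothetical: address the processor would branch to for that immediate.
constexpr uint32_t blxTarget(uint32_t InsnAddr, int32_t Imm) {
  return alignDown4(InsnAddr + 4) + static_cast<uint32_t>(Imm);
}

// Two BLX instructions in the same 4-byte word need the same immediate.
static_assert(blxImmediate(0x12000, 0x11010) == 0xFEC, "aligned site");
static_assert(blxImmediate(0x12000, 0x11012) == 0xFEC, "site at 2 mod 4");
static_assert(blxTarget(0x11012, 0xFEC) == 0x12000, "round trip");
```

Handling the rounding inside the per-relocation calculation like this
keeps the generic value of P untouched for every other relocation type.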
>>>
>>> Peter
>>>
>>>
>>> On 15 September 2017 at 04:19, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for your kind response!
>>>>
>>>>
>>>> On 2017-09-14 17:36, Peter Smith wrote:
>>>>> Hello Leslie,
>>>>>
>>>>> I think we are going to need to know a bit more about the ELF ABI for
>>>>> what looks like the ArcCompact before we can help you.
>>>> https://github.com/foss-for-synopsys-dwc-arc-processors/arc-ABI-manual
>>>>
>>>> But I prefer to read the BFD linker's source code for ARC instead:
>>>> 1. ARC-specific e_flags
>>>>
>>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc.h
>>>> 2. Relocation definitions
>>>>
>>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc-reloc.def
>>>> 3. Relocation replacement functions
>>>>
>>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/opcode/arc-func.h
>>>> 4. Calculation of S, A, P, PDATA, GOT, etc.
>>>>
>>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/bfd/elf32-arc.c#L1156
>>>>
>>>>
>>>>> LLD's calculation of P (the place to be relocated) is as it is in the
>>>>> generic ELF specification. The Rel.Offset corresponds to the ELF
>>>>> r_offset field. This is covered by: "For a relocatable file, the value
>>>>> is the byte offset from the beginning of the section to the storage
>>>>> unit affected by the relocation."
>>>>>
>>>>> For LLD we are calculating the virtual address (VA) of P, as I
>>>>> understand it this is equivalent to the vma used in BFD. Assuming that
>>>>> the relocation is relocating a regular InputSection from the
>>>>> basic-arc.o object then the LLD calculation of P =
>>>>> getOutputSection()->Addr + getOffset(Rel.Offset); translates to: (VA
>>>>> of OutputSection) + (Offset of InputSection within OutputSection) +
>>>>> (Offset within InputSection given by r_offset)
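
A minimal C++ sketch of that three-term sum, checked against the
R_ARM_THM_CALL debug lines elsewhere in this thread (Reloc Offset 8).
placeVA is a hypothetical name, not an LLD function, and the P value for
GNU ld is computed here rather than printed in its debug output.

```cpp
#include <cassert>
#include <cstdint>

// P = (VA of OutputSection) + (offset of InputSection within OutputSection)
//     + (offset within InputSection given by r_offset)
constexpr uint64_t placeVA(uint64_t OutSecAddr, uint64_t InSecOff,
                           uint64_t RelOffset) {
  return OutSecAddr + InSecOff + RelOffset;
}

// LLD debug line: VMA 69640, Output Offset 0, Reloc Offset 8, P 69648.
static_assert(placeVA(69640, 0, 8) == 69648, "matches LLD's printed P");

// GNU ld debug line: VMA 69638, Output Offset 2, Reloc Offset 8.
static_assert(placeVA(69638, 2, 8) == 69648, "same place as LLD");
```

For the Arm example the two linkers therefore agree on P even though the
OutputSection addresses differ by 2, because the InputSection offset
absorbs the difference.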
>>>>>
>>>>> The BFD linker seems to be doing the equivalent calculation with an
>>>>> extra modification of the (Offset within InputSection given by
>>>>> r_offset) and is rounding down the result to the nearest 4-byte
>>>>> boundary. This looks unfamiliar to me, and could well be specific to
>>>>> ArcCompact. I think that you will need to refer to the ELF ABI
>>>>> documentation as this should tell you if there are any processor
>>>>> specific modifications to generic ELF that you have to follow.
>>>> I implemented the modification of P for ARC:
>>>>
>>>> static void modifyARCAddrLoc(uint64_t &AddrLoc, const uint16_t EMachine,
>>>>                              RelExpr Expr, uint32_t Type, uint64_t VMA,
>>>>                              uint64_t OutSecOff, uint64_t RelOff) {
>>>>   // Only ARC targets with PC-relative expressions need the adjustment.
>>>>   // The inner checks must be joined with &&; "X != A || X != B" is
>>>>   // always true and would make this function return unconditionally.
>>>>   if ((EMachine != EM_ARC_COMPACT && EMachine != EM_ARC_COMPACT2) ||
>>>>       (Expr != R_PC && Expr != R_GOT_PC))
>>>>     return;
>>>>
>>>>   // Mirrors arc-ld: subtract 4 when the relocation bitsize is >= 32.
>>>>   uint64_t M = 0;
>>>>   if (Type == R_ARC_32_PCREL || Type == R_ARC_PC32 ||
>>>>       Type == R_ARC_GOTPC32 || Type == R_ARC_GOTPC)
>>>>     M = 4;
>>>>   // Round the place down to a 4-byte boundary.
>>>>   AddrLoc = (VMA + OutSecOff + RelOff - M) & ~0x3;
>>>> }
>>>>
>>>> modifyARCAddrLoc(AddrLoc, Config->EMachine, Expr, Type,
>>>>                  getOutputSection()->Addr, // VMA is important!
>>>>                  cast<InputSection>(this)->OutSecOff, Rel.Offset);
>>>>
>>>>
>>>>> The other thing that you should do is try and work out why the VA
>>>>> (vma) is 6 in LD and 8 in LLD and whether this is actually a problem.
>>>>> The VA of the OutputSection is not guaranteed to be the same between
>>>>> different linkers so it may have just been that differences in order
>>>>> of InputSections or alignment has caused a different VA. I would check
>>>>> the output of the linker map file to see where it placed the Output
>>>>> and Input Sections to see what the answer should be.
>>>> LLD's getOutputSection()->Addr comes from:
>>>> https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L530
>>>>
>>>>
>>>>
>>>>> In summary:
>>>>> It looks like there are some Arc specific things that might need to be
>>>>> done. Unfortunately I don't have any experience with Arc, and I'm not
>>>>> sure the other people that work on LLD do either. I suggest looking at
>>>>> the public ABI documentation and making any arguments for changes
>>>>> based on that documentation, it is worth assuming that we know nothing
>>>>> about Arc, don't have the documentation to hand and don't know where
>>>>> to find it!
>>>>>
>>>>> Hope that is of some help, with a bit more context I might be able to
>>>>> help a bit more, unfortunately I can't spend a lot of time learning
>>>>> about Arc.
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>> On 14 September 2017 at 07:16, Leslie Zhai via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> Hi LLVM developers,
>>>>>>
>>>>>> basic-arc.s:
>>>>>>
>>>>>> main:
>>>>>>      bl memset
>>>>>>
>>>>>> $ arc-elf32-gcc -mcpu=arc600 -o basic-arc.o -c basic-arc.s
>>>>>>
>>>>>> $ arc-elf32-readelf -r basic-arc.o
>>>>>>
>>>>>> Relocation section '.rela.text' at offset 0xd4 contains 1 entries:
>>>>>>     Offset     Info    Type            Sym.Value  Sym. Name + Addend
>>>>>> 00000000  00000611 R_ARC_S25W_PCREL  00000000   memset + 0
>>>>>>
>>>>>> High address: 0x0
>>>>>>
>>>>>> $ arc-elf32-ld -o basic-arc basic-arc.o
>>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/arc600
>>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib/arc600
>>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1
>>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib
>>>>>> --start-group -lgcc -lc -lnosys --end-group -Ttext=0
>>>>>>
>>>>>> DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 1 S: 4 A: 0 P: 0 = (vma: 0 + output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
>>>>>> DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2054
>>>>>>
>>>>>> $ ld.lld -o basic-arc-lld basic-arc.o $ARC_LINKER_LIB -Ttext=0
>>>>>>
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 0 <-- same P as arc-ld
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- same relocation value as arc-ld
>>>>>>
>>>>>> But with a high address that is *not* 0x0, such as 0x6:
>>>>>>
>>>>>> DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 2 S: 12 A: 0 P: 4 = (vma: 6 + output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
>>>>>> DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2058
>>>>>>
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 8 <-- different P?
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
>>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- different relocation value
>>>>>>
>>>>>> How does arc-ld calculate P?
>>>>>>
>>>>>> P = ((reloc_data.input_section->output_section ?
>>>>>> reloc_data.input_section->output_section->vma : 0) +
>>>>>> reloc_data.input_section->output_offset + (reloc_data.reloc_offset -
>>>>>> (reloc_data.bitsize >= 32 ? 4 : 0))) & ~0x3;
>>>>>>
>>>>>> For example, R_ARC_S25W_PCREL's bitsize is < 32, so when vma is 6 and
>>>>>> the output and reloc offsets are 0, P = (6 + 0 + 0 - 0) & ~0x3 = 4.
>>>>>>
>>>>>> How does LLD calculate P (AddrLoc)?
>>>>>>
>>>>>> P = getOutputSection()->Addr + getOffset(Rel.Offset);
>>>>>>
>>>>>> For example, with the same high address 0x6, LLD's P is 8, which
>>>>>> differs from arc-ld's. So do I need to modify the value of P for the
>>>>>> R_PC case in getRelocTargetVA? Please give me some hints, thanks a lot!
>>>>>>
>>>>>>
>>>>>> PS: arc-ld's FORMULA for R_ARC_S25W_PCREL is ( ( S + A ) - P ) >> 2, and
>>>>>> it needs a middle-endian conversion, so:
>>>>>>
>>>>>> Insn = middleEndianConvert(Insn, TRUE);
>>>>>>
>>>>>> Insn = replaceDisp25w(Insn, ( ( S + A ) - P ) >> 2);
>>>>>>
>>>>>> Insn = middleEndianConvert(Insn, TRUE);
>>>>>>
>>>>>> write32le(Loc, Insn);
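
The convert / patch / convert-back sequence above can be sketched in C++
like this. It is a simplified model, not the binutils code: the names are
hypothetical, the middle-endian step is assumed to swap the two 16-bit
halves of the word, and only a 9-bit displacement field at bit 18 is
patched, a position inferred from the Insn values in the debug output
(2050 becoming 2054 and 2058). The real replaceDisp25w covers more bits.

```cpp
#include <cassert>
#include <cstdint>

// Assumed middle-endian conversion: swap the two 16-bit halves.
constexpr uint32_t middleEndianSwap(uint32_t Insn) {
  return (Insn >> 16) | (Insn << 16);
}

// Hypothetical, simplified stand-in for replaceDisp25w: writes only the
// low 9 bits of the displacement into bits 18..26 of the instruction.
constexpr uint32_t insertLowDisp9(uint32_t Insn, uint32_t Disp) {
  return (Insn & ~(uint32_t(0x1ff) << 18)) | ((Disp & 0x1ff) << 18);
}

// Convert, patch, convert back, as in the sequence above.
constexpr uint32_t patchBranch(uint32_t Raw, uint32_t Disp) {
  return middleEndianSwap(insertLowDisp9(middleEndianSwap(Raw), Disp));
}

// Insn values from the debug output: 2050 (0x802) before relocation,
// 2054 (0x806) for relocation 1 and 2058 (0x80a) for relocation 2.
static_assert(patchBranch(0x802, 1) == 0x806, "Rel: 1 gives Insn 2054");
static_assert(patchBranch(0x802, 2) == 0x80a, "Rel: 2 gives Insn 2058");
```

Under these assumptions the two results printed in the thread differ only
because the displacement is 1 for P = 0 and 2 for P = 4.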
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>> --
>>>> Regards,
>>>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>>>
>>>>
>>>>
>> --
>> Regards,
>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>
>>
>>

-- 
Regards,
Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/




