[llvm-dev] Do I need to modify the AddrLoc of LLD for ARC target?

Peter Smith via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 18 05:44:05 PDT 2017


Hello Leslie,

I'm not sure exactly what your question is, so if the comments below
miss the point, please could you ask some explicit questions? Here is
what I can see in the output.

From your Arc map files, it looks like both linkers have given the
.text output section the correct base address given the alignment
restrictions. The alignment requirement of .text from
lib_a-memset-bs.o is 4, therefore the alignment requirement of the
OutputSection .text should be 4:
LLD:
Address    Size              Alignment
00000000 00000080     4 .text
00000000 00000004     1         basic-arc.o:(.text)
00000000 00000000     0                 main
00000004 0000007c     4         ... (lib_a-memset-bs.o):(.text)

LD
.text           0x0000000000000000       0x80
 *(.text .stub .text.* .gnu.linkonce.t.*)
 .text          0x0000000000000000        0x4 basic-arc.o
 .text          0x0000000000000004       0x7c ... libc.a(lib_a-memset-bs.o)
                0x0000000000000004                memset
                0x0000000000000060                __strncpy_bzero
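
As a rough sketch of the reasoning above (this is not LLD's or LD's
actual code; InputSec, OutputLayout, alignUp and layout are names made
up for illustration), the output section takes the largest alignment of
its input sections, and each input section is placed at the next offset
that satisfies its own alignment:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a linker's input section.
struct InputSec {
  uint64_t Size;
  uint64_t Align;     // assumed to be a power of two
  uint64_t OutSecOff; // offset within the output section, set by layout()
};

// Hypothetical summary of the resulting output section.
struct OutputLayout {
  uint64_t Align;
  uint64_t Size;
};

// Round Value up to the next multiple of Align (a power of two).
inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Place the input sections one after another and derive the output
// section's alignment as the maximum of the input alignments.
inline OutputLayout layout(std::vector<InputSec> &Secs) {
  OutputLayout Out{1, 0};
  for (InputSec &S : Secs) {
    if (S.Align > Out.Align)
      Out.Align = S.Align;
    Out.Size = alignUp(Out.Size, S.Align);
    S.OutSecOff = Out.Size;
    Out.Size += S.Size;
  }
  return Out;
}
```

Feeding it the two .text input sections from the map files (size 0x4
with alignment 1, then size 0x7c with alignment 4) gives an output
section of alignment 4 and size 0x80, with the second input section at
offset 0x4, which is what both linkers report.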

I'm not entirely sure where the Arm example has come from, but it does
show an interesting difference. It looks like the linkers handle the
-Ttext=<address> option slightly differently when <address> is not 0
modulo the OutputSection alignment.

From the map files we can see that lld rounds the OutputSection address
up to the next 4-byte boundary, whereas GNU ld places the OutputSection
at the requested address but adds padding before the .text InputSection
so that the InputSection is aligned in the final executable.

LLD
Address  Size     Align Out     In      Symbol
00011008 00000018     4 .text
00011008 00000018     4         arm-thumb-undefined-weak.o:(.text)
00011008 00000000     0                 $t.0
00011008 00000000     0                 _start

LD
.text           0x0000000000011006       0x1a
...
 *fill*         0x0000000000011006        0x2
 .text          0x0000000000011008       0x18 arm-thumb-undefined-weak.o

The *fill* is visible in the disassembly of the LD-produced image as the
2-byte instruction at 0x11006 (00 00, shown as movs r0, r0), which is
effectively a nop.
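
To make the two behaviours concrete, here is a small sketch that
reproduces the numbers in the two map files. It only models what the
map files show; it is not taken from either linker's source, and
Placement, placeAlignedStart and placeWithFill are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record of where an output section and its first
// input section end up.
struct Placement {
  uint64_t SecAddr;        // address of the output section
  uint64_t Fill;           // padding bytes before the first input section
  uint64_t FirstInputAddr; // address of the first input section
  uint64_t SecSize;        // size of the output section, including padding
};

// Round Value up to the next multiple of Align (a power of two).
inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Behaviour seen in the LLD map file: move the section start up to
// the next aligned address, so no padding is needed.
inline Placement placeAlignedStart(uint64_t Requested, uint64_t Align,
                                   uint64_t InputSize) {
  uint64_t Start = alignUp(Requested, Align);
  return {Start, 0, Start, InputSize};
}

// Behaviour seen in the LD map file: keep the requested start address
// and pad inside the section until the input section is aligned.
inline Placement placeWithFill(uint64_t Requested, uint64_t Align,
                               uint64_t InputSize) {
  uint64_t First = alignUp(Requested, Align);
  uint64_t Fill = First - Requested;
  return {Requested, Fill, First, Fill + InputSize};
}
```

With Requested = 0x11006, Align = 4 and InputSize = 0x18, the first
function yields a section at 0x11008 of size 0x18, and the second a
section at 0x11006 of size 0x1a with 2 bytes of fill. In both cases the
input section itself lands at 0x11008.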

Strictly speaking, I think LD is producing a file that doesn't conform
to ELF here, as the sh_addr of the .text OutputSection (0x11006) is not
0 modulo sh_addralign (4). In practice it probably wouldn't make much
difference. My preference is for LLD's behaviour here.
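
The ELF rule in question is that sh_addr must be congruent to 0 modulo
sh_addralign, with sh_addralign values of 0 and 1 meaning no alignment
constraint. A minimal check of that rule, assuming the section header
values have already been read (isAddrAligned is a made-up helper name,
not an LLVM API):

```cpp
#include <cstdint>

// Returns true if a section address satisfies the ELF requirement that
// sh_addr is 0 modulo sh_addralign. Alignments of 0 and 1 impose no
// constraint.
inline bool isAddrAligned(uint64_t ShAddr, uint64_t ShAddrAlign) {
  if (ShAddrAlign <= 1)
    return true;
  return ShAddr % ShAddrAlign == 0;
}
```

For the addresses in the map files above, isAddrAligned(0x11006, 4) is
false and isAddrAligned(0x11008, 4) is true.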

Peter



On 18 September 2017 at 03:28, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:
> Hi Peter,
>
> Map file about LD for ARC target
> https://drive.google.com/open?id=0ByE8c-y74l_uRWpQdUh2c0VXZ1k
>
> LLD for ARC https://drive.google.com/open?id=0ByE8c-y74l_ueGVuYkR0a3RSWjQ
>
>
> arm-thumb-undefined-weak.s
> https://github.com/llvm-mirror/lld/blob/master/test/ELF/arm-thumb-undefined-weak.s
>
> $ llvm/build/bin/llvm-mc -filetype=obj -triple=thumbv7a-none-linux-gnueabi
> arm-thumb-undefined-weak.s -o arm-thumb-undefined-weak.o
> $ llvm/build/bin/ld.lld -o arm-thumb-undefined-weak-lld
> arm-thumb-undefined-weak.o -Ttext=11006
> $ arm-linux-gnu-ld -o arm-thumb-undefined-weak-ld arm-thumb-undefined-weak.o
> -Ttext=11006
>
> $ arm-linux-gnu-readelf -r arm-thumb-undefined-weak.o
>
> Relocation section '.rel.text' at offset 0x8c contains 6 entries:
>  Offset     Info    Type            Sym.Value  Sym. Name
> 00000000  00000333 R_ARM_THM_JUMP19  00000000   target
> 00000004  0000031e R_ARM_THM_JUMP24  00000000   target
> 00000008  0000030a R_ARM_THM_CALL    00000000   target
> 0000000c  0000030a R_ARM_THM_CALL    00000000   target
> 00000010  00000332 R_ARM_THM_MOVT_PR 00000000   target
> 00000014  00000331 R_ARM_THM_MOVW_PR 00000000   target
>
>
> DEBUG: lld: R_ARM_THM_JUMP19 TargetVA: 0 A: -4 P: 69640 Align: 4 VMA: 69640
> Output Offset: 0 Reloc Offset: 0
> DEBUG: lld: R_ARM_THM_JUMP24 TargetVA: 0 A: -4 P: 69644 Align: 4 VMA: 69640
> Output Offset: 0 Reloc Offset: 4
> DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69648 Align: 4 VMA: 69640
> Output Offset: 0 Reloc Offset: 8
> DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69652 Align: 4 VMA: 69640
> Output Offset: 0 Reloc Offset: 12
> DEBUG: lld: R_ARM_THM_MOVT_PREL TargetVA: 0 A: 0 P: 69656 Align: 4 VMA:
> 69640 Output Offset: 0 Reloc Offset: 16
> DEBUG: lld: R_ARM_THM_MOVW_PREL_NC TargetVA: 0 A: 0 P: 69660 Align: 4 VMA:
> 69640 Output Offset: 0 Reloc Offset: 20
>
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP19: VMA: 69638 Output Offset: 2 Reloc
> Offset: 0
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP24: VMA: 69638 Output Offset: 2 Reloc
> Offset: 4
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc
> Offset: 8
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc
> Offset: 12
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVT_PREL: VMA: 69638 Output Offset: 2
> Reloc Offset: 16
> DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVW_PREL_NC: VMA: 69638 Output Offset: 2
> Reloc Offset: 20
>
>
> $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d
> arm-thumb-undefined-weak-lld
>
> arm-thumb-undefined-weak-lld:   file format ELF32-arm-little
>
> Disassembly of section .text:
> _start:
>    11008:       00 f0 00 80     beq.w   #0 <_start+0x4>
>    1100c:       00 f0 00 b8     b.w     #0 <_start+0x8>
>    11010:       00 f0 00 f8     bl      #0
>    11014:       00 f0 00 f8     bl      #0
>    11018:       c0 f2 00 00     movt    r0, #0
>    1101c:       40 f2 00 00     movw    r0, #0
>
>
> $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d
> arm-thumb-undefined-weak-ld
>
> arm-thumb-undefined-weak-ld:    file format ELF32-arm-little
>
> Disassembly of section .text:
> .text:
>    11006:       00 00   movs    r0, r0
>
> _start:
>    11008:       2e f4 fa af     beq.w   #-69644
>    1100c:       00 e0   b       #0 <_start+0x8>
>    1100e:       00 bf   nop
>    11010:       00 e0   b       #0 <_start+0xC>
>    11012:       00 bf   nop
>    11014:       00 e0   b       #0 <_start+0x10>
>    11016:       00 bf   nop
>    11018:       cf f6 fe 70     movt    r0, #65534
>    1101c:       4e f6 e4 70     movw    r0, #61412
>
>
>
>
> On 2017-09-15 20:49, Peter Smith wrote:
>>
>> Just a thought I had about the calculation of P. I think that
>> following the ld approach too closely may be a mistake.
>>
>> I'm speculating that the reason for this change in the value of P is
>> similar to the situation in Arm for a Thumb BLX immediate instruction
>> (Branch Link and Exchange with the immediate an offset from the PC).
>> When calculating the target address the immediate is added to
>> Align(PC, 4) where Align rounds down to nearest 4-byte boundary. The
>> linker needs to account for this when resolving the relocation
>> R_ARM_THM_CALL.
>>
>> To handle the alignment difference for this one special case in lld I
>> accounted for the alignment difference in relocateOne. You may be able
>> to use a similar method for Arc rather than writing modifyARCAddrLoc.
>> Again I know nothing about Arc so you'll need to look at the
>> Architecture reference manual to understand how the instruction that
>> the relocation applies to works.
>>
>> Peter
>>
>>
>> On 15 September 2017 at 04:19, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:
>>>
>>> Hi Peter,
>>>
>>> Thanks for your kind response!
>>>
>>>
>>> On 2017-09-14 17:36, Peter Smith wrote:
>>>>
>>>> Hello Leslie,
>>>>
>>>> I think we are going to need to know a bit more about the ELF ABI for
>>>> what looks like the ArcCompact before we can help you.
>>>
>>> https://github.com/foss-for-synopsys-dwc-arc-processors/arc-ABI-manual
>>>
>>> But I prefer to read the BFD linker's source code for ARC instead:
>>> 1. Specific e_flags
>>>
>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc.h
>>> 2. Relocation define
>>>
>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc-reloc.def
>>> 3. Relocation replace function
>>>
>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/opcode/arc-func.h
>>> 4. Calculation of S, A, P, PDATA, GOT, etc.
>>>
>>> https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/bfd/elf32-arc.c#L1156
>>>
>>>
>>>> LLD's calculation of P (the place to be relocated) is as it is in the
>>>> generic ELF specification. The Rel.Offset corresponds to the ELF
>>>> r_offset field. This is covered by: "For a relocatable file, the value
>>>> is the byte offset from the beginning of the section to the storage
>>>> unit affected by the relocation."
>>>>
>>>> For LLD we are calculating the virtual address (VA) of P, as I
>>>> understand it this is equivalent to the vma used in BFD. Assuming that
>>>> the relocation is relocating a regular InputSection from the
>>>> basic-arc.o object then the LLD calculation of P =
>>>> getOutputSection()->Addr + getOffset(Rel.Offset); translates to: (VA
>>>> of OutputSection) + (Offset of InputSection within OutputSection) +
>>>> (Offset within InputSection given by r_offset)
>>>>
>>>> The BFD linker seems to be doing the equivalent calculation with an
>>>> extra modification of the (Offset within InputSection given by
>>>> r_offset) and is rounding down the result to the nearest 4-byte
>>>> boundary. This looks unfamiliar to me, and could well be specific to
>>>> ArcCompact. I think that you will need to refer to the ELF ABI
>>>> documentation as this should tell you if there are any processor
>>>> specific modifications to generic ELF that you have to follow.
>>>
>>> I implemented the MOD P for ARC:
>>>
>>> static void modifyARCAddrLoc(uint64_t &AddrLoc, const uint16_t EMachine,
>>>                               RelExpr Expr, uint32_t Type, uint64_t VMA,
>>>                               uint64_t OutSecOff, uint64_t RelOff) {
>>>    // Only ARC PC-relative expressions need the adjusted P.
>>>    if ((EMachine != EM_ARC_COMPACT && EMachine != EM_ARC_COMPACT2) ||
>>>        (Expr != R_PC && Expr != R_GOT_PC))
>>>      return;
>>>
>>>    // Mirrors BFD: subtract 4 when the relocation bitsize >= 32.
>>>    uint64_t M = 0;
>>>    if (Type == R_ARC_32_PCREL || Type == R_ARC_PC32 ||
>>>        Type == R_ARC_GOTPC32 || Type == R_ARC_GOTPC)
>>>      M = 4;
>>>    // Round P down to a 4-byte boundary.
>>>    AddrLoc = (VMA + OutSecOff + RelOff - M) & ~uint64_t(0x3);
>>> }
>>>
>>> modifyARCAddrLoc(AddrLoc, Config->EMachine, Expr, Type,
>>>                       getOutputSection()->Addr,  // VMA is important!
>>>                       cast<InputSection>(this)->OutSecOff, Rel.Offset);
>>>
>>>
>>>> The other thing that you should do is try and work out why the VA
>>>> (vma) is 6 in LD and 8 in LLD and whether this is actually a problem.
>>>> The VA of the OutputSection is not guaranteed to be the same between
>>>> different linkers so it may have just been that differences in order
>>>> of InputSections or alignment has caused a different VA. I would check
>>>> the output of the linker map file to see where it placed the Output
>>>> and Input Sections to see what the answer should be.
>>>
>>> LLD's getOutputSection()->Addr =
>>> https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L530
>>>
>>>
>>>
>>>> In summary:
>>>> It looks like there are some Arc specific things that might need to be
>>>> done. Unfortunately I don't have any experience with Arc, and I'm not
>>>> sure the other people that work on LLD do either. I suggest looking at
>>>> the public ABI documentation and making any arguments for changes
>>>> based on that documentation, it is worth assuming that we know nothing
>>>> about Arc, don't have the documentation to hand and don't know where
>>>> to find it!
>>>>
>>>> Hope that is of some help, with a bit more context I might be able to
>>>> help a bit more, unfortunately I can't spend a lot of time learning
>>>> about Arc.
>>>>
>>>> Peter
>>>>
>>>>
>>>> On 14 September 2017 at 07:16, Leslie Zhai via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>> Hi LLVM developers,
>>>>>
>>>>> basic-arc.s:
>>>>>
>>>>> main:
>>>>>     bl memset
>>>>>
>>>>> $ arc-elf32-gcc -mcpu=arc600 -o basic-arc.o -c
>>>>>
>>>>> $ arc-elf32-readelf -r basic-arc.o
>>>>>
>>>>> Relocation section '.rela.text' at offset 0xd4 contains 1 entries:
>>>>>    Offset     Info    Type            Sym.Value  Sym. Name + Addend
>>>>> 00000000  00000611 R_ARC_S25W_PCREL  00000000   memset + 0
>>>>>
>>>>> High address: 0x0
>>>>>
>>>>> $ arc-elf32-ld -o basic-arc basic-arc.o
>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/arc600
>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib/arc600
>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1
>>>>> -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib
>>>>> --start-group -lgcc -lc -lnosys --end-group -Ttext=0
>>>>>
>>>>> DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 1 S: 4 A: 0 P: 0 = (vma: 0 +
>>>>> output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
>>>>> DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2054
>>>>>
>>>>> $ ld.lld -o basic-arc-lld basic-arc.o $ARC_LINKER_LIB -Ttext=0
>>>>>
>>>>> DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 0 <-- same P as arc-ld
>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- same relocation value as arc-ld
>>>>>
>>>>> But with a different high address that is *not* 0x0, such as 0x6:
>>>>>
>>>>> DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 2 S: 12 A: 0 P: 4 = (vma: 6 +
>>>>> output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
>>>>> DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2058
>>>>>
>>>>> DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 8 <-- different P?
>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
>>>>> DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- different relocation value
>>>>>
>>>>> How does arc-ld calculate P?
>>>>>
>>>>> P = ((reloc_data.input_section->output_section ?
>>>>> reloc_data.input_section->output_section->vma : 0) +
>>>>> reloc_data.input_section->output_offset + (reloc_data.reloc_offset -
>>>>> (reloc_data.bitsize >= 32 ? 4 : 0))) & ~0x3;
>>>>>
>>>>> For example, R_ARC_S25W_PCREL's bitsize is < 32, so when vma is 6 and
>>>>> the output and reloc offsets are 0, P = (6 + 0 + 0 - 0) & ~0x3 = 4.
>>>>>
>>>>> How does LLD calculate P (AddrLoc)?
>>>>>
>>>>> P = getOutputSection()->Addr + getOffset(Rel.Offset);
>>>>>
>>>>> For example, with the same high address 0x6, LLD's P is 8, which
>>>>> differs from arc-ld. So do I need to modify the value of P for the
>>>>> R_PC case in getRelocTargetVA? Please give me some hints, thanks a lot!
>>>>>
>>>>>
>>>>> PS: arc-ld's formula for R_ARC_S25W_PCREL is ((S + A) - P) >> 2, and it
>>>>> needs a middle-endian conversion, so:
>>>>>
>>>>> Insn = middleEndianConvert(Insn, TRUE);
>>>>>
>>>>> Insn = replaceDisp25w(Insn, ((S + A) - P) >> 2);
>>>>>
>>>>> Insn = middleEndianConvert(Insn, TRUE);
>>>>>
>>>>> write32le(Loc, Insn);
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>> --
>>> Regards,
>>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>>
>>>
>>>
>
> --
> Regards,
> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>
>
>

