[LLVMdev] [lld] Representation of lld::Reference with a fake target

Michael Spencer bigcheesegs at gmail.com
Fri Feb 6 17:58:38 PST 2015


On Fri, Feb 6, 2015 at 5:54 PM, Rui Ueyama <ruiu at google.com> wrote:
> On Fri, Feb 6, 2015 at 5:42 PM, Michael Spencer <bigcheesegs at gmail.com>
> wrote:
>>
>> On Fri, Feb 6, 2015 at 5:31 PM, Rui Ueyama <ruiu at google.com> wrote:
>> > There are two questions.
>> >
>> > Firstly, do you think the on-disk format needs to be compatible with a
>> > C++ struct so that we can cast that memory buffer to the struct? That
>> > may be super-fast, but it also comes with many limitations. It's hard to
>> > extend, for example. Every time we want to store variable-length objects
>> > we need to define a string-table-like data structure. And I'm not sure
>> > that it's the fastest: because mmap'able objects are not very compact on
>> > disk, slow disk IO could be a bottleneck compared with a more compact
>> > file format. I believe Protobufs or Thrift are fast enough, or might
>> > even be faster.
>>
>> I'm not sure here. Although I do question if the object files will
>> even need to be read from disk in your standard edit/compile/debug
>> loop or on a build server. I believe we'll need real data to determine
>> this.
>>
>> >
>> > Secondly, do you know why we are dumping the post-linked object file to
>> > the Native format? If we want to have a different kind of *object* file
>> > format, we would want a tool to convert an object file in an existing
>> > format (say, ELF) to "native", and teach LLD how to read from that file.
>> > Currently we are writing a file in the middle of the linking process,
>> > which doesn't make sense to me.
>>
>> This is an artifact of having the native format before we had any
>> readers. I agree that it's weird and not terribly useful to write to
>> native format in the middle of the link, although I have found it
>> helpful to output yaml. There's no need to be able to read it back in
>> and resume though.
>
>
> Even for YAML it doesn't make much sense to write it to a file and read it
> back from the file in the middle of the link, does it? I found that being
> able to output YAML is useful too, but round-trip is a different thing. In
> the middle of the process, we have a bunch of additional information that
> doesn't exist in the input files and doesn't have to be output to the link
> result. The ability to serialize that intermediate result is not useful.

Completely agree here. We should round-trip the input instead.

- Michael Spencer

>
> Shankar, you added these round-trip tests. Do you have any opinion?
>
>> Ideally lld -r would be the tool we use to convert COFF/ELF/MachO to
>> the native format.
>>
>> - Michael Spencer
>>
>> >
>> > On Fri, Feb 6, 2015 at 5:02 PM, Michael Spencer <bigcheesegs at gmail.com>
>> > wrote:
>> >>
>> >> On Fri, Feb 6, 2015 at 2:54 PM, Rui Ueyama <ruiu at google.com> wrote:
>> >> > Can we remove Native format support? I'd like to get input from
>> >> > anyone who wants to keep the current Native format in LLD.
>> >>
>> >> One of the original goals for LLD was to provide a new object file
>> >> format for performance. The reason it is not used currently is because
>> >> we've yet to teach llvm to generate it, and we haven't done that
>> >> because it hasn't been finalized yet. The value it currently provides
>> >> is catching stuff like this, so we can fix it now instead of down the
>> >> road when we actually productize the native format.
>> >>
>> >> As for the specific implementation of the native format, I'm open to
>> >> an extensible format, but only if the performance cost is low.
>> >>
>> >> - Michael Spencer
>> >>
>> >> >
>> >> > On Thu, Feb 5, 2015 at 2:03 PM, Shankar Easwaran
>> >> > <shankare at codeaurora.org>
>> >> > wrote:
>> >> >>
>> >> >> The only way currently is to create a new reference, unless we can
>> >> >> think of adding some target-specific metadata to the Atom model.
>> >> >>
>> >> >> This has come up over and over again: we need something in the Atom
>> >> >> model to store information that is target-specific.
>> >> >>
>> >> >> Shankar Easwaran
>> >> >>
>> >> >>
>> >> >> On 2/5/2015 2:22 PM, Simon Atanasyan wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> I need advice on implementing a very specific kind of relocation
>> >> >>> used by the MIPS N64 ABI. As usual, the main problem is how to pass
>> >> >>> target-specific data across the Native/YAML conversion barrier.
>> >> >>>
>> >> >>> In this ABI, the r_info field of a relocation record in fact
>> >> >>> consists of five subfields:
>> >> >>> * r_sym   - symbol index
>> >> >>> * r_ssym  - special symbol
>> >> >>> * r_type3 - third relocation type
>> >> >>> * r_type2 - second relocation type
>> >> >>> * r_type  - first relocation type
>> >> >>>
>> >> >>> Up to three of these relocations are applied one after another. The
>> >> >>> first relocation uses the addend from the relocation record. Each
>> >> >>> subsequent relocation takes as its addend the result of the previous
>> >> >>> operation. Only the final operation actually modifies the location
>> >> >>> being relocated. The first relocation uses the symbol specified by
>> >> >>> the r_sym field as its reference symbol. The third relocation
>> >> >>> assumes a NULL symbol.
>> >> >>>
>> >> >>> The most interesting case is the second relocation. It uses the
>> >> >>> special symbol value given by the r_ssym field. This field can
>> >> >>> contain one of four predefined values:
>> >> >>> * RSS_UNDEF - zero value
>> >> >>> * RSS_GP    - value of gp symbol
>> >> >>> * RSS_GP0   - gp0 value taken from the .MIPS.options or .reginfo
>> >> >>>               section
>> >> >>> * RSS_LOC   - address of location being relocated
>> >> >>>
>> >> >>> So the problem is how to store these four constants in the
>> >> >>> lld::Reference object. RSS_UNDEF is obviously not a problem. To
>> >> >>> represent the RSS_GP value I can set an AbsoluteAtom created for
>> >> >>> "_gp" as the reference's target. But what about RSS_GP0 and RSS_LOC?
>> >> >>> I am considering the following approaches but cannot select the best
>> >> >>> one:
>> >> >>>
>> >> >>> a) Create an AbsoluteAtom for each of these cases and set it as the
>> >> >>>    reference's target. The problem is that these atoms are fake and
>> >> >>>    should not go into the symbol table. One more problem is
>> >> >>>    selecting unique names for these atoms.
>> >> >>> b) Use the two high bits of the lld::Reference::_kindValue field to
>> >> >>>    encode the RSS_xxx value, then decode these bits in the
>> >> >>>    RelocationHandler to calculate the result of the relocation. In
>> >> >>>    that case the problem is how to represent a relocation kind
>> >> >>>    value in YAML format. The simple
>> >> >>>    xxxRelocationStringTable::kindStrings[] array will not be enough.
>> >> >>> c) Add one more field to the lld::Reference class, something like
>> >> >>>    the DefinedAtom::CodeModel field.
>> >> >>>
>> >> >>> Any advice, ideas, and/or objections are much appreciated.
>> >> >>>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> >> >> hosted
>> >> >> by the Linux Foundation
>> >> >>
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > LLVM Developers mailing list
>> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >> >
>> >
>> >
>
>
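
For reference, the relocation scheme from Simon's original message and his
option (b) can be sketched roughly as below. This is only an illustration
under stated assumptions, not lld code: the field widths (a 32-bit r_sym
followed by four one-byte fields), the numeric RSS_* values, the treatment of
relocation type 0 as "no further step", and a 16-bit kind value are all
assumptions, and N64Info, decodeInfo, specialValue, Calc, runChain, packKind,
kindType and kindSsym are hypothetical names that do not exist in lld.

```cpp
#include <cassert>
#include <cstdint>

// r_ssym selectors named in the thread. The numeric values are an assumption.
enum SpecialSym : uint8_t { RSS_UNDEF = 0, RSS_GP = 1, RSS_GP0 = 2, RSS_LOC = 3 };

// The five r_info subfields. The field widths are an assumption.
struct N64Info {
  uint32_t sym;
  uint8_t ssym;
  uint8_t type3;
  uint8_t type2;
  uint8_t type;
};

// Decode the 8 raw bytes of r_info in file order: a 4-byte symbol index in
// the file's endianness, followed by four single-byte fields.
N64Info decodeInfo(const uint8_t (&raw)[8], bool littleEndian) {
  uint32_t sym = 0;
  for (int i = 0; i < 4; ++i) {
    int shift = littleEndian ? 8 * i : 8 * (3 - i);
    sym |= uint32_t(raw[i]) << shift;
  }
  return N64Info{sym, raw[4], raw[5], raw[6], raw[7]};
}

// Value the second relocation uses in place of a symbol value.
uint64_t specialValue(uint8_t ssym, uint64_t gp, uint64_t gp0, uint64_t loc) {
  switch (ssym) {
  case RSS_GP:  return gp;
  case RSS_GP0: return gp0;
  case RSS_LOC: return loc;
  default:      return 0; // RSS_UNDEF
  }
}

// Hypothetical per-type calculation: (type, symbol value, addend) -> result.
using Calc = uint64_t (*)(uint8_t type, uint64_t symValue, uint64_t addend);

// Chain up to three relocations: each step takes the previous result as its
// addend, and only the returned value would be written to the location.
uint64_t runChain(const N64Info &info, uint64_t symValue, uint64_t ssymValue,
                  uint64_t addend, Calc calc) {
  uint64_t result = calc(info.type, symValue, addend);
  if (info.type2 != 0)
    result = calc(info.type2, ssymValue, result);
  if (info.type3 != 0)
    result = calc(info.type3, 0, result); // third step assumes a NULL symbol
  return result;
}

// Option (b): keep the RSS_xxx selector in the two high bits of a 16-bit kind.
constexpr unsigned kRssShift = 14;
constexpr uint16_t kTypeMask = (1u << kRssShift) - 1;

uint16_t packKind(uint16_t relocType, uint8_t ssym) {
  assert(relocType <= kTypeMask && ssym <= RSS_LOC);
  return static_cast<uint16_t>(relocType | (ssym << kRssShift));
}

uint16_t kindType(uint16_t kind) {
  return static_cast<uint16_t>(kind & kTypeMask);
}

uint8_t kindSsym(uint16_t kind) {
  return static_cast<uint8_t>(kind >> kRssShift);
}
```

With a packing like this, a flat kind-to-string table no longer covers every
kind value, which is the YAML concern raised for option (b): the writer would
have to print the relocation type and the RSS selector separately.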


