[PATCH] D22683: [ELF] Symbol assignment within input section list

Tue Aug 2 14:55:54 PDT 2016

On Tue, Aug 2, 2016 at 2:45 PM, Eugene Leviant <evgeny.leviant at gmail.com>
wrote:

>
>
> On Wednesday, August 3, 2016, Rui Ueyama wrote:
>
>> On Tue, Aug 2, 2016 at 3:00 AM, Eugene Leviant <evgeny.leviant at gmail.com>
>> wrote:
>>
>>> evgeny777 added a comment.
>>>
>>> I think the main reason we're using virtual input sections is that this is
>>> the only way to calculate the correct symbol offset. As you may know, the
>>> location counter is not incremented while we add input sections to an
>>> output section, and the true size of the input sections is known only
>>> after the call to OutputSectionBase<ELFT>::assignOffsets().
>>>
>>> So if you can suggest an algorithm that calculates the correct symbol
>>> value (w/o using virtual input sections) in the case below:
>>>
>>>   .foo : { *(.foo); end_foo = .; *(.bar) }
>>>
>>> then we can probably switch to absolute symbols (BTW we can also use
>>> synthetic symbols - there is little difference, if any).
>>> Another interesting question is what will happen if we define an absolute
>>> symbol in a shared object and reference it in an executable. For example:
>>>
>>>   /* script for linking shared library */
>>>   SECTIONS { .text : { text_start = .; *(.text) } }
>>>
>>> So, when the shared library is loaded by an application, what value would
>>> text_start have if it is absolute? I don't know yet, but will try.
>>>
>>
>> At first, I suggested you use empty dummy input sections to define
>> linker-script-defined symbols in the hope that this way we wouldn't need to
>> fix symbol addresses later (I was hoping that symbol addresses would be
>> fixed automatically as the attached input sections get their final output
>> addresses). Now we know that it doesn't work for many possible use cases.
>> So maybe we want to eliminate the dummy sections and directly define the
>> symbols as absolute (or section) symbols.
>>
>
> Like I said, the main problem is calculating this "absolute value". How
> are you going to do this? Also, like George said, it is not correct to use
> absolute values for symbols defined inside an output section description.
>

I think you don't need to calculate absolute values. We know the relative
distance between the beginning of the current output section and the current
"." value, so we can create a DefinedSynthetic symbol with the output section
and the relative offset.
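
To make that concrete, here is a minimal sketch of the idea. It is not lld's
actual code: OutputSection and SectionRelativeSymbol below are hypothetical
stand-ins (the latter playing the role of DefinedSynthetic), and the numbers
are made up. The point is only that the symbol stores a section and an
offset, so its address falls out once the section gets its address.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an output section; Addr is assigned late.
struct OutputSection {
  std::string Name;
  uint64_t Addr;
};

// Hypothetical stand-in for a section-relative synthetic symbol.
struct SectionRelativeSymbol {
  std::string Name;
  const OutputSection *Sec; // section the symbol is defined in
  uint64_t Offset;          // "." minus the section start at definition time

  // The address is computed on demand, so nothing has to be patched
  // after the section is placed.
  uint64_t getVA() const {
    assert(Sec && "symbol must belong to an output section");
    return Sec->Addr + Offset;
  }
};

// Models ".foo : { *(.foo); end_foo = .; *(.bar) }", assuming *(.foo)
// contributed 0x30 bytes before the assignment (an invented size).
inline uint64_t exampleEndFoo() {
  OutputSection Foo{".foo", 0};
  SectionRelativeSymbol EndFoo{"end_foo", &Foo, 0x30};
  Foo.Addr = 0x401000; // the section address becomes known only now
  return EndFoo.getVA();
}
```

With this shape, no absolute value is ever stored in the symbol, which also
sidesteps the concern about absolute symbols inside an output section
description.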

>
>> In this patch, you are trying to support assignments to symbols. However,
>> we eventually want to support something like this, too.
>>
>>   SECTIONS { .text : { foo.o(.text); . = ALIGN(128); bar.o(.text) } }
>>
>
> I do not see any problem in doing this. I think we use the same
> SymbolInputSection<ELFT> but with non-zero size, so proper layout will be
> calculated automatically in assignOffsets. Does this make sense?
>

I don't think so. Does it work for more complicated inputs, such as

  SECTIONS {
    .data : { *(.data) }
    .text : { foo.o(.text); . += SIZEOF(.data); bar.o(.text) }
  }

?
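
What I have in mind is roughly the following, as a sketch only. Kind,
Command and layoutSection are names I made up for illustration, the input
sizes are invented, and ALIGN is applied to the section-relative offset,
which assumes the section start itself is suitably aligned. The commands of
an output section are walked in order with a running "." value, and
SIZEOF(.data) is looked up from sections that have already been laid out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical command kinds inside one output section description.
enum class Kind { Input, AdvanceBySizeOf, AlignTo };

struct Command {
  Kind K;
  uint64_t Value;      // input size (Input) or alignment (AlignTo)
  std::string Section; // operand of SIZEOF (AdvanceBySizeOf)
};

// Walks the commands once and returns the resulting section size.
// SizeOf holds the sizes of output sections that were laid out earlier.
inline uint64_t layoutSection(const std::vector<Command> &Cmds,
                              const std::map<std::string, uint64_t> &SizeOf) {
  uint64_t Dot = 0; // "." relative to the start of this output section
  for (const Command &C : Cmds) {
    switch (C.K) {
    case Kind::Input:
      Dot += C.Value;
      break;
    case Kind::AdvanceBySizeOf:
      Dot += SizeOf.at(C.Section); // throws if the section is not laid out yet
      break;
    case Kind::AlignTo:
      assert(C.Value != 0 && (C.Value & (C.Value - 1)) == 0);
      Dot = (Dot + C.Value - 1) & ~(C.Value - 1);
      break;
    }
  }
  return Dot;
}

// Models the script above with invented sizes: *(.data) is 0x40 bytes,
// foo.o(.text) is 0x10 bytes and bar.o(.text) is 0x20 bytes.
inline uint64_t exampleTextSize() {
  std::map<std::string, uint64_t> SizeOf;
  std::vector<Command> Data = {Command{Kind::Input, 0x40, ""}};
  SizeOf[".data"] = layoutSection(Data, SizeOf);

  std::vector<Command> Text = {Command{Kind::Input, 0x10, ""},
                               Command{Kind::AdvanceBySizeOf, 0, ".data"},
                               Command{Kind::Input, 0x20, ""}};
  return layoutSection(Text, SizeOf);
}
```

The gap created by ". += SIZEOF(.data)" depends on another section's final
size, so it cannot be a fixed-size dummy section created up front.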

>
>> Looks like this doesn't fit the current architecture. Currently, we
>> create a list of input sections and assign them addresses later. But in
>> order to process the above script, a single pass would fit well. So I'm
>> wondering if we should merge LinkerScript::createSections and
>> LinkerScript::assignOffsets.
>>
>
> How can this be done? We have createThunks() in between.
>

Yeah, we have Thunks. I haven't thought enough about that yet. But why can't
we create thunks earlier, even before createSections?

>>
>>>
>>> ================
>>> Comment at: ELF/LinkerScript.cpp:278
>>> @@ -176,3 +277,3 @@
>>>  // Process ONLY_IF_RO and ONLY_IF_RW.
>>>  template <class ELFT> void LinkerScript<ELFT>::filter() {
>>>    // In this loop, we remove output sections if they don't satisfy
>>> ----------------
>>> ruiu wrote:
>>> > Why did you have to make a change to this function?
>>> Two main reasons:
>>>
>>> 1) During the filtering process some output sections may be removed.
>>> Those sections may contain symbols, and SymbolInputSection objects have
>>> already been created for them. To avoid crashes and/or creating dummy
>>> symbols I have to remove those virtual sections as well.
>>>
>>> 2) The old implementation is not technically correct, because it removes
>>> only the first output section found in name lookup. We're still using
>>> OutputSectionFactory<ELFT>, so we may have several sections with the same
>>> name.
>>>
>>> Another reason (though much less significant) is that one-by-one removal
>>> from std::vector must be slow, because it stores elements in a contiguous
>>> region of memory.
>>>
>>>
>>> https://reviews.llvm.org/D22683
>>>
>>>
>>>
>>>
>>