[PATCH] D22683: [ELF] Symbol assignment within input section list

Tue Aug 2 15:15:31 PDT 2016

2016-08-03 0:55 GMT+03:00 Rui Ueyama <ruiu at google.com>:

> On Tue, Aug 2, 2016 at 2:45 PM, Eugene Leviant <evgeny.leviant at gmail.com>
> wrote:
>
>>
>>
>> On Wednesday, August 3, 2016, Rui Ueyama wrote:
>>
>>> On Tue, Aug 2, 2016 at 3:00 AM, Eugene Leviant <evgeny.leviant at gmail.com
>>> > wrote:
>>>
>>>> evgeny777 added a comment.
>>>>
>>>> I think the main reason we're using virtual input sections is that
>>>> this is the only way to calculate the correct symbol offset. As you may
>>>> know, the location counter is not incremented while we add input sections
>>>> to an output section, and the true size of the input sections is known
>>>> only after the call to OutputSectionBase<ELFT>::assignOffsets().
>>>>
>>>> So if you can suggest an algorithm that calculates the correct symbol
>>>> value (without using virtual input sections) in the case below:
>>>>
>>>>   .foo : { *(.foo); end_foo = .; *(.bar) }
>>>>
>>>> then we can probably switch to absolute symbols (BTW we can also use
>>>> synthetic symbols; there is little difference, if any).
>>>> Another interesting question is what will happen if we define an absolute
>>>> symbol in a shared object and reference it in an executable. For example:
>>>>
>>>>   /* script for linking shared library */
>>>>   SECTIONS { .text : { text_start = .; *(.text) } }
>>>>
>>>> So, when the shared library is loaded by an application, what value would
>>>> text_start have if it is absolute? I don't know yet, but will try.
>>>>
>>>
>>> At first, I suggested you use empty dummy input sections to define
>>> linker-script-defined symbols in the hope that we wouldn't need to
>>> fix symbol addresses later (I was hoping that symbol addresses would be
>>> fixed automatically as the attached input sections get their final output
>>> addresses). Now we know that this doesn't work for many possible use
>>> cases, so maybe we want to eliminate dummy sections and directly define
>>> symbols as absolute (or section) symbols.
>>>
>>
>> Like I said, the main problem is calculating this "absolute value". How
>> are you going to do this? Also, like George said, it is not correct to use
>> absolute values for symbols defined inside an output section description.
>>
>
> I think you don't need to calculate absolute values. We know the relative
> distance between the beginning of the current output section and the
> current "." value, so we can create a DefinedSynthetic symbol with the
> output section and the relative offset.
>
> Still have to deal with thunks, changing input section size, no?
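
To make the section-relative idea concrete, here is a minimal sketch. The
types OutSec and SectionRelativeSym are hypothetical stand-ins, not lld's
actual OutputSectionBase or DefinedSynthetic classes. The symbol stores only
the section and an offset, and its address is computed on demand:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an output section (not lld's real class).
struct OutSec {
  std::string Name;
  uint64_t VA = 0; // final virtual address, assigned late in the link
};

// Hypothetical stand-in for a section-relative synthetic symbol.
struct SectionRelativeSym {
  std::string Name;
  const OutSec *Sec; // section the symbol is defined in
  uint64_t Offset;   // distance from the start of Sec to "."

  // The address is computed on demand, so it follows the section
  // wherever the final layout places it.
  uint64_t getVA() const {
    assert(Sec && "symbol must be attached to a section");
    return Sec->VA + Offset;
  }
};
```

This only removes the dependency on the section's final address. The stored
offset would still go stale if anything placed before the symbol changes
size afterwards, which is the concern about thunks raised above.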

>
>>
>>> In this patch, you are trying to support assignments to symbols.
>>> However, we eventually want to support something like this, too.
>>>
>>>   SECTIONS { .text : { foo.o(.text); . = ALIGN(128); bar.o(.text) } }
>>>
>>
>> I do not see any problem in doing this. I think we can use the same
>> SymbolInputSection<ELFT> but with a non-zero size, so the proper layout
>> will be calculated automatically in assignOffsets. Does this make sense?
>>
>
> I don't think so. Does it work for more complicated inputs, such as
>
>   SECTIONS {
>     .data : { *(.data) }
>     .text : { foo.o(.text); . += SIZEOF(.data); bar.o(.text) }
>   }
>
> ?
>

I think making InputSectionBase<ELFT>::getSize() a virtual method will
solve the problem, won't it?
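
A rough sketch of what I mean, using hypothetical types (Piece, Data,
AlignTo) rather than the real InputSectionBase<ELFT> hierarchy. One
assumption goes beyond a plain virtual getSize(): the current offset is
passed in, because the size of an ALIGN-style command depends on where it
lands:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified model; not lld's InputSectionBase<ELFT>.
struct Piece {
  virtual ~Piece() = default;
  // Size of this piece when placed at offset Off in the output section.
  virtual uint64_t getSize(uint64_t Off) const = 0;
};

// An ordinary input section with a fixed size.
struct Data : Piece {
  uint64_t Size;
  explicit Data(uint64_t S) : Size(S) {}
  uint64_t getSize(uint64_t) const override { return Size; }
};

// Models ". = ALIGN(N)": occupies the padding up to the next multiple of N.
struct AlignTo : Piece {
  uint64_t Align;
  explicit AlignTo(uint64_t A) : Align(A) {
    assert(A != 0 && (A & (A - 1)) == 0 && "alignment must be a power of two");
  }
  uint64_t getSize(uint64_t Off) const override {
    return ((Off + Align - 1) & ~(Align - 1)) - Off;
  }
};

// Single pass over the pieces, returning the start offset of each one.
std::vector<uint64_t>
assignOffsets(const std::vector<std::unique_ptr<Piece>> &Pieces) {
  std::vector<uint64_t> Offsets;
  uint64_t Off = 0;
  for (const auto &P : Pieces) {
    Offsets.push_back(Off);
    Off += P->getSize(Off);
  }
  return Offsets;
}
```

A ". += SIZEOF(.data)" piece could presumably be modelled the same way, but
only if the size of .data is already final when this loop runs.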

>
>
>>
>>> It looks like this doesn't fit the current architecture. Currently, we
>>> create a list of input sections and assign them addresses later. But a
>>> single pass would be a better fit for processing the above script. So I'm
>>> wondering if we should merge LinkerScript::createSections and
>>> LinkerScript::assignOffsets.
>>>
>>
>> How can this be done? We have createThunks() in between.
>>
>
> Yeah, we have Thunks. I haven't thought enough about that yet. But why
> can't we create thunks earlier, even before createSections?
>

Is this possible? As far as I understand, a thunk contains a jump, which can
be between two input sections (or even output sections). Until you create the
full layout (like we do in createSections), it looks like a tough problem to
solve.

>
>
>>>
>>>>
>>>> ================
>>>> Comment at: ELF/LinkerScript.cpp:278
>>>> @@ -176,3 +277,3 @@
>>>>  // Process ONLY_IF_RO and ONLY_IF_RW.
>>>>  template <class ELFT> void LinkerScript<ELFT>::filter() {
>>>>    // In this loop, we remove output sections if they don't satisfy
>>>> ----------------
>>>> ruiu wrote:
>>>> > Why did you have to make a change to this function?
>>>> Two main reasons:
>>>>
>>>> 1) During the filtering process some output sections may be removed.
>>>> Those sections may contain symbols, and SymbolInputSection objects have
>>>> already been created for them. To avoid crashes and/or creating dummy
>>>> symbols, I have to remove those virtual sections as well.
>>>>
>>>> 2) The old implementation is not technically correct, because it
>>>> removes only the first output section found in name lookup. We're still
>>>> using OutputSectionFactory<ELFT>, so we may have several sections with
>>>> the same name.
>>>>
>>>> Another reason (though much less significant) is that one-by-one
>>>> removal from std::vector is slow, because it stores its elements in a
>>>> contiguous region of memory.
>>>>
>>>>
>>>> https://reviews.llvm.org/D22683
>>>>
>>>>
>>>>
>>>>
>>>
>
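
On the std::vector point in the filter() comment above: the usual way to
avoid one-by-one removal is the erase-remove idiom. This is a minimal
sketch with a hypothetical Sec type and Keep flag, not the code from the
patch:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an output section; only the fields needed here.
struct Sec {
  std::string Name;
  bool Keep; // false if the section failed the ONLY_IF_RO/ONLY_IF_RW check
};

// Removes every rejected section in one pass. std::remove_if moves the
// survivors to the front (preserving their order) and a single erase drops
// the tail, so the cost is linear in the size of the vector.
void filterSections(std::vector<Sec> &Secs) {
  auto NewEnd = std::remove_if(Secs.begin(), Secs.end(),
                               [](const Sec &S) { return !S.Keep; });
  Secs.erase(NewEnd, Secs.end());
  assert(std::all_of(Secs.begin(), Secs.end(),
                     [](const Sec &S) { return S.Keep; }));
}
```

Because the predicate is applied to every element, this also drops all
sections that share a name, not just the first one found.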