[PATCH] D22683: [ELF] Symbol assignment within input section list

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 2 15:23:44 PDT 2016


On Tue, Aug 2, 2016 at 3:15 PM, Eugene Leviant <evgeny.leviant at gmail.com>
wrote:

>
>
> 2016-08-03 0:55 GMT+03:00 Rui Ueyama <ruiu at google.com>:
>
>> On Tue, Aug 2, 2016 at 2:45 PM, Eugene Leviant <evgeny.leviant at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wednesday, August 3, 2016, Rui Ueyama wrote:
>>>
>>>> On Tue, Aug 2, 2016 at 3:00 AM, Eugene Leviant <
>>>> evgeny.leviant at gmail.com> wrote:
>>>>
>>>>> evgeny777 added a comment.
>>>>>
>>>>> I think the main reason we're using virtual input sections is that
>>>>> this is the only way to calculate the correct symbol offset. As you may
>>>>> know, the location counter is not incremented while we add input sections
>>>>> to an output section, and the true size of the input sections is known only
>>>>> after the call to OutputSectionBase<ELFT>::assignOffsets().
>>>>>
>>>>> So if you can suggest an algorithm that calculates the correct symbol
>>>>> value (without using virtual input sections) in the case below:
>>>>>
>>>>>   .foo : { *(.foo); end_foo = .; *(.bar) }
>>>>>
>>>>> then we can probably switch to absolute symbols (BTW, we can also use
>>>>> synthetic symbols; there is little difference, if any).
>>>>> Another interesting question is what will happen if we define an absolute
>>>>> symbol in a shared object and reference it in an executable. For example:
>>>>>
>>>>>   /* script for linking shared library */
>>>>>   SECTIONS { .text : { text_start = .; *(.text) } }
>>>>>
>>>>> So, when the shared library is loaded by an application, what value would
>>>>> text_start have if it is absolute? I don't know yet, but I will try it.
>>>>>
>>>>
>>>> At first, I suggested you use empty dummy input sections to define
>>>> linker-script-defined symbols, in the hope that we wouldn't need to
>>>> fix symbol addresses later (I was hoping that symbol addresses would be
>>>> fixed automatically as the attached input sections get their final output
>>>> addresses). Now we know that doesn't work for many possible use cases, so
>>>> maybe we want to eliminate the dummy sections and directly define symbols
>>>> as absolute (or section) symbols.
>>>>
>>>
>>> As I said, the main problem is calculating this "absolute value". How
>>> are you going to do this? Also, as George said, it is not correct to use
>>> absolute values for symbols defined inside an output section description.
>>>
>>
>> I think you don't need to calculate absolute values. We know the relative
>> distance between the beginning of the current output section and the current
>> "." value, so we can create a DefinedSynthetic symbol with the output section
>> and the relative offset.
>>
>> Still have to deal with thunks, changing input section size, no?
>

Yes. But even with the current two-pass approach, I think the
above-mentioned logic should work.
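
To make that concrete, here is a rough sketch of what I have in mind. It is not lld code: OutSec, SecRelSym, ScriptItem and defineSymbols are made-up names for illustration only. The idea is to record each symbol as (output section, offset of "." from the section start) and to compute its address only when asked, after the section address is final. It assumes the input section sizes are already final when the walk runs, so it would have to run (or be re-run) after anything that changes sizes, such as thunk creation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not lld's classes.
struct OutSec {
  uint64_t Addr = 0; // final address, known only after layout
  uint64_t Size = 0;
};

// A symbol recorded as (output section, offset), not as an absolute value.
struct SecRelSym {
  std::string Name;
  const OutSec *Sec;
  uint64_t Offset;
  uint64_t getVA() const {
    assert(Sec && "symbol must be attached to an output section");
    return Sec->Addr + Offset;
  }
};

// One item inside an output section description: either an input section
// of a known size, or an assignment of the form "sym = .".
struct ScriptItem {
  bool IsAssign;
  uint64_t Size;    // used when !IsAssign
  std::string Name; // used when IsAssign
};

// Walks the items once, tracking "." relative to the section start.
std::vector<SecRelSym> defineSymbols(OutSec &Sec,
                                     const std::vector<ScriptItem> &Items) {
  std::vector<SecRelSym> Syms;
  uint64_t Dot = 0;
  for (const ScriptItem &I : Items) {
    if (I.IsAssign)
      Syms.push_back({I.Name, &Sec, Dot});
    else
      Dot += I.Size;
  }
  Sec.Size = Dot;
  return Syms;
}
```

For ".foo : { *(.foo); end_foo = .; *(.bar) }", end_foo would be stored with the combined size of the .foo inputs as its offset, and its value would follow .foo wherever the output section ends up.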


>
>
>
>>
>>>
>>>> In this patch, you are trying to support assignments to symbols.
>>>> However, we eventually want to support something like this, too.
>>>>
>>>>   SECTIONS { .text : { foo.o(.text); . = ALIGN(128); bar.o(.text) } }
>>>>
>>>
>>> I do not see any problem in doing this. I think we can use the same
>>> SymbolInputSection<ELFT> but with a non-zero size, so the proper layout
>>> will be calculated automatically in assignOffsets. Does this make sense?
>>>
>>
>> I don't think so. Does it work for more complicated inputs, such as
>>
>>   SECTIONS {
>>     .data : { *(.data) }
>>     .text : { foo.o(.text); . += SIZEOF(.data); bar.o(.text) }
>>   }
>>
>> ?
>>
>
> I think making InputSectionBase<ELFT>::getSize() a virtual method will
> solve the problem, won't it?
>

It makes getSize() really complicated, no? If the expression is ". =
SIZEOF(.data) + ALIGN(100)", the input section needs to understand the size
of the output .data section as well as the current dot value. Also, no
input sections have vtables now, so adding one only for getSize is probably
too much.
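
As a rough illustration of the alternative, the dot expressions could be evaluated by the loop that assigns offsets, so that input sections stay plain data. This is only a sketch under my own assumptions, not lld code: SizeTable, DotExpr, ScriptCmd, alignTo and assignOffsets are hypothetical names, dot is tracked relative to the start of the output section, and the section start is assumed to be aligned well enough that aligning the relative value is sufficient.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical model for illustration; not lld's data structures.
using SizeTable = std::map<std::string, uint64_t>; // output section sizes
using DotExpr = std::function<uint64_t(uint64_t Dot, const SizeTable &)>;

struct ScriptCmd {
  bool IsInput;
  uint64_t Size;  // used when IsInput
  DotExpr NewDot; // used when !IsInput; returns the new value of "."
};

uint64_t alignTo(uint64_t V, uint64_t A) {
  assert(A != 0 && (A & (A - 1)) == 0 && "alignment must be a power of two");
  return (V + A - 1) & ~(A - 1);
}

// Single pass: input sections receive the current dot as their offset,
// and dot assignments are evaluated against the current dot.
std::vector<uint64_t> assignOffsets(const std::vector<ScriptCmd> &Cmds,
                                    const SizeTable &Sizes, uint64_t &EndDot) {
  std::vector<uint64_t> Offsets;
  uint64_t Dot = 0;
  for (const ScriptCmd &C : Cmds) {
    if (C.IsInput) {
      Offsets.push_back(Dot);
      Dot += C.Size;
    } else {
      Dot = C.NewDot(Dot, Sizes);
    }
  }
  EndDot = Dot;
  return Offsets;
}
```

With this shape, ". += SIZEOF(.data)" is just a callback that returns Dot plus the recorded size of .data, and ". = ALIGN(128)" is a callback that returns alignTo(Dot, 128). Neither needs a virtual getSize().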


>
>
>>
>>
>>>
>>>> It looks like this doesn't fit the current architecture. Currently, we
>>>> create a list of input sections and assign them addresses later. But a
>>>> single pass would be a better fit for processing the above script. So I'm
>>>> wondering if we should merge LinkerScript::createSections and
>>>> LinkerScript::assignOffsets.
>>>>
>>>
>>> How can this be done? We have createThunks() in between.
>>>
>>
>> Yeah, we have thunks. I haven't thought enough about that yet. But why
>> can't we create thunks earlier, even before createSections?
>>
>
> Is this possible? As far as I understand, a thunk contains a jump, which can
> be between two input sections (or even output sections). Until you create the
> full layout (like we do in createSections), it looks like a tough problem
> to solve.
>

Well, I believe it's at least technically doable (I'm not sure how hard it
is). When the Writer is called, all symbols are resolved, so all
relocations should know where they point to. That means it can be
determined whether they need thunks or not.
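
A minimal sketch of that argument, under an assumption that I have not checked against every target: whether a thunk is needed depends only on properties of the caller and of the resolved target symbol, not on the final distance between them. Sym, Reloc, needsThunk and countThunks are hypothetical names, and the "non-PIC caller, PIC target" rule is only an example predicate, not lld's actual logic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model for illustration; not lld's data structures.
struct Sym {
  bool IsPic; // example property of the resolved target
};

struct Reloc {
  const Sym *Target; // already resolved when the Writer runs
  bool CallerIsPic;  // example property of the referencing code
};

// Example predicate: decided purely from resolved symbol information,
// without knowing any addresses.
bool needsThunk(const Reloc &R) {
  assert(R.Target && "relocation target must be resolved");
  return !R.CallerIsPic && R.Target->IsPic;
}

// Counting thunks up front tells us how much extra space to reserve
// before any offsets are assigned.
std::size_t countThunks(const std::vector<Reloc> &Relocs) {
  std::size_t N = 0;
  for (const Reloc &R : Relocs)
    if (needsThunk(R))
      ++N;
  return N;
}
```

If the predicate ever needs addresses (for example, range-extension thunks), this breaks down and the layout would have to be iterated instead.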


>
>
>>
>>
>>>>
>>>>>
>>>>> ================
>>>>> Comment at: ELF/LinkerScript.cpp:278
>>>>> @@ -176,3 +277,3 @@
>>>>>  // Process ONLY_IF_RO and ONLY_IF_RW.
>>>>>  template <class ELFT> void LinkerScript<ELFT>::filter() {
>>>>>    // In this loop, we remove output sections if they don't satisfy
>>>>> ----------------
>>>>> ruiu wrote:
>>>>> > Why did you have to make a change to this function?
>>>>> Two main reasons:
>>>>>
>>>>> 1) During the filtering process some output sections may be removed.
>>>>> Those sections may contain symbols, and SymbolInputSection objects have
>>>>> already been created for them. To avoid crashes and/or creating dummy
>>>>> symbols, I have to remove those virtual sections as well.
>>>>>
>>>>> 2) The old implementation is not technically correct, because it
>>>>> removes only the first output section found by name lookup. We're still
>>>>> using OutputSectionFactory<ELFT>, so we may have several sections with
>>>>> the same name.
>>>>>
>>>>> Another reason (though much less significant) is that one-by-one
>>>>> removal from std::vector must be slow, because it stores its elements in
>>>>> a contiguous region of memory.
>>>>>
>>>>>
>>>>> https://reviews.llvm.org/D22683
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>
>