[PATCH] D22683: [ELF] Symbol assignment within input section list

Wed Aug 3 06:21:55 PDT 2016

2016-08-03 4:47 GMT+03:00 Rafael Espíndola <rafael.espindola at gmail.com>:

> The patch as is crashes on linkerscript-provide-in-section.s. It is ok
> to turn into an error if something is wrong, but please don't crash.
>
> Cheers,
> Rafael
>
>
Rafael, this patch deletes that test case, so it's strange that you got
this crash.


>
> On 2 August 2016 at 18:23, Rui Ueyama <ruiu at google.com> wrote:
> > On Tue, Aug 2, 2016 at 3:15 PM, Eugene Leviant <evgeny.leviant at gmail.com>
> > wrote:
> >>
> >>
> >>
> >> 2016-08-03 0:55 GMT+03:00 Rui Ueyama <ruiu at google.com>:
> >>>
> >>> On Tue, Aug 2, 2016 at 2:45 PM, Eugene Leviant <evgeny.leviant at gmail.com>
> >>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Wednesday, August 3, 2016, Rui Ueyama wrote:
> >>>>>
> >>>>> On Tue, Aug 2, 2016 at 3:00 AM, Eugene Leviant
> >>>>> <evgeny.leviant at gmail.com> wrote:
> >>>>>>
> >>>>>> evgeny777 added a comment.
> >>>>>>
> >>>>>> I think the main reason we're using virtual input sections is that
> >>>>>> this is the only way to calculate the correct symbol offset. As you
> >>>>>> may know, the location counter is not incremented while we add input
> >>>>>> sections to an output section, and the true size of the input
> >>>>>> sections is known only after the call to
> >>>>>> OutputSectionBase<ELFT>::assignOffsets().
> >>>>>>
> >>>>>> So if you can suggest an algorithm that calculates the correct symbol
> >>>>>> value (without using virtual input sections) in the case below:
> >>>>>>
> >>>>>>   .foo : { *(.foo); end_foo = .; *(.bar) }
> >>>>>>
> >>>>>> then we can probably switch to absolute symbols (BTW, we can also use
> >>>>>> synthetic symbols; there is little difference, if any).
> >>>>>> Another interesting question is what will happen if we define an
> >>>>>> absolute symbol in a shared object and reference it in an executable.
> >>>>>> For example:
> >>>>>>
> >>>>>>   /* script for linking shared library */
> >>>>>>   SECTIONS { .text : { text_start = .; *(.text) } }
> >>>>>>
> >>>>>> So, when the shared library is loaded by an application, what value
> >>>>>> would text_start have if it is absolute? I don't know yet, but I will
> >>>>>> try it.
> >>>>>
> >>>>>
> >>>>> At first, I suggested you use empty dummy input sections to define
> >>>>> linker-script-defined symbols, in the hope that we wouldn't need to
> >>>>> fix symbol addresses later (I was hoping that symbol addresses would
> >>>>> be fixed automatically as the attached input sections get their final
> >>>>> output addresses). Now we know that this doesn't work for many
> >>>>> possible use cases, so maybe we want to eliminate the dummy sections
> >>>>> and define symbols directly as absolute (or section) symbols.
> >>>>
> >>>>
> >>>> Like I said, the main problem is calculating this "absolute value".
> >>>> How are you going to do this? Also, like George said, it is not
> >>>> correct to use absolute values for symbols defined inside an output
> >>>> section description.
> >>>
> >>>
> >>> I think you don't need to calculate absolute values. We know the
> >>> relative distance between the beginning of the current output section
> >>> and the current "." value, so we can create a DefinedSynthetic symbol
> >>> with the output section and the relative offset.
> >>>
> >> We still have to deal with thunks changing the input section size, no?
> >
> >
> > Yes. But even with the current two-pass approach, I think the
> > above-mentioned logic should work.
> >
> >>
> >>
> >>
> >>>>
> >>>>
> >>>>>
> >>>>> In this patch, you are trying to support assignments to symbols.
> >>>>> However, we eventually want to support something like this, too.
> >>>>>
> >>>>>   SECTIONS { .text : { foo.o(.text); . = ALIGN(128); bar.o(.text) } }
> >>>>
> >>>>
> >>>> I do not see any problem in doing this. I think we can use the same
> >>>> SymbolInputSection<ELFT> but with a non-zero size, so the proper layout
> >>>> will be calculated automatically in assignOffsets. Does this make sense?
> >>>
> >>>
> >>> I don't think so. Does it work for more complicated inputs, such as
> >>>   SECTIONS {
> >>>     .data : { *(.data) }
> >>>     .text : { foo.o(.text); . += SIZEOF(.data); bar.o(.text) }
> >>>   }
> >>>
> >>> ?
> >>
> >>
> >> I think making InputSectionBase<ELFT>::getSize() a virtual method will
> >> solve the problem, won't it?
> >
> >
> > It makes getSize() really complicated, no? If the expression is ". =
> > SIZEOF(.data) + ALIGN(100)", the input section needs to understand the
> > size of the output .data section as well as the current dot value. Also,
> > no input sections have vtables now, so adding one only for getSize is
> > probably too much.
> >
> >>
> >>
> >>>
> >>>
> >>>>>
> >>>>>
> >>>>> Looks like this doesn't fit the current architecture. Currently, we
> >>>>> create a list of input sections and assign them addresses later. But
> >>>>> to process the above script, a single pass would fit well. So I'm
> >>>>> wondering if we should merge LinkerScript::createSections and
> >>>>> LinkerScript::assignOffsets.
> >>>>
> >>>>
> >>>> How can this be done? We have createThunks() in between.
> >>>
> >>>
> >>> Yeah, we have thunks. I haven't thought enough about that yet. But why
> >>> can't we create thunks earlier, even before createSections?
> >>
> >>
> >> Is this possible? As far as I understand, a thunk contains a jump, which
> >> can be between two input sections (or even output sections). Until you
> >> create the full layout (like we do in createSections), it looks like a
> >> tough problem to solve.
> >
> >
> > Well, I believe it's at least technically doable (I'm not sure how hard
> > it is). When the Writer is called, all symbols are resolved, so all
> > relocations should know where they point to. That means it can be
> > determined whether they need thunks or not.
> >
> >>
> >>
> >>>
> >>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> ================
> >>>>>> Comment at: ELF/LinkerScript.cpp:278
> >>>>>> @@ -176,3 +277,3 @@
> >>>>>>  // Process ONLY_IF_RO and ONLY_IF_RW.
> >>>>>>  template <class ELFT> void LinkerScript<ELFT>::filter() {
> >>>>>>    // In this loop, we remove output sections if they don't satisfy
> >>>>>> ----------------
> >>>>>> ruiu wrote:
> >>>>>> > Why did you have to make a change to this function?
> >>>>>> Two main reasons:
> >>>>>>
> >>>>>> 1) During the filtering process some output sections may be removed.
> >>>>>> Those sections may contain symbols, and SymbolInputSection objects
> >>>>>> have already been created for them. To avoid crashes and/or creating
> >>>>>> dummy symbols, I have to remove those virtual sections as well.
> >>>>>>
> >>>>>> 2) The old implementation is not technically correct, because it
> >>>>>> removes only the first output section found in the name lookup. We're
> >>>>>> still using OutputSectionFactory<ELFT>, so we may have several
> >>>>>> sections with the same name.
> >>>>>>
> >>>>>> Another reason (though much less significant) is that one-by-one
> >>>>>> removal from a std::vector is slow, because it stores its elements in
> >>>>>> a contiguous region of memory, so each removal shifts the elements
> >>>>>> after it.
> >>>>>>
> >>>>>>
> >>>>>> https://reviews.llvm.org/D22683
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> >
>
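
For reference, the section-relative approach Rui describes in the quoted
thread (record the symbol as an output section plus an offset from its
start, instead of an absolute value) could look roughly like the sketch
below for the `.foo : { *(.foo); end_foo = .; *(.bar) }` case. This is not
LLD code: Command, Layout, layoutSection and symbolAddress are hypothetical
names made up for illustration, and input sections are reduced to a size
and an alignment.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one entry in an output section description.
// None of these types exist in LLD; they only illustrate the idea.
struct Command {
  enum Kind { InputChunk, SymbolAssign } K;
  uint64_t Size;    // InputChunk only: size of the input section in bytes
  uint64_t Align;   // InputChunk only: required alignment, must be non-zero
  std::string Name; // SymbolAssign only: symbol assigned the value of "."
};

// Result of laying out one output section: its size and, for every
// "sym = ." command, the offset of the symbol from the section start.
struct Layout {
  uint64_t Size = 0;
  std::map<std::string, uint64_t> SymOffsets;
};

// Round V up to the next multiple of A (A must be non-zero).
inline uint64_t alignUp(uint64_t V, uint64_t A) { return (V + A - 1) / A * A; }

// Walk the commands in script order, treating L.Size as "." measured from
// the start of the output section. No absolute address is needed here.
inline Layout layoutSection(const std::vector<Command> &Cmds) {
  Layout L;
  for (const Command &C : Cmds) {
    if (C.K == Command::InputChunk)
      L.Size = alignUp(L.Size, C.Align) + C.Size;
    else
      L.SymOffsets[C.Name] = L.Size;
  }
  return L;
}

// The final value is resolved only once the section address is known.
inline uint64_t symbolAddress(uint64_t SectionVA, const Layout &L,
                              const std::string &Name) {
  return SectionVA + L.SymOffsets.at(Name);
}
```

With a 10-byte .foo chunk, then end_foo, then a 6-byte .bar chunk aligned
to 8, layoutSection records end_foo at offset 10 and a section size of 22,
and symbolAddress gives 0x100a for a section placed at 0x1000. If thunks
later change an input section's size, rerunning layoutSection recomputes
the offsets, so under these assumptions no absolute value has to be patched.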