[PATCH] D24616: [ELF] Improve section GC for comdat groups

Eugene Leviant via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 16 03:14:20 PDT 2016


I've just run a simple experiment. Suppose you have a simple C file:

int main(void) { return 0; }

Now you compile it with clang-3.8:

clang -ffunction-sections -fdata-sections -g -c main.c -o main.o

If you examine relocations, you'll see the following:

  Section (20) .rela.debug_line {
    Relocation {
      Offset: 0x2A
      Type: R_X86_64_64 (1)
      Symbol: .text.main (2)
      Addend: 0x0
    }
  }

As you may notice, a relocation pointing to a linkonce section exists, even
though the spec says it shouldn't.
A deeper look at the symbol table tells us the following:

readelf -aW main.o
...................................
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
.................................

llvm-readobj -t main.o
.................................
  Symbol {
    Name:  (0)
    Value: 0x0
    Size: 0
    Binding: Local (0x0)
    Type: Section (0x3)
    Other: 0
    Section: .text.main (0x3)
  }
................................

So who is right: the spec or clang?
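
To make the question concrete, here is a minimal sketch of the check that the
rule quoted below implies. The constant values are the standard ELF ones; the
`Sym` struct and `allowedForGroupReference` are hypothetical names used only
for illustration, not LLD or clang code.

```cpp
#include <cstdint>

// Standard ELF constants (values as defined in the generic ABI).
constexpr uint8_t STB_LOCAL = 0;
constexpr uint8_t STB_GLOBAL = 1;
constexpr uint8_t STB_WEAK = 2;
constexpr uint16_t SHN_UNDEF = 0;

// Hypothetical, simplified view of a symbol table entry.
struct Sym {
  uint8_t Binding; // one of the STB_* values
  uint16_t Shndx;  // index of the section the symbol is defined in
};

// Returns true if a reference through this symbol satisfies the rule for
// references into a group from outside the group: the symbol must have
// STB_GLOBAL or STB_WEAK binding and section index SHN_UNDEF.
bool allowedForGroupReference(const Sym &S) {
  bool BindingOk = S.Binding == STB_GLOBAL || S.Binding == STB_WEAK;
  return BindingOk && S.Shndx == SHN_UNDEF;
}
```

For the section symbol in the dumps above (Local binding, section index 3)
this returns false. The sketch looks only at the symbol; it does not check
whether the target section is actually a member of a group.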





2016-09-16 1:33 GMT+03:00 Rui Ueyama <ruiu at google.com>:

> https://docs.oracle.com/cd/E23824_01/html/819-0690/
> chapter7-26.html#scrolltoc says that
>
>
> *References to the sections comprising a group from sections outside of
> the group must be made through symbol table entries with STB_GLOBAL or
> STB_WEAK binding and section index SHN_UNDEF.*
>
>
> On Thu, Sep 15, 2016 at 3:30 PM, Eugene Leviant <evgeny.leviant at gmail.com>
> wrote:
>
>>
>>
>> On Friday, September 16, 2016, Rui Ueyama wrote:
>>
>>> On Thu, Sep 15, 2016 at 3:17 PM, Eugene Leviant <
>>> evgeny.leviant at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Friday, September 16, 2016, Rui Ueyama wrote:
>>>>
>>>>> On Thu, Sep 15, 2016 at 1:02 PM, Eugene Leviant <
>>>>> evgeny.leviant at gmail.com> wrote:
>>>>>
>>>>>> First of all, I need to identify this section because, like I said,
>>>>>> it is added only once, to a single object file. The next time the same
>>>>>> group is seen, all of its member sections are discarded. This means that
>>>>>> relocations in some object file may point to discarded sections. When you
>>>>>> see such a discarded section in GC, you should (IMHO) find its counterpart
>>>>>> and mark it live.
>>>>>>
>>>>>
>>>>> No, I don't think relocations could point to sections that were
>>>>> removed because of comdat group deduplication.
>>>>>
>>>>
>>>> I'm sorry, but this does happen. I can't share a real-world example,
>>>> but you can try the test case that is part of this patch (comdat-gc.s).
>>>> You'll get a crash in forEachSuccessor.
>>>>
>>>
>>> But isn't it out of the spec?
>>>
>>
>> Well, maybe. What spec are you talking about?
>>
>>>
>>>
>>>>
>>>>
>>>>> All relocations to sections in comdat groups should go through
>>>>> undefined symbols. So, when control reaches this part of the code, all
>>>>> symbols should already have been resolved, or the link will end with an
>>>>> error. You shouldn't have to "resolve" comdat groups by name again here.
>>>>>
>>>> What I'm doing now is:
>>>>>>
>>>>>> a) Get the group signature, given an object file and an input section
>>>>>> index. This is done using the SectionGroupSig hash map.
>>>>>> b) Store all group signatures in a hash set.
>>>>>> c) Iterate over the object files and fetch all input sections that are
>>>>>> members of comdat groups with signatures in this hash set.
>>>>>>
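Steps a) to c) can be sketched roughly like this. It is a simplified model,
not the patch itself: `Section`, `ObjFile`, `SectionRef` and
`collectGroupMembers` are hypothetical names, and the per-file `GroupSig` map
merely stands in for the SectionGroupSig hash map mentioned above.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified input section.
struct Section {
  std::string Name;
};

// Hypothetical, simplified object file.
struct ObjFile {
  std::vector<Section> Sections;
  // Stand-in for SectionGroupSig: section index -> group signature.
  std::unordered_map<std::size_t, std::string> GroupSig;
};

// (file index, section index) of a section seen during GC.
using SectionRef = std::pair<std::size_t, std::size_t>;

std::vector<Section *> collectGroupMembers(std::vector<ObjFile> &Files,
                                           const std::vector<SectionRef> &Seeds) {
  // a) + b): look up each seed's group signature and store it in a hash set.
  std::unordered_set<std::string> Sigs;
  for (const SectionRef &Ref : Seeds) {
    const ObjFile &F = Files[Ref.first];
    auto It = F.GroupSig.find(Ref.second);
    if (It != F.GroupSig.end())
      Sigs.insert(It->second);
  }

  // c): walk all files and collect every section whose group signature
  // is in the set.
  std::vector<Section *> Members;
  for (ObjFile &F : Files)
    for (const auto &Entry : F.GroupSig)
      if (Sigs.count(Entry.second))
        Members.push_back(&F.Sections[Entry.first]);
  return Members;
}
```

Note that step c) touches every group member in every file, so the cost grows
with the total number of grouped sections rather than with the size of one
group.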
>>>>>> Now I'm trying to understand your suggestion. You suggest keeping a
>>>>>> list of member sections in each member of the group, correct?
>>>>>> You cannot do that in a discarded section, so you still need to find the
>>>>>> real section, given some object file and a section number in it, right?
>>>>>>
>>>>>>
>>>>>> 2016-09-15 22:45 GMT+03:00 Rui Ueyama <ruiu at google.com>:
>>>>>>
>>>>>>> Imagine that every section in a comdat group has a GroupMembers
>>>>>>> vector, so that starting from any comdat group member section, you can
>>>>>>> reach all siblings in the same group. With that, all you have to do for a
>>>>>>> section S to make all its siblings live is `for (InputSectionData
>>>>>>> *Succ : S->GroupMembers) Enque({Succ, 0});`. Wouldn't that work?
>>>>>>>
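As I understand the suggestion, it amounts to something like the sketch below.
This is a hypothetical, stripped-down mark phase, not the LLD code: `Section`
and `markLive` are invented for illustration, and the offset that the quoted
snippet passes along with each section is left out.

```cpp
#include <deque>
#include <vector>

// Hypothetical, simplified section: only what the sketch needs.
struct Section {
  bool Live = false;
  std::vector<Section *> GroupMembers; // siblings in the same comdat group
  std::vector<Section *> Successors;   // sections reached via relocations
};

// Worklist-based mark phase: once a group member becomes live, all of its
// siblings are enqueued as well.
void markLive(const std::vector<Section *> &Roots) {
  std::deque<Section *> Queue;
  auto Enqueue = [&](Section *S) {
    if (!S || S->Live)
      return; // nothing to do for null or already visited sections
    S->Live = true;
    Queue.push_back(S);
  };

  for (Section *R : Roots)
    Enqueue(R);
  while (!Queue.empty()) {
    Section *S = Queue.front();
    Queue.pop_front();
    for (Section *Succ : S->Successors)
      Enqueue(Succ);
    for (Section *Sibling : S->GroupMembers)
      Enqueue(Sibling);
  }
}
```

This only helps when the section reached through a relocation is a real
section that carries a GroupMembers vector.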
>>>>>>> On Thu, Sep 15, 2016 at 12:32 PM, Eugene Leviant <
>>>>>>> evgeny.leviant at gmail.com> wrote:
>>>>>>>
>>>>>>>> evgeny777 added a comment.
>>>>>>>>
>>>>>>>> Let's elaborate on the idea. The main problem is that symbol 'D'
>>>>>>>> inside resolveReloc() may point to InputSectionBase<ELFT>::Discarded.
>>>>>>>> This happens because a comdat group is added to only one object file,
>>>>>>>> and it causes a crash in GC, because forEachSuccessor implicitly casts
>>>>>>>> Discarded to InputSection<ELFT> and tries to fetch relocs from it. How
>>>>>>>> would this 'GroupMembers' vector help?
>>>>>>>>
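The crash described above can be modelled in a few lines. This is a
hypothetical simplification (the class names, the single `Discarded` object
and the `relocCount` helper are invented here, and the real class hierarchy
in LLD differs): the sentinel carries no relocation data, so a traversal has
to test for it before treating the pointer as a regular section.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class for all input sections.
struct SectionBase {
  virtual ~SectionBase() = default;
};

// A regular input section owns relocations.
struct RegularSection : SectionBase {
  std::vector<int> Relocs;
};

// Single sentinel standing in for every discarded comdat member.
// It is a SectionBase, not a RegularSection.
SectionBase Discarded;

// Returns the number of relocations to visit, or 0 for the sentinel.
// Without the first check, an unchecked cast to RegularSection would
// read members that the sentinel does not have.
std::size_t relocCount(SectionBase *S) {
  if (S == &Discarded)
    return 0;
  if (auto *R = dynamic_cast<RegularSection *>(S))
    return R->Relocs.size();
  return 0;
}
```

Skipping the sentinel avoids the crash, but it also means the surviving copy
of the group is never reached through that relocation.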
>>>>>>>>
>>>>>>>> Repository:
>>>>>>>>   rL LLVM
>>>>>>>>
>>>>>>>> https://reviews.llvm.org/D24616
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>