[PATCH] D27415: [ELF] - Replace MergeOutputSection with synthetic input section MergeSection.

Fri Jan 27 03:02:50 PST 2017

grimar added a comment.

In https://reviews.llvm.org/D27415#657878, @ruiu wrote:

> I probably do not understand what you are trying to solve.
>
> Currently, LLD merges two mergeable input sections if they have the same name, types and section flags. But, you are saying that that is not always correct, right?
>
> Can you briefly describe the exact semantics this patch is trying to implement, and then why you think that is the better behavior?

I am trying to solve the issue I demonstrate at https://reviews.llvm.org/D29217.
The test case there has:

  .section .aaa.1,"a"
  .byte 11
  .section .aaa.2,"a"
  .byte 22

  .section .bbb.1,"aMS",@progbits,1
  .asciz "foo"
  .section .bbb.2,"aMS",@progbits,1
  .asciz "foo"

  .section .ccc.1,"a"
  .byte 33
  .section .ccc.2,"a"
  .byte 44

It also has symbol assignments to A and B that should mark the start and end of .ccc:

  .rodata : { *(.aaa.*) *(.bbb.*) A = .; *(.ccc.*) B = .; }

**1. What LLD currently (clean head revision) does:**

It creates 2 output .rodata sections:
.rodata (.aaa.1 .aaa.2 .ccc.1 .ccc.2)
.rodata (.bbb.1 .bbb.2)

Because of that, it currently assigns the wrong values to A and B. I think the cleanest fix is to implement a synthetic merge section instead of MergeOutputSection.
(This is what this patch was about, right?)
That way the output should be a single .rodata section which consists of [.aaa.1, .aaa.2, synthetic section holding <.bbb.1 and .bbb.2>, .ccc.1, .ccc.2], and the symbols will then be assigned correctly.
The output will probably be equal to that of bfd/gold, btw.
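
To make the expected result concrete, here is a toy Python model (not LLD code) of the single .rodata layout described above. It assumes every section has alignment 1, that the two "foo" strings are merged into one 4-byte entry, and that offsets are counted from the start of .rodata; the `layout` helper is hypothetical.

```python
# Toy model only: sizes are taken from the assembly above, helper names are made up.
def layout(sections):
    """Place sections back to back and return the offset just past the last one."""
    dot = 0
    for _name, size in sections:
        dot += size
    return dot

# Everything the script places before "A = .;"
before_a = [(".aaa.1", 1), (".aaa.2", 1), ("merged(.bbb.1, .bbb.2)", 4)]
# Everything placed between "A = .;" and "B = .;"
ccc = [(".ccc.1", 1), (".ccc.2", 1)]

A = layout(before_a)        # offset of the start of .ccc.*
B = layout(before_a + ccc)  # offset of the end of .ccc.*
print(A, B)  # prints "6 8"
```

Under these assumptions A and B bracket exactly the two .ccc bytes, which is what the script asks for.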

**2. What the latest diff of this patch does:**

It now creates synthetic sections for mergeable sections early, before passing them to the script. After that, we have the following input sections available to work with on the linker script side:

[.aaa.1 .aaa.2 **SynteticMergeSection**(.bbb.1) **SynteticMergeSection**(.bbb.2) .ccc.1 .ccc.2]

Since the synthetic merge sections are already created, the script just places all of the above in a single .rodata section. We lose the string merging optimization here, because we have 2 pre-created SynteticMergeSections.
There is no way to merge 2 synthetic mergeable sections together at this step (and I do not think we want to implement that either).
And we cannot create a single .bbb section holding .bbb.1 and .bbb.2 early, before processing the linker script commands, because we do not know whether the script will want to put them together.
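
The lost optimization can be illustrated with a small Python sketch. This is a simplified model, not LLD's implementation: `merge_strings` is a hypothetical helper that only deduplicates whole NUL-terminated strings and does no tail merging.

```python
# Toy model of string merging: deduplicate whole NUL-terminated strings.
def merge_strings(section_contents):
    """Return the merged bytes for the given list of section contents."""
    seen = []
    for data in section_contents:
        # Drop the empty piece that follows the final NUL terminator.
        for s in data.split(b"\0")[:-1]:
            if s not in seen:
                seen.append(s)
    return b"".join(s + b"\0" for s in seen)

bbb1 = b"foo\0"  # contents of .bbb.1
bbb2 = b"foo\0"  # contents of .bbb.2

# One synthetic section per input section: each keeps its own copy of "foo".
separate = len(merge_strings([bbb1])) + len(merge_strings([bbb2]))
# One synthetic section for both inputs: the duplicate is dropped.
together = len(merge_strings([bbb1, bbb2]))
print(separate, together)  # prints "8 4"
```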

**3. The patch I want to do, based on what diff2 of this patch already did:**

I want to create synthetic merge sections later. For the linker script step, that should be done after the script prepares the list of input sections used to create output sections. That was done in diff2.
That way we have the inputs:
[.aaa.1 .aaa.2 **MergeInputSection(.bbb.1)** **MergeInputSection(.bbb.2)** .ccc.1 .ccc.2]

The script will create a single SynteticMergeSection for the mergeable input sections and convert the inputs to:
[.aaa.1 .aaa.2 **SynteticMergeSection(.bbb.1, .bbb.2)** .ccc.1 .ccc.2]

And we will end up with a single .rodata with all optimizations working.
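
A rough Python sketch of that conversion step follows. It is only an illustration under stated assumptions: `combine_mergeable` is a hypothetical name, the synthetic section is assumed to take the position of the first mergeable input, and the check that flags, entry size and alignment match is left out.

```python
# Toy model: inputs are (name, is_mergeable) pairs in the order the script collected them.
def combine_mergeable(inputs):
    """Replace all mergeable inputs with one synthetic entry at the first one's position."""
    out, members, slot = [], [], None
    for name, mergeable in inputs:
        if mergeable:
            if slot is None:
                slot = len(out)   # remember where the synthetic section goes
                out.append(None)  # placeholder, filled in below
            members.append(name)
        else:
            out.append(name)
    if slot is not None:
        out[slot] = "SynteticMergeSection(" + ", ".join(members) + ")"
    return out

inputs = [
    (".aaa.1", False), (".aaa.2", False),
    (".bbb.1", True), (".bbb.2", True),
    (".ccc.1", False), (".ccc.2", False),
]
result = combine_mergeable(inputs)
print(result)
```

With the inputs from the example, the result is the five-element list shown above, with both .bbb sections folded into one synthetic section.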

https://reviews.llvm.org/D27415