[PATCH] D17529: ELF: Implement ICF.

Mon Feb 22 18:45:01 PST 2016

On Mon, Feb 22, 2016 at 5:39 PM, Rui Ueyama <ruiu at google.com> wrote:

> On Mon, Feb 22, 2016 at 5:32 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>> silvas added a subscriber: silvas.
>> silvas added a comment.
>>
>> > The lacking feature is "safe" version of ICF. This merges all
>>
>> >  identical sections. That is not compatible with C/C++ language
>>
>> >  requirement that two distinct functions must not have the address.
>>
>> >  I think violation this rule breaks many C++ ABIs.
>>
>>
>> How does the COFF ICF avoid this problem?
>>
>
> ICF is on by default for COFF, so if something doesn't work with ICF, it
> doesn't work for COFF.
>

This hasn't been an issue in practice? Why do you think it will be an issue
with ELF?

>
>
>> > We have a different idea
>>
>> >  which David Majnemer suggested, which is to add NOPs at beginning
>>
>> >  of merged functions so that two or more pointers can have distinct
>>
>> >  values.
>>
>>
>> If we do this, we need to be careful to not misalign the function as
>> emitted by the compiler. E.g. even if we only need to insert a single nop
>> before the current section contents, we still want to grow the section by
>> the section's alignment.
>>
>>
>> ================
>> Comment at: ELF/ICF.cpp:16
>> @@ +15,3 @@
>> +//
>> +// ICF is theoretically a problem of reducing graphs by merging as many
>> +// identical subgraphs as possible if we consider sections as vertices
>> and
>> ----------------
>> Don't you use a partition-refinement approach instead of a graph
>> approach? Why mention graphs?
>>
>
> Well, this algorithm is a partition-refinement approach and at the same
> time works on a graph. They are just different two views of the same
> algorithm. Is it still confusing after you read through all this comments?
>

I thought the main insight of the partition refinement approach was that
describing the problem as "reducing graphs by merging as many identical
subgraphs as possible if we consider sections as vertices and relocations
as edges" is fundamentally wrong and fails to merge certain things that can
be merged? The partition refinement approach fundamentally does not operate
in terms of merging identical subgraphs of the original graph. It is sort
of confusing to talk about it in terms of graphs.

I would describe it as:
"We begin by optimistically putting all sections into a single equivalence
class. Then we apply a series of checks that split this initial equivalence
class into more and more refined equivalence classes based on the
properties by which a section can be distinguished. We begin by checking
that the section contents and flags are the same. This only needs to be
done once since these properties don't depend on the current equivalence
class assignment. Then we split the equivalence classes based on checking
that their relocations are the same, where relocation targets are compared
by their equivalence class, not the concrete section. This may need to be
done multiple times because as the equivalence classes are refined, two
sections that had a relocation target in the same equivalence class may now
target different equivalence classes, and hence these two sections must be
put in different equivalence classes (whereas in the previous iteration
they were not since the relocation target was the same)."

>
>
>> ================
>> Comment at: ELF/InputSection.h:182
>> @@ -170,1 +181,3 @@
>> +  // Used by ICF.
>> +  uint64_t GroupId = 0;
>>  };
>> ----------------
>> I know we aren't currently optimizing the sizeof of InputSection, but we
>> should be aware of it since we potentially allocate many.
>> Out of curiosity, have you measured if the extra memory pressure of this
>> GroupId causes any performance change?
>> It it has an adverse effect, we can always merge it with OutSecOff, since
>> I don't think we need both simultaneously. (or we can make sure that Repl
>> always points to a canonical group member).
>>
>
> I haven't measured. If you are interested, I can measure, but I think it
> wouldn't make measurable difference. I measured the performance penalty of
> a change that increases SymbolBody's size by 8 bytes, and I didn't find no
> measurable difference. (And # of symbols > # of sections.)
>

Were you compiling with -ffunction-sections -fdata-sections? I think it is
worth measuring for the -ffunction-sections -fdata-sections case.

-- Sean Silva
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160222/27244854/attachment.html>