[PATCH] D17529: ELF: Implement ICF.
Rui Ueyama via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 22 17:39:43 PST 2016
On Mon, Feb 22, 2016 at 5:32 PM, Sean Silva <chisophugis at gmail.com> wrote:
> silvas added a subscriber: silvas.
> silvas added a comment.
> > The lacking feature is "safe" version of ICF. This merges all
> > identical sections. That is not compatible with C/C++ language
> > requirement that two distinct functions must not have the address.
> > I think violation this rule breaks many C++ ABIs.
> How does the COFF ICF avoid this problem?
ICF is on by default for COFF, so if something doesn't work with ICF, it
doesn't work for COFF.
> > We have a different idea
> > which David Majnemer suggested, which is to add NOPs at beginning
> > of merged functions so that two or more pointers can have distinct
> > values.
> If we do this, we need to be careful to not misalign the function as
> emitted by the compiler. E.g. even if we only need to insert a single nop
> before the current section contents, we still want to grow the section by
> the section's alignment.
> Comment at: ELF/ICF.cpp:16
> @@ +15,3 @@
> +// ICF is theoretically a problem of reducing graphs by merging as many
> +// identical subgraphs as possible if we consider sections as vertices and
> Don't you use a partition-refinement approach instead of a graph approach?
> Why mention graphs?
Well, this algorithm is a partition-refinement approach and at the same
time works on a graph. They are just different two views of the same
algorithm. Is it still confusing after you read through all this comments?
> Comment at: ELF/InputSection.h:182
> @@ -170,1 +181,3 @@
> + // Used by ICF.
> + uint64_t GroupId = 0;
> I know we aren't currently optimizing the sizeof of InputSection, but we
> should be aware of it since we potentially allocate many.
> Out of curiosity, have you measured if the extra memory pressure of this
> GroupId causes any performance change?
> It it has an adverse effect, we can always merge it with OutSecOff, since
> I don't think we need both simultaneously. (or we can make sure that Repl
> always points to a canonical group member).
I haven't measured. If you are interested, I can measure, but I think it
wouldn't make measurable difference. I measured the performance penalty of
a change that increases SymbolBody's size by 8 bytes, and I didn't find no
measurable difference. (And # of symbols > # of sections.)
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-commits