[llvm-dev] PGO GC inefficiency due to link.exe's IMAGE_COMDAT_SELECT_ASSOCIATIVE limitation

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Wed May 26 13:51:29 PDT 2021


On Wed, May 26, 2021 at 12:54 AM Fāng-ruì Sòng via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I have studied the GC state of Windows (IR/frontend) PGO a bit.
> Below is my understanding. Hope folks can clarify it I make a mistake:)
>
> For PGO, we need cnts/data/vals/names sections. cnts/data are main
> metadata sections. vals is for value profiling.
> For Windows, the cnts section is named .lprfc$M and the data section
> is named .lprfd$M.
> The garbage collection story is unfortunate.
>
> If an IMAGE_COMDAT_SELECT_ASSOCIATIVE section defines an external
> symbol, MSVC link.exe may report a spurious duplicate symbol error
> (error LNK2005), even if the associative section would be discarded
> after handling the leader symbol.
> lld-link doesn't have this limitation.
> However, a portable implementation needs to work around MSVC link.exe.
>

Yes, that's pretty much how I remember this working.


> For a COMDAT .lprfd$M, its symbol must be external (linkonce_odr),
> otherwise references to a non-prevailing symbol would cause an error.
> Due to the limitation, .lprfd$M has to reside in its own COMDAT, no
> sharing with .lprfc$M.
>

Yes. I'm sure the instrumented code references the counters, so the
counters must be external. However, if the __profd_ symbol is not
referenced directly from anywhere, perhaps we could make it internal and
have it join the __profc_* comdat to reverse the edge. If there are
explicit __profd_ references from code, this won't work.

Different COMDAT groups mean that the liveness of one .lprfc$M does
> not make its associative .lprfd$M live.
> Since a .lprfd$M may be unreferenced, we have to conservatively assume
> all COMDAT .lprfd$M live.
> Since .lprfc$M input sections parallel .lprfd$M input sections, we
> have to conservatively assume all COMDAT .lprfc$M live.
> For an external symbol, we use a /INCLUDE: directive in .drectve to
> mark it as a GC root.
> As a result, .drectve may have many /INCLUDE: directives, just to work
> around the link.exe limitation.
> IIRC Chrome folks have found that the /INCLUDE: directives can cause
> huge memory usage.
>

Yes, I have wanted to implement an LLD-specific extension for this feature.
The general problem is that symbol names are long, and we should avoid
repeating them in the object file if at all possible. I think llvm.used and
__attribute__((used)) should be compatible with link.exe, but it would be
reasonable to make PGO an LLD-only feature to get these object file size
savings. Specifically, we can copy the design of .llvm_addrsig to mark
symbols as GC roots.


> Note: for ELF we can use R_*_NONE to establish an artificial
> dependency edge between two sections.
> I don't think PE-COFF provides a similar feature.
>

Yep, not so far as I am aware.


> So does link.exe accept feature requests? The external symbol
> limitation in an IMAGE_COMDAT_SELECT_ASSOCIATIVE section looks quite
> unfortunate...
>

I have a few points of contact at Microsoft, but there isn't a good general
way to submit these kinds of low-level ABI feature requests.

The way the feature is designed, it only really works well when the
associative data is static, unreferenced, and discovered only by section
concatenation and iteration. This supports very many use cases: debug info,
unwind info, dynamic initializers, asan global metadata, control flow guard
metadata, others. In the PGO use case, I think we would be violating a
hidden invariant that all the associative data should be static and
unreferenced.


> It seems that /OPT:REF has another inefficiency - it cannot discard
> non-COMDAT sections.
> IIUC Fuchsia folks found that GC on non-COMDAT functions can save a
> lot of space (40%? of PGO metadata sections). It is unfortunate that
> link.exe doesn't discard non-COMDAT sections...
>

This is not a major issue in practice because with -ffunction-sections /
-fdata-sections, everything is in a comdat section, they just all use
IMAGE_COMDAT_SELECT_NODUPLICATES.

I guess I really only have one idea here: find a way to avoid __profd_*
references, and make __profc_ the comdat leader.

----

Separately, I have noticed that PGO greatly inhibits GC for other reasons.
See the example from my last patch: https://reviews.llvm.org/D102818
For most linkonce_odr functions (so basically everything), the __profd_
global references the function, and then __profd_ is added to llvm.used.
This prevents GC from working on any linkonce_odr function, which is bad
for code size, so I removed the feature from most coverage builds.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210526/3a187b06/attachment.html>


More information about the llvm-dev mailing list