[PATCH] D109192: [WIP/DNM] Support: introduce public API annotation support

Fri Sep 3 14:54:49 PDT 2021

compnerd added a comment.

In D109192#2983146 <https://reviews.llvm.org/D109192#2983146>, @aganea wrote:

> In D109192#2983041 <https://reviews.llvm.org/D109192#2983041>, @compnerd wrote:
>
>> In D109192#2982999 <https://reviews.llvm.org/D109192#2982999>, @rnk wrote:
>>
>>> In D109192#2982723 <https://reviews.llvm.org/D109192#2982723>, @compnerd wrote:
>>>
>>>> Some additional questions:
>>>>
>>>> - should we be encouraging of exporting entire classes or should we be more judicious in the visibility (e.g. attribute members)?  I'm not 100% certain that this works for non-dll-interface targets.
>>>
>>> I think you can give entire classes default visibility, and that will apply to all members. I think we should annotate classes because it will reduce the total number of annotations, and that will reduce the churn. For large, key classes that have too many private methods (Sema, ASTContext?) we can go back later and export only the public APIs, or something like that.
>>
>> This is sort of the direction that I had in mind.  The problem is, there are cases where I cannot get the desired behaviour correct.  Doing those where it doesn't just work seems like an easier path and one that could be refined later.  Its a bit unfortunate, but seems like a decent trade off.
>
> There's a 2^16 symbol limit per DLL on Windows, so exporting entire classes would potentially reach that limit much faster. From past experience, we started our DLL targets with class annotations, only to reach that limit, and fall back to function-only annotation later. It's a bit more tedious to mark functions independently, but also avoids the performance issues mentionned below. Exporting an entire class could introduce, by inadvertance, indirections at the caller site.

Yes, though, my reasoning for not wanting to annotate the non-inline functions only is less the churn and more the worry about the difference wrt vftable emission.  Although, now that I think about it, IIRC, MS ABI will always universally emit the vftable and so this point is moot - we should fallback to the anchors for the vtable emission for itanium targets.  Although, I think we might want to consider creating a macro for this now and eliding the anchor for MS ABI - that would save us a potentially large set of unnecessary exported symbols that do nothing on Windows.

> There's a also build time cost for exporting (too) many symbols, but maybe that less of a concern here (because it would only affect LLVM developers using this specific configuration).

I think that there is another aspect beyond build time costs - it inhibits potential optimizations with LTO.  Anything that can be fully internalized is a win in terms of potential LTO and DCE options.

> Regarding the API boundaries, or how to wisely chose those boundaries: I am wondering if there's a way to make use of the whole program analysis, to find the performance-critical paths. Any function in that path shouldn't be exported, to avoid the DLL thunk indirection I was mentionning. Ideally we should only export "high level" functions and keep performance-critical code internal.

That would be nice to do as a follow up I think.  I don't think that I could take that on, but would love to see it happen.

> I am also wondering how does this fit with the llvm-busybox <https://lists.llvm.org/pipermail/llvm-dev/2021-June/151321.html> proposal? That goes in a completly different direction, by coalescing all of llvm/clang/lld in a single executable. We could simply go along those lines: a single DLL per "project" (clang, llvm, lld) with the existing bintools as executable wrappers on the top of that. Is that the single-DSO config that @rnk mentions? Doing that seems almost a no-brainer in terms of defining the API boundaries.

I don't think that the multicall binary is particularly great, but, yes, this will solve the problem of the size at least - all of the tools should be effectively in a single DLL.  The frontends would be tiny.  I am not hardset on the original idea of 2/3 DLLs for LLVM and 1/2 DLLs for clang.  I could be convinced that 2 DLLs for LLVM (Support and LLVM) and 1 for clang is a decent enough split.  My intuition is that 3 for LLVM and 1/2 for Clang would be a pretty good compromise (LLVMSupport, LLVMTooling, LLVMCompiler for LLVM and potentially just Clang for clang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109192/new/

https://reviews.llvm.org/D109192