[llvm-dev] [RFC] Modernize CMake LLVM "Components"/libLLVM Facility

Stella Laurenzo via llvm-dev llvm-dev at lists.llvm.org
Mon Jan 4 12:17:43 PST 2021


On Mon, Jan 4, 2021 at 12:03 PM Tom Stellard <tstellar at redhat.com> wrote:

> On 1/4/21 11:41 AM, Stella Laurenzo wrote:
> >
> >
> > On Mon, Jan 4, 2021 at 11:04 AM Tom Stellard <tstellar at redhat.com> wrote:
> >
> >     On 1/3/21 1:49 PM, Stella Laurenzo via llvm-dev wrote:
> >      > Hi folks, happy new year!
> >      >
> >      > *Proposal:*
> >      >
> >      >   * See comments at the top of LLVMComponents.cmake
> >      >     <https://github.com/stellaraccident/llvm-project/blob/newcomponents/llvm/cmake/modules/LLVMComponents.cmake>
> >      >     in my fork
> >      >     <https://github.com/stellaraccident/llvm-project/tree/newcomponents>.
> >      >   * Draft phab: https://reviews.llvm.org/D94000
> >      >
> >      >
> >      > *Background:*
> >      > As I've been working on NPCOMP
> >      > <https://github.com/llvm/mlir-npcomp> trying to come up with a
> >     release
> >      > flow for MLIR derived Python projects (see py-mlir-release
> >      > <https://github.com/stellaraccident/mlir-py-release>), I've
> >     repeatedly
> >      > run into issues with how the LLVM build system generates shared
> >      > libraries. While the problems have been varied, I pattern match
> >     most of
> >      > them to a certain "pragmatic" nature to how
> >     components/libLLVM/libMLIR
> >      > have come to be: in my experience, you can fix most individual
> >     dynamic
> >      > linkage issues with another work-around, but the need for this
> >     tends to
> >      > be rooted in a lack of definition and structure to the libraries
> >      > themselves, causing various kinds of problems and scenarios that
> >     don't
> >      > arise if developed to stricter standards. (This isn't a knock on
> >     anyone
> >      > -- I know how these things tend to grow. My main observation is
> >     that I
> >      > think we have outgrown the ad-hoc nature of shared libraries in
> >     the LLVM
> >      > build now).
> >      >
> >      > I think I'm hitting this because reasonable Python projects and
> >     releases
> >      > presuppose a robust dynamic linkage story. Also, I use Windows
> >     and am
> >      > very aware that LLVM basically does not support dynamic linking on
> >      > Windows -- and cannot without more structure (and in my
> >     experience, this
> >      > structure would also benefit the robustness of dynamic linking on
> >     the
> >      > others).
> >      >
> >      > Several of us got together to discuss this in November
> >      > <https://llvm.discourse.group/t/meeting-notes-mlir-build-install-and-shared-libraries/2257>.
> >
> >      > We generally agreed that BUILD_SHARED_LIBS was closer to what we
> >     wanted
> >      > vs libLLVM/libMLIR, but the result is really only factored for
> >      > development (i.e. not every add_library should result in a shared
> >     object
> >      > -- the shared library surface should mirror public interface
> >     boundaries
> >      > and add_library mirrors private boundaries). The primary
> difference
> >      > between the two is:
> >      >
> >      >   * BUILD_SHARED_LIBS preserves the invariant that every
> translation
> >      >     unit will be "homed" in one library at link time (either
> >     .so/.dll or
> >      >     .a) and the system will never try to link together shared and
> >     static
> >      >     dependencies of the same thing (which is what libLLVM/libMLIR
> do
> >      >     today). It turns out that this is merely a good idea on most
> >      >     platforms but is the core requirement on native Windows
> >     (leaving out
> >      >     mingw, which uses some clever and dirty tricks to try to
> >     blend the
> >      >     worlds).
> >      >   * LLVM_BUILD_LLVM_DYLIB treats libLLVM.so as a "bucket" to throw
> >      >     things that might benefit from shared linkage, but end
> >     binaries end
> >      >     up also needing to link against the static libraries in case
> >     what
> >      >     you want isn't in libLLVM.so. When this is done just right,
> >     it can
> >      >     work (on Unix) but it is very fragile and prone to multiple
> >      >     definition and other linkage issues that can be extremely
> hard to
> >      >     track down.
> >      >
> >      > *What I did:*
> >      >
> >      >  1. Well, first, I tried looking the other way for a few months
> and
> >      >     hoping someone else would fix it :)
> >      >  2. When I started trying to generalize some of the shared library
> >      >     handling for MLIR and NPCOMP, I noted that the
> >     LLVM_LINK_COMPONENTS
> >      >     (as in named groups of things) are in the right direction of
> >     having
> >      >     a structure to the libraries, and I found that I could
> actually
> >      >     rebase all of what the LLVM_LINK_COMPONENTS was trying to do
> >     on the
> >      >     same facility, relegating the existing LLVM_LINK_COMPONENTS
> to a
> >      >     name normalization layer on top of a more generic "LLVM
> >     Components"
> >      >     facility that enforces stricter layering and more control
> >     than the
> >      >     old libLLVM.so facility did.
> >      >  3. I rewrote it twice to progressively more modern CMake and was
> >     able
> >      >     to eliminate all of the ad-hoc dependency tracking in favor of
> >      >     straight-forward use of INTERFACE libraries and
> >     $<TARGET_PROPERTY>
> >      >     generator expressions for selecting static or dynamic
> component
> >      >     trees based on global flags and the presence (or absence) of
> >      >     per-executable LLVM_LINK_STATIC properties
> >      >      1. Note that since this is rooted only in CMake features and
> not
> >      >         LLVM macros, out of tree, non-LLVM projects should be
> able to
> >      >         depend on LLVM components in their own targets.
> >      >  4. I hacked up AddLLVM/LLVM-Build/LLVM-Config to (mostly) use
> >     the new
> >      >     facility (leaving out a few things that can be fixed but
> aren't
> >      >     conceptual issues), applied a bunch of fixes to the tree that
> >     were
> >      >     revealed by stricter checks and got all related tests passing
> for
> >      >     LLVM and MLIR (on X86 -- some mechanical changes need to be
> >     made to
> >      >     other targets) for both dynamic and static builds.
> >      >
> >      > *What I'd like to do:*
> >      >
> >      >   * Get some consensus that we'd like to improve things in this
> >     area and
> >      >     that the approach I'm taking makes sense. I can do a lot of
> the
> >      >     work, but I don't want to waste my time, and this stuff is
> >     fragile
> >      >     if we keep it in an intermediate state for too long (I'm
> already
> >      >     paying this price downstream).
> >      >   * Land LLVMComponents.cmake
> >      >     <https://github.com/stellaraccident/llvm-project/blob/newcomponents/llvm/cmake/modules/LLVMComponents.cmake>
> >      >     as the basis of the new facility.
> >
> >     Do you have a proposed list of components yet for LLVM?
> >
> >      >   * Finish implementing the "Redirection" feature that would
> >     allow us to
> >      >     emulate an aggregate libLLVM as it is today.
> >      >   * Start pre-staging the various stricter constraints to the
> >     build tree
> >      >     that will be needed to swap AddLLVM to use the new facility.
> >      >   * Rewrite component-related AddLLVM/LLVM-Build/LLVM-Config bits
> >     in a
> >      >     more principled way to use the new facility (or remove
> features
> >      >     entirely that are no longer needed) -- what I did in the
> >     above patch
> >      >     was just a minimal amount of working around for a POC.
> >      >   * Agree on whether we should try to have the two co-exist for a
> >     time
> >      >     or do a more clean break with the old.
> >      >   * Start applying the facility to downstream projects like MLIR
> >     and NPCOMP.
> >      >
> >
> >     It sounds like what you are proposing is BUILD_SHARED_LIBS=ON but
> with
> >     fewer total libraries, is this an accurate summary?
> >
> >
> > I think that is a reasonable summary for the level that most people care
> > about. It might be a bit pedantic, but what I'm aiming for is for us to
> > be able to define the shared library set to correspond with our notion
> > of component boundaries (follows public APIs), as that is what opens up
> > the ability to optimize them in the future (BUILD_SHARED_LIBS is just a
> > 1:1 add_library call -> shared library approach and leaks a lot of
> > private boundaries). Also, it preserves the ability for executables to
> > choose to link statically or dynamically, which is important for some
> > things (and likely will remain so, especially when considering
> downstream).
> >
>
> As part of this change, were you planning to explicitly define what the
> public APIs are for LLVM?  Currently, we just define this as
> 'everything' which is not great.  It would be a nice improvement if we
> could limit the number of exported symbols.  In addition to improving
> shared library performance, a smaller API would mean fewer fixes we have
> to reject from the stable branch due to API changes.
>

I had planned to make it more possible to do this at the granularity of a
component by use of a new library option (EXPORT_EXPLICIT_SYMBOLS). Then we
could crank through and tighten things up where appropriate. Doing it in
one step seems a bit too herculean for me, but I would like to lay down a
path to get there. Currently, the target components kind of do this by way
of an explicit check if building LIBLLVM and then setting visibility to
hidden for everything in the lib/Targets directory. Since these are
visibility-safe, I would remove this carve-out and just mark the component
libraries with EXPORT_EXPLICIT_SYMBOLS. We could then extend this pattern
to other components in less of an ad-hoc fashion than what we do now.
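
As a rough illustration of what "explicit" exports mean at the source level
(this is not taken from the patch; DEMO_API and the demo namespace are
hypothetical names, and the EXPORT_EXPLICIT_SYMBOLS CMake option itself is
not shown), a component built with hidden default visibility would annotate
only its public entry points:

```cpp
#include <cassert>

// DEMO_API is a hypothetical export macro for this sketch, not an LLVM macro.
// A real Windows build would also need a dllimport variant for consumers.
#if defined(_WIN32)
#define DEMO_API __declspec(dllexport)
#elif defined(__GNUC__)
#define DEMO_API __attribute__((visibility("default")))
#else
#define DEMO_API
#endif

namespace demo {
namespace detail {
// Private helper: no annotation, so it is hidden when the library is
// compiled with -fvisibility=hidden (and is not exported from a DLL).
int clampTo(int value, int lo, int hi) {
  assert(lo <= hi && "invalid clamp range");
  return value < lo ? lo : (value > hi ? hi : value);
}
} // namespace detail

// Public entry point: explicitly annotated, so it stays in the exported set.
DEMO_API int saturatingAdd(int a, int b) {
  return detail::clampTo(a + b, 0, 255);
}
} // namespace demo
```

Built as a shared object with GCC or Clang using -fvisibility=hidden, only
saturatingAdd should remain in the dynamic symbol table, while clampTo stays
private to the library. That is the per-component tightening I have in mind.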


>
> >     I would prefer for any large change like this that we do not add any
> >     net
> >     new configuration options (meaning if we add a new option we should
> >     remove an old one) to LLVM as we already have too many.  Would this be
> >     able to replace BUILD_SHARED_LIBS=ON?
> >
> >
> > Completely agree in the end state. I would like to converge on one
> > configuration option that enables shared linking and then remove the
> > others. I suspect that downstreams may want to customize things a bit
> > more, but we should avoid adding those options to the extent possible in
> > favor of seeing if we can make the default way workable before
> fragmenting.
> >
> > Note that BUILD_SHARED_LIBS is a published way in the CMake ecosystem to
> > tell a project to build in shared library mode. If we get this all
> > fixed, we may still want to recognize when users set it and do the right
> > thing (i.e. make it more of an alias). This viewpoint would argue for
> > removing LLVM_BUILD_LLVM_DYLIB and just supporting BUILD_SHARED_LIBS
> > (but with new behavior). Either way, we should keep the variants to a
> > minimum.
> >
>
> I would be in favor of having BUILD_SHARED_LIBS being the only shared
> library related option that we support, if it produced the new behavior
> you described (and also libLLVM.so).  I know some people (not me though)
> use BUILD_SHARED_LIBS, because it reduces the build times when just
> changing a single file, so I think we would need to make sure that
> anything that replaces it does not regress build times too much.
>

+1 - I can verify but I think it will end up being ok. The fan out from
library -> component tends to not be more than 3-5x, and the largest
components link to ~10MiB. For the big ones that are already visibility
controlled, it *may* turn out to be a net savings in link time because
currently, when doing fine grained linking, way too much gets exported.
We'll see, but I suspect that the worst case will not be too bad and is still
an order of magnitude less than the full static link that drives the
current costs.


>
> -Tom
>
> >
> >     - Tom
> >
> >      > *What I would need:*
> >      >
> >      >   * Help, testing and expertise. I am reasonably confident in my
> >      >     understanding of how to make shared libraries work and how to
> use
> >      >     CMake, but the legacy in LLVM here is deep -- I likely pattern
> >      >     matched some old features as no longer needed when they
> >     actually are
> >      >     (I am not clear at all on how much of LLVM-Config is still
> >     relevant).
> >      >   * Pointers to who the stakeholders are that I should be
> >     coordinating with.
> >      >
> >      > Comments?
> >      >
> >      > Thanks!
> >      > - Stella
> >      >
> >      > _______________________________________________
> >      > LLVM Developers mailing list
> >      > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> >      > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >      >
> >
>
>