[llvm-dev] Contributing Bazel BUILD files similar to gn

Stella Laurenzo via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 29 21:22:45 PDT 2020


On Thu, Oct 29, 2020, 8:19 PM Eric Astor via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I *am* a Googler, though not directly involved with the teams that
> maintain the internal LLVM build. I happen to be a big fan of Bazel - and
> mostly build LLVM with the internal Bazel build, rather than the external
> CMake, because the better caching and remote-build-farm support is such an
> enormous help. (Also, I find the CMake build & build options kind of
> impenetrable.) However, I'm writing this particular email on my personal
> account, with personal resources, well past the close of business; my
> Google hat is firmly on the shelf, and I'm speaking as just an individual
> contributor.
>
> When I first started contributing to LLVM, *I was confused by the GN
> build's existence*. I didn't understand who was supposed to maintain it,
> whether I should use it, what the benefits were... you name it.
>
> I agree with some of the first comments on this thread. I'd suggest that
> we set aside the question of contributing Bazel BUILD files into the LLVM
> repository for now, and start by proposing a general policy around
> alternate/unsupported build systems in relation to the main repository. (GN
> can have an exception if needed.) The fact that the GN build is basically
> working, and doesn't confuse too many people, is a data point - but going
> from 1 alternate build system to 2 seems like a good point to pause and set
> an actual set of constraints and goals. Eventually, someone may want a
> third, and we should know what the guidelines are so we don't hash out the
> decision from scratch again!
>
> I don't think I could draft the RFC in question - I don't have enough
> experience with the community yet to judge what's really needed - but I'd
> be glad to help out with it. The idea should be to minimize the cost (to
> nearly zero) for both experienced LLVM contributors and new LLVM
> contributors. A few requirements I'd suggest, mostly put together from this
> thread:
>
>    - CMake should be able to build (and test!) everything the alternate
>    build system can, at all times.
>    - There must be a clear group who want to maintain the alternate build
>    system.
>    - The alternate build system's files should be isolated in a separate
>    directory, with a README explaining that this is an alternate build system
>    for LLVM, maintained by its own smaller community - and is not supported by
>    the community at large.
>    - The alternate build system must have independent buildbots, which do
>    not email the larger community; people can opt into being emailed about
>    this. (And should, if they're contributing to it!)
>    - If the buildbots are red for an extended time, we should put out a
>    call for maintainers to fix the issues; if not answered in a reasonable
>    time, we shouldn't be afraid to delete the alternate build system.
>
> I *do* also see the argument for the git submodule approach. It looks
> like a .gitmodules file would theoretically let a repository of Bazel BUILD
> files specify exactly which LLVM commit it currently tracks - and you could
> fetch the corresponding updates in both with a single command. I think that
> addresses the main point I noticed brought up on this side of the argument.
> Any RFC here probably needs to present pros & cons of both approaches.
> We'll need to hash those out in general discussion before people start
> looking for consensus, so people understand what they're deciding on.
>

Just one note on this...

But first, I am also a googler, and while I use bazel a lot, I don't see it
being any more than a niche anytime soon for a mainstream project such as
LLVM that has a wide deployment base, many variants/layers, cross
compilation, build/install splits, etc. Bazel just doesn't scale to the
level of differentiation and customization that is exploited for this scale
of an OSS project in the wild. It was born in a much less diverse
environment and carries that legacy forward (and seems like it will
continue to do so for the foreseeable future). And I say that as someone
who likely has enough years of experience in it that I could probably bend
it in those directions if it came down to it... But wouldn't consider it a
valuable use of time.

From what I can tell, people are successful/happy using bazel when their
needs are not so diverse, and when they value org-scale consistency and
scalability of their eng teams. It's not the only way to get that, for
sure. Just a way that some choose, and some of those also choose/need to
take LLVM as a dependency. (I often find it too restrictive and choose
differently myself)

That interpretation would lead to an answer to "why would we do this?":
because it would help those people who use both bazel and LLVM to have an
easier time living at head with LLVM as a dependency. Most of those people
didn't actively choose bazel, and are in the same kind of mode of trying to
minimize their costs for a large piece of dev infra that isn't core to
their business/mission... Same as LLVM with cmake. Google internally and
Google aligned open source projects certainly fall into that category. I
can't speak for others.

As for the costs, I could go either way on whether this should live in the
monorepo. Even segmented into its own directory, the argument regarding the
cost of confusion/churn seems credible to me (even if the cost is deemed
worth it, I do see it as a cost that has merit to consider).

On to my note...

One other cost to consider is that if we have this outside of the monorepo,
and outside of the LLVM organization, we have a contribution barrier up
which firmly entrenches this as a "Google thing", and I don't think that is
a good thing for LLVM as a project... There will be a different committer
pool, different policy enforcement (such as accepting Google's CLA),
different comms channels, etc. Projects, both OSS and private, outside of
Google do use both bazel and LLVM, and it would be best, in my opinion, if
they could source and contribute all of the LLVM bits from the LLVM org,
including second tier build support, where it exists (and we should clearly
cordon this off as some kind of second tier).

In my mind, the best outcomes here involve deciding on a least harmful
place to maintain these second tier build setups, and my preference would
be that they be aligned with the llvm community/org vs on an island. I
don't have an opinion on whether this lands in the monorepo or a secondary
repo for second tier build setups. But I would like to see one of those
outcomes vs keeping this Google aligned/owned.



> Best,
> - Eric
>
> On Thu, Oct 29, 2020 at 10:40 PM Eric Christopher via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Thu, Oct 29, 2020 at 9:44 PM Johannes Doerfert via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> (see below)
>>>
>>>
>>> On 10/28/20 6:18 PM, Geoffrey Martin-Noble via llvm-dev wrote:
>>>  > Hi all,
>>>  >
>>>  > tl;dr: We'd like to contribute Bazel BUILD files for LLVM and MLIR in
>>> a
>>>  > side-directory in the monorepo, similar to the gn build.
>>>  >
>>>  > Some of us have been working on open-source Bazel BUILD files for the
>>> LLVM
>>>  > Project. You may have seen us hanging out in the #build-systems
>>> discord
>>>  > channel. As you may know, Google uses Bazel internally and has
>>> maintained a
>>>  > Bazel BUILD of LLVM for years. Especially with the introduction of
>>> MLIR,
>>>  > we've got more and more OSS projects with a Bazel BUILD depending on
>>> LLVM
>>>  > (e.g. IREE <https://github.com/google/iree> and TensorFlow
>>>  > <https://github.com/tensorflow/tensorflow>). We're also not the only
>>> ones
>>>  > using Bazel: e.g. PlaidML also has a Bazel BUILD of LLVM that they've
>>>  > borrowed from TF
>>>  > <https://github.com/plaidml/plaidml/blob/master/vendor/llvm/llvm.BUILD>.
>>>  > Each of these projects has to jump through some weird hoops to keep
>>> their
>>>  > version of the Bazel BUILD files in sync with the code, which
>>> requires some
>>>  > fragile combination of scripts and human intervention. Instead, we'd
>>> like
>>>  > to move general-purpose Bazel BUILD files into the LLVM Project
>>> monorepo.
>>>  > We expect to follow the model of the GN build where these will be
>>>  > maintained by interested contributors rather than expecting the
>>> general
>>>  > community to maintain them.
>>>  >
>>>  > To facilitate and test this we've been developing a standalone
>>> repository
>>>  > that just has the Bazel BUILD files. It symlinks together the
>>> directory
>>>  > trees on top of a submodule as we would need in the monorepo to
>>> avoid
>>>  > in-tree BUILD files. The configuration is at
>>>  > https://github.com/google/llvm-bazel. We now have those in a good
>>> place and
>>>  > think they would be useful upstream.
>>>  >
>>>  > # Details
>>>  >
>>>  > ## What
>>>  >
>>>  > Bazel BUILD files for the LLVM, MLIR, and Clang (PR out for review
>>>  > <https://github.com/google/llvm-bazel/pull/72>) subprojects,
>>> potentially
>>>  > expanding to others, as needed. Basically everything currently at
>>>  > https://github.com/google/llvm-bazel.
>>>  >
>>>  > ## Where
>>>  >
>>>  > In https://github.com/google/llvm-bazel the BUILD files live in a
>>> single
>>>  > directory tree matching the structure of the overall llvm-project
>>>  > directory. For users, @llvm-project is a single Bazel repository
>>>  > <https://docs.bazel.build/versions/master/build-ref.html#repositories>
>>>  > that includes both LLVM and MLIR subprojects. To maintain this structure,
>>> we
>>>  > would probably want to put a `bazel` directory in the monorepo's utils
>>>  > directory <https://github.com/llvm/llvm-project/tree/master/utils>,
>>> which
>>>  > currently only contains a directory for arcanist. This is different
>>> from
>>>  > gn, which is under the LLVM subproject's utils directory
>>>  > <https://github.com/llvm/llvm-project/tree/master/llvm/utils/gn>. We
>>> could
>>>  > similarly put the Bazel BUILD files under llvm/utils/bazel but have
>>> them be
>>>  > for the entire llvm project (the subsets that are supported). This
>>> seems
>>>  > like an odd structure to me, but I know that the CMake build for LLVM
>>>  > also builds the other subprojects
>>>  > <https://github.com/llvm/llvm-project/blob/529ac33197f6/llvm/tools/CMakeLists.txt#L34-L41>,
>>>  > so maybe this would be preferable.
>>>  >
>>>  > Alternatively we could split each subproject into a separate Bazel
>>>  > repository and put the Bazel build files under each subproject. I
>>> think
>>>  > this fragments the configuration of the BUILD without much benefit.
>>>  >
>>>  > ## Configurations
>>>  >
>>>  > We currently have configurations for Linux GCC and Clang, MacOS GCC
>>> and
>>>  > Clang, and Windows MSVC. Support for other configurations can be added
>>>  > as-desired, but supporting all possible LLVM build configurations is
>>> not
>>>  > the goal.
>>>  >
>>>  > ## Support
>>>  >
>>>  > Support would be similar to the gn build. Contributors could
>>> optionally
>>>  > update the Bazel BUILD files as part of their patches, but would be
>>> under
>>>  > no obligation to do so.
>>>  >
>>>  > ## Preserving History
>>>  >
>>>  > I don't *think* the history of llvm-bazel is interesting enough to
>>> try to
>>>  > merge it into the monorepo and I was planning to submit this as a
>>> single
>>>  > patch, but please let me know if you disagree.
>>>  >
>>>  > ## Benefits to the community
>>>  >
>>>  >    - Projects that depend on LLVM and use the Bazel build system can
>>>  >      avoid duplicating fragile effort. We'll spend more time
>>>  >      contributing to LLVM instead :-D
>>>  >    - Bazel is stricter than CMake in many ways (e.g. it requires that
>>>  >      even header dependencies be declared) and can catch layering
>>>  >      issues very easily. There's even an optional layering_check
>>>  >      feature we could turn on if its use would benefit the community.
>>>  >      (though currently the existing problematic layering makes it a
>>>  >      burden to maintain on our own). Even without that additional
>>>  >      check, as I've been keeping the Bazel build green, I've found and
>>>  >      fixed a number of layering issues in the past couple weeks (e.g.
>>>  >      https://reviews.llvm.org/rGb49787df9a
>>>  >      <https://reviews.llvm.org/rGb49787df9a535f03761c340dca7ec3ec1155133d>
>>>  >      and https://reviews.llvm.org/rGc17ae2916c
>>>  >      <https://reviews.llvm.org/rGc17ae2916ccf45a0c1717bd5f11598cc4fff342a>).
>>>  >
>>>  >
>>>  > Here's a patch <https://reviews.llvm.org/D90352> adding the Bazel
>>> build
>>>  > system. It's basically just `cp -r llvm-bazel/llvm-bazel
>>>  > llvm-project/utils/bazel`.
>>>
>>> Doesn't the last paragraph mean all benefits derived from this can be
>>> described either as:
>>>    (1) users do not need to clone the llvm-bazel git repo but get the
>>>        files in llvm-project, or
>>>    (2) "interested contributors" could send patches to llvm-project
>>>        instead of llvm-bazel to update the bazel build.
>>>
>>>
>> Absolutely. This could happen. The main reason behind this is to make
>> integrating among a number of llvm based projects that use bazel easier
>> (TF and TF-based projects primarily, though it sounds like FB's internal
>> process would be helped as their system is similar to bazel).
>>
>>
>>> TBH, I have no interest in using bazel nor anything against it being
>>> merged per se. I just find it curious that we merge another build system
>>> "at no cost" for the community (I think I picked that up in the thread
>>> but I might have imagined the phrasing). I mean, there is always "a
>>> cost"* so it boils down to determine if the benefit is worth it.
>>>
>>>
>> As far as I can think the cost is...
>>
>>
>>>
>>> * i.a., people will assume we (=the LLVM community) maintain(s) a bazel
>>>    build, which can certainly be a benefit but also a cost, e.g., when
>>>    the build is not properly maintained, support is scarce, etc. and
>>>    emails come in complaining about it (not thinking of prior examples
>>>    here.)
>>>
>>>
>> ... this. If the system becomes a source of problems or user complaints
>> then I think it's absolutely reasonable to remove it.
>>
>> -eric
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>