[llvm-dev] Consider making build directories of the buildbots available

via llvm-dev llvm-dev at lists.llvm.org
Fri Apr 16 02:57:34 PDT 2021


A few points:

  *   This doesn't need to be an official LLVM distribution, it can just be a thing that exists in good faith that you can download if you want but comes with no guarantees of any kind. A colleague who I was speaking to partially likened it to how people use Homebrew and install packages compiled by "someone". Package managers do implement additional hash checks, etc. but are not very different from what I am suggesting.
  *   We can leave it up to the user to make sure that they have the same version of Clang, CMake, etc. as the buildbot that compiled the code (the buildbots can store this information in a text file and give it to the users). Like I said, this would be a thing that exists with no guarantees of any kind. My idea is to just give people the resources that they need in order to contribute to LLVM and cannot get without spending money.
  *   While I do have access to SN-DBS, I'm not talking about myself personally here. People cannot be expected to have access to distributed build systems. Most university students for example will have thin and light ultrabooks that are easy to carry to lectures and most definitely won't be able to compile LLVM from scratch.
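
As a rough illustration of the "text file" idea in the second point, here is a minimal Python sketch of how a user could compare a buildbot's recorded tool versions against their own. The `tool=version` manifest format, the function names, and the version numbers are all assumptions made up for this example; nothing like this exists on the buildbots today.

```python
def parse_manifest(text):
    """Parse 'tool=version' lines; blank lines and '#' comments are skipped.

    The manifest format is hypothetical, invented for this sketch.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tool, _, version = line.partition("=")
        versions[tool.strip()] = version.strip()
    return versions


def toolchain_mismatches(expected, local):
    """Return {tool: (expected, local_or_None)} for every tool that differs."""
    return {
        tool: (want, local.get(tool))
        for tool, want in expected.items()
        if local.get(tool) != want
    }


if __name__ == "__main__":
    # Illustrative version numbers only.
    buildbot = parse_manifest("# hypothetical manifest\nclang=12.0.0\ncmake=3.19.2\n")
    mine = {"clang": "12.0.0", "cmake": "3.20.1"}
    for tool, (want, have) in toolchain_mismatches(buildbot, mine).items():
        print(f"{tool}: buildbot used {want}, local is {have}")
```

Any mismatch reported this way would simply be a hint that the downloaded object files may not be reusable, in keeping with the "no guarantees" spirit above.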


From: Philip Reames <listmail at philipreames.com>
Sent: 14 April 2021 23:39
To: Alexandre Ganea <alexandre.ganea at ubisoft.com>; Omer, Nabeel <Nabeel.Omer at sony.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Consider making build directories of the buildbots available

One bit of warning, build trees tend to have relative paths baked in.  From experience, trying to copy an llvm build tree from one location to another tends to not work well.

The best workaround I've seen for this type of thing is to use docker (or your container format of choice) to construct a "build image" with the full source and build trees, then update the source and trigger an (incremental) build.

After that's out of the way, you run into two major problems quickly.

1) Such build images tend to be gigantic.  (50+ GB)  Bandwidth costs, storage costs, and download times add up *quickly*.
2) Their value as incremental build sources tends to age very quickly.  If you look at the commits to llvm, core headers get changed shockingly often.  As such, if you're trying to follow ToT you quickly end up doing what is effectively a clean build anyways.  The only workflow which "somewhat works" is to develop against some recent snapshot, then rebase only at the very end and pay the cost of a full build.

I've played enough with ideas like this in the past to recommend you not go down this path.

An alternate approach I recommend - and use personally - is to do all builds on an AWS instance.  With some basic scripting (here's mine<https://github.com/preames/llvm-aws-builder>), you can do fast builds for a couple of dollars a day.  I've been working this way for about 6 months, and have found it dramatically easier than all the options I played with before.

On 4/14/21 1:53 PM, Alexandre Ganea via llvm-dev wrote:
That's an interesting idea. There are several issues to consider:

  1.  The build needs to be deterministic, if we want to share .OBJ files among LLVM developers. In essence, that means fully implementing [1]. I'm not sure how much of the LLVM codebase supports that.
Just the toolchain part is challenging to implement. We would need to ensure all users are using precisely the same compiler, the same linker, same cmake, same platform SDKs, etc. in order to expect good cache hits.
  2.  There's the security consideration. If anyone is to pull the cached .OBJ files, they need to "trust" these .OBJ files in the first place. That means maybe restricting the list of cache "publishers" to the LLVM GitHub group, and signing the .OBJs with a private key or something along those lines.
  3.  There's a file size consideration. My build folder for only { llvm, clang, lld } is about 40 GB. When using the ThinLTO cache, that goes over 100 GB. Still, build artefacts compress quite well (3:1 at least), and you'd probably pay the network price only once, then cache hits would be incremental.
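
To make the trust point (item 2) a bit more concrete, below is a minimal Python sketch of the weaker, hash-only half of that idea: checking a downloaded object file against a published SHA-256 digest. This is not a signature scheme. It assumes the digest list itself was obtained from a trusted (for example, signed) source, and the `published_digests` mapping and function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_digests):
    """Check a file against a {path: hex_digest} mapping (hypothetical format).

    Files that are not listed at all are rejected.
    """
    expected = published_digests.get(path)
    if expected is None:
        return False
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

A check like this only detects corruption or tampering in transit; deciding who is allowed to publish the digests is still the hard part described above.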

I think the same idea could apply for distributing the compilation. It'd be interesting to have a public LLVM distributed compilation service. But how can we trust it? Even if we only compile on "trusted" machines, there's still a risk of attack. That's probably why these caching/remote compilation systems are only used inside an organization, which can guarantee trust (somehow). Since you're at Sony, perhaps you have access to an internal SN-DBS pool? [2]

[1] https://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html
[2] https://www.snsystems.com/tech-blog/2014/01/06/building-with-the-network/

From: llvm-dev <llvm-dev-bounces at lists.llvm.org<mailto:llvm-dev-bounces at lists.llvm.org>> On Behalf Of via llvm-dev
Sent: April 14, 2021 3:31 PM
To: llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Consider making build directories of the buildbots available

Ah, apologies, I think I wasn't clear. I am talking about making object files available in this manner so that people can download them and compile just their changes without having to compile all of LLVM, thus reducing the barrier to entry. As far as I am aware, the releases on GitHub do not contain object files.

Nabeel Omer

From: llvm-dev <llvm-dev-bounces at lists.llvm.org<mailto:llvm-dev-bounces at lists.llvm.org>> On Behalf Of Neil Nelson via llvm-dev
Sent: 14 April 2021 20:05
To: llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Consider making build directories of the buildbots available

Perhaps these pages may help.



Neil Nelson
On 4/14/21 12:53 PM, via llvm-dev wrote:
Hi LLVM devs,

As you are already aware, performing a clean build of LLVM requires considerable computing resources. This presents a barrier to entry for people who do not have access to large computers. Since the buildbots already regularly compile the LLVM codebase, making tarballs of their build directories available on a public facing server will dramatically reduce the barrier to entry. Is this something that the community is willing to consider?

Nabeel Omer


LLVM Developers mailing list

llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>


