[llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization

Rafael Auler via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 11 21:40:34 PST 2021


Hi Mehdi and David,

Indeed, we share similar concerns. We do intend to move BOLT’s functionality into a library, but the timeline is unclear. In fact, most of BOLT could live in a library already; it’s just a matter of moving some files into separate components. Instead of living in tools/llvm-bolt, most of the files could be moved under lib/something. We already have an llvm-bolt.cpp file that instantiates the driver coordinating the binary rewriting process, and that driver is the entry point of BOLT as a library. People could already leverage this to use BOLT in different ways (for example, some time ago I wrote a separate utility that runs the driver on two different binaries and compares the results; it was later named boltdiff).

My main reason for committing the project as a whole first (as a project merged into the monorepo, in the same way flang did) is that BOLT has been open source for a while: it is a 6-year-old project with about 800 commits and 50K lines of code, and we know of people who have forked the project and would like to contribute to it. If I commit a different BOLT into LLVM (not just a rebased one), then I (a) break, or make harder, any work that other contributors have built on top of it, and (b) lose the original history or make it harder to preserve. That’s why I was going for a smoother transition. As a developer, I value the ability to run blame and understand why things were built a certain way; not bringing in BOLT’s history (in the same way as flang did) would mean that we and the community lose a lot of context on the project’s decisions. And I guess that’s also the rationale for a monorepo: to have multiple projects merged together.

Because of that, I initially put bolt under /bolt, following flang’s model of merging the history so every developer has the right context. But the original location was under llvm/tools.

From: Mehdi AMINI <joker.eph at gmail.com>
Date: Thursday, March 11, 2021 at 1:32 PM
To: Rafael Auler <rafaelauler at fb.com>
Cc: Andrey Bokhanko <andreybokhanko at gmail.com>, llvm-dev <llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
Subject: Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization


On Wed, Mar 10, 2021 at 7:29 PM Rafael Auler via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi,

We finished rebasing BOLT on top of the LLVM monorepo and verified that the new BOLT performs as expected. To make BOLT work, we have a few changes to LLVM libs, which we will submit for review (the first changes are already up: D97531<https://reviews.llvm.org/D97531>, D97830, D97899, D97898, D97891).

The plan for the initial BOLT commit is to include all its parts under a single directory, either /bolt or /llvm/tools/llvm-bolt. Once complete, this approach will allow people to directly contribute to the project and start using BOLT as part of LLVM. After this phase, we would like to start working with the community to break BOLT into separate components that will make it easier to build new tools based on the BOLT technology. As suggested by Propeller folks, we will split the disassembler component from the rest and make it possible to perform optimizations on low-level binary IR, which will likely have a serializable form.

The proper location of BOLT in the monorepo is still unclear, though. In our rebased branch, we currently live in a /bolt top-level folder in the monorepo, but we are also considering /llvm/tools/llvm-bolt.

We are trying to work out the pros and cons of living in these locations and would appreciate community input. From our understanding, living under the /bolt top-level folder would give BOLT the following advantages:

- More independence to build a test infrastructure for BOLT. We could make check-bolt depend on LLD, for instance, if we need to build binaries on the fly to test BOLT features. Generating test inputs is a big problem for us, since we can't add real-world test binaries into the LLVM repo (which are awkward to track in the repo and also use a lot of space).
- We would share a similarity with other large projects such as flang and lld in location: these projects have their own top-level folder too.
- It would make more sense to live in a top-level folder because we intend to support building multiple tools (llvm-bolt, llvm-boltdiff, perf2bolt, merge-fdata). Living under llvm/tools is typically reserved for simpler single-binary projects.

Living in /llvm/tools/llvm-bolt, on the other hand, is perhaps more aligned with a longer-term goal of migrating BOLT to live as a lib under /llvm/lib and has the following advantages:

- Piggybacking on the LLVM release process, BOLT is released along with other llvm tools
- Piggybacking on buildbots being configured to build llvm tools, the project is more robust and well tested
- BOLT was originally developed to live under tools, and the project was named llvm-bolt to reflect that
- Being closer to LLVM will allow BOLT to migrate functionality more easily to llvm/lib

In general, llvm/tools are supposed to be entry points that exercise the LLVM libraries. I'd be concerned about adding a tool/bolt that contains more than that (i.e. the entire implementation of the framework, instead of having it live in libraries). But it seems like you intend this as a step towards that? Is there a well-defined plan to get there?

Is it difficult / overly involved to split things like the disassembler and other components into libraries that can live in `llvm/lib/...` and use them from tools/bolt/? Can this be done ahead of time, upstreaming these libraries before bolt itself?

Thanks,

--
Mehdi



Any thoughts on this?


From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Rafael Auler via llvm-dev <llvm-dev at lists.llvm.org>
Date: Tuesday, January 26, 2021 at 4:54 PM
To: Andrey Bokhanko <andreybokhanko at gmail.com>, llvm-dev <llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
Subject: Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization
Hi Andrey,

We appreciate your interest and we look forward to collaborating. We are currently rebasing BOLT on top of LLVM trunk. Since it’s been a while since the last rebase, this is a bit of an involved task and we need to work through a rather lengthy list of conflicts. After we finish this and make sure BOLT works on the new repo, we plan to publish the list of commits and the merging diff so the community can evaluate a project merge proposal that works.

Regarding the project organization, remember that BOLT was created before the LLVM monorepo. To address this, we are currently going for an approach similar to the one used by flang: re-editing all of our history on top of a new folder structure (root repo /bolt, similar to /flang), while trying to keep old commits mostly intact so we preserve project history -- I’m happy to change this to whatever makes more sense to the community. The least intrusive way I found to do this was the flang merge approach. Now, because the project is not so small, we need a starting point that works in LLVM trunk, with everything self-contained in /bolt and as few diffs as possible in /llvm, and then from there possibly work on evolving the project toward another suggested organization (such as breaking BOLT up into a lib in llvm/lib). But first we wanted to start with the rebase, which we knew would take some time.

That’s the gist of the current direction, thanks for pinging!

-Rafael

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Andrey Bokhanko via llvm-dev <llvm-dev at lists.llvm.org>
Date: Tuesday, January 26, 2021 at 2:31 AM
To: llvm-dev <llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
Subject: Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization
One more thing (to clarify my interest): my team is working on Golang
support in BOLT, and we're keen to open-source our developments
(pending approvals from the higher-ups). We would much prefer to
contribute our code to the LLVM project.

On Tue, Jan 26, 2021 at 1:26 PM Andrey Bokhanko
<andreybokhanko at gmail.com> wrote:
>
> Hi Maksim,
>
> Any updates on adding BOLT to LLVM?
>
> If you need any help / support, feel free to ask. The World is waiting
> for BOLT! :-)
>
> Yours,
> Andrey
> ===
> Director
> Advanced Software Technology Lab
> Huawei
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev