[llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 11 22:05:17 PST 2021


On Thu, Mar 11, 2021 at 9:40 PM Rafael Auler <rafaelauler at fb.com> wrote:

> Hi Mehdi and David,
>
>
>
> Indeed, we share similar concerns. We do intend to move functionality of
> BOLT to live as a library, but the timeline is unclear. In fact, most of
> BOLT could live in a library already, it’s just a matter of moving some
> files into separate components. Instead of the files living in
> tools/llvm-bolt, most could just be moved under lib/something, and we
> already have a llvm-bolt.cpp file that instantiates the driver that
> coordinates the binary rewriting process, which is the entry point of BOLT
> as a library. People could already leverage this to use BOLT in different
> ways (for example, I wrote some time ago a different utility that runs the
> driver for two different binaries and compares the two – this was named
> boltdiff later).
>
>
>
> My main reason for committing the project as a whole first (as a project
> merged into the monorepo, in the same way as flang did) is that BOLT has
> been open source for a while: it is a 6-year-old project with about 800
> commits and 50K lines of code, and we know we have people who forked the
> project and would like to contribute to it. If I commit into LLVM a
> different BOLT (not just rebased), then I (a) break, or make harder, any
> work built on top of it by other contributors, and (b) lose the original
> history or make it harder to preserve. That’s why I was going for a
> smoother transition. I, as a developer, put value in the ability to blame
> and to understand why things were built a certain way, and not bringing
> BOLT’s history (in the same way as flang did) would mean we and the
> community lose a lot of context on the decisions of the project. And I
> guess that’s also the rationale for a monorepo, to have multiple projects
> merged together.
>
>
>
> Because of that, I initially put bolt under /bolt, following flang’s model
> of merging the history so every developer has the right context. But the
> original location was under llvm/tools.
>

That makes sense, but what is unclear to me is that refactoring it into
separate libraries in-tree right after merging it will also "break any work
on top of it" from people who forked it, wouldn't it? How would this be
managed after BOLT gets in-tree?

I guess a first step could be to produce a "snapshot" of the monorepo after
you rebase, so that folks can look at the actual proposal and the code
structure, discuss the actual modifications that would be required
pre-merge, and agree on the plan post-merge. How does that sound to you?

Best,

-- 
Mehdi





>
>
> *From: *Mehdi AMINI <joker.eph at gmail.com>
> *Date: *Thursday, March 11, 2021 at 1:32 PM
> *To: *Rafael Auler <rafaelauler at fb.com>
> *Cc: *Andrey Bokhanko <andreybokhanko at gmail.com>, llvm-dev <
> llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
> *Subject: *Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis,
> Transformation, and Optimization
>
>
>
>
>
> On Wed, Mar 10, 2021 at 7:29 PM Rafael Auler via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> We finished rebasing BOLT on top of the LLVM monorepo and we verified that
> the new BOLT is performing as expected. To make BOLT work, we have a few
> changes to LLVM libs, which we will submit for review (first changes are
> already up: D97531 <https://reviews.llvm.org/D97531>, D97830, D97899,
> D97898, D97891, D97830).
>
> The plan for the initial BOLT commit is to include all its parts under a
> single directory, either /bolt or /llvm/tools/llvm-bolt. Once complete,
> this approach will allow people to directly contribute to the project and
> start using BOLT as part of LLVM. After this phase, we would like to start
> working with the community to break BOLT into separate components that will
> make it easier to build new tools based on the BOLT technology. As
> suggested by Propeller folks, we will split the disassembler component from
> the rest and make it possible to perform optimizations on low-level binary
> IR, which will likely have a serializable form.
>
> The proper location of BOLT in the monorepo is still unclear, though. In
> our rebased branch, we are currently in a /bolt top-level folder in the
> monorepo, but are also considering /llvm/tools/llvm-bolt.
>
> We are trying to work out the pros and cons of living in these locations
> and would appreciate community input. From our understanding, living under
> the /bolt top-level folder would give BOLT the following advantages:
>
> - More independence to build a test infrastructure for BOLT. We could make
> check-bolt depend on LLD, for instance, if we need to build binaries on the
> fly to test BOLT features. Generating test inputs is a big problem for us,
> since we can't add real-world test binaries into the LLVM repo (which are
> awkward to track in the repo and also use a lot of space).
> - We would share a similarity with other large projects such as flang and
> lld in location: these projects have their own top-level folder too.
> - It would make more sense to live in a top-level folder because we intend
> to support building multiple tools (llvm-bolt, llvm-boltdiff, perf2bolt,
> merge-fdata). Living under llvm/tools is typically reserved for simpler
> single-binary projects.
>
> Living in /llvm/tools/llvm-bolt, on the other hand, is perhaps more
> aligned with a longer-term goal of migrating BOLT to live as a lib under
> /llvm/lib and has the following advantages:
>
> - Piggybacking on the LLVM release process, BOLT is released along with
> other llvm tools
> - Piggybacking on buildbots being configured to build llvm tools, the
> project is more robust and well tested
> - BOLT was originally developed to live under tools, and the project was
> named llvm-bolt to reflect that
> - Being closer to LLVM will allow BOLT to migrate functionality more
> easily to llvm/lib
>
>
>
> In general llvm/tools are supposed to be entry points that exercise the
> LLVM libraries. I'd be concerned about adding a tool/bolt that contains
> more than that (i.e. the entire implementation of the framework, instead of
> having it live in libraries). But it seems like you intend this as a step
> towards this? Is there a well-defined plan to get there?
>
>
>
> Is it difficult / overly involved to split things like the disassembler
> and other components into libraries that can live in `llvm/lib/...` and use
> them from tools/bolt/? Can this be done ahead of time, upstreaming these
> libraries first, ahead of BOLT itself?
>
>
>
> Thanks,
>
>
>
> --
>
> Mehdi
>
>
>
>
>
>
> Any thoughts on this?
>
>
>
>
>
> *From: *llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Rafael
> Auler via llvm-dev <llvm-dev at lists.llvm.org>
> *Date: *Tuesday, January 26, 2021 at 4:54 PM
> *To: *Andrey Bokhanko <andreybokhanko at gmail.com>, llvm-dev <
> llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
> *Subject: *Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis,
> Transformation, and Optimization
>
> Hi Andrey,
>
>
>
> We appreciate your interest and we look forward to collaborating. We are
> currently rebasing BOLT on top of LLVM trunk. Since it’s been a while since
> the last rebase, this is a bit of an involved task and we need to work
> through a rather lengthy list of conflicts. After we finish this and make
> sure BOLT works on the new repo, we plan to publish the list of commits and
> the merging diff so the community can evaluate a project merge proposal
> that works.
>
>
>
> Regarding the project organization, remember that BOLT was created before
> the LLVM monorepo. To address this, we are currently going for an approach
> similar to the one used by flang, re-editing all of our history on top of
> a new folder structure (root repo /bolt, similar to /flang), but trying to
> keep old commits mostly intact so we preserve project history -- I’m happy
> to change this to whatever makes more sense to the community. The least
> intrusive way to do this that I found was the flang merge approach. Now,
> because the project is not so small, we need a starting point that works
> in LLVM trunk, with everything self-contained in /bolt and as few diffs as
> possible in /llvm, and then from there possibly work on evolving the
> project toward another suggested organization (such as breaking up BOLT
> into a lib in llvm/lib). But first we wanted to start with the rebase that
> we knew would take some time.
>
>
>
> That’s the gist of the current direction, thanks for pinging!
>
>
>
> -Rafael
>
>
>
> *From: *llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Andrey
> Bokhanko via llvm-dev <llvm-dev at lists.llvm.org>
> *Date: *Tuesday, January 26, 2021 at 2:31 AM
> *To: *llvm-dev <llvm-dev at lists.llvm.org>, Maksim Panchenko <maks at fb.com>
> *Subject: *Re: [llvm-dev] [RFC] BOLT: A Framework for Binary Analysis,
> Transformation, and Optimization
>
> One more thing (to clarify my interest): my team is working on Golang
> support in BOLT, and we're keen to open-source our developments
> (pending approvals from the higher-ups). We would much prefer to
> contribute our code to the LLVM project.
>
> On Tue, Jan 26, 2021 at 1:26 PM Andrey Bokhanko
> <andreybokhanko at gmail.com> wrote:
> >
> > Hi Maksim,
> >
> > Any updates on adding BOLT to LLVM?
> >
> > If you need any help / support, feel free to ask. The World is waiting
> > for BOLT! :-)
> >
> > Yours,
> > Andrey
> > ===
> > Director
> > Advanced Software Technology Lab
> > Huawei
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev