[llvm-dev] [RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 11 23:33:22 PST 2021


Dropping Bolt to the top level directory sounds reasonable, but perhaps a
hybrid approach similar to what Mehdi mentioned can be applied.
Basically, Bolt first goes through a round of refactoring in its GitHub
upstream, with a design that is close to the future structure in LLVM, and
then drops in as a monolithic piece initially. This will make future
restructuring much easier. There are other benefits: 1) it is a good
opportunity to clean up Bolt's internal APIs; 2) it is time to beef up
unit tests; 3) it makes code review easier.

David

On Thu, Mar 11, 2021 at 10:34 PM Chris Lattner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Mar 11, 2021, at 9:40 PM, Rafael Auler via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
> Hi Mehdi and David,
>
> Indeed, we share similar concerns. We do intend to move functionality of
> BOLT to live as a library, but the timeline is unclear. In fact, most of
> BOLT could live in a library already, it’s just a matter of moving some
> files into separate components. Instead of the files living in
> tools/llvm-bolt, most could just be moved under lib/something, and we
> already have a llvm-bolt.cpp file that instantiates the driver that
> coordinates the binary rewriting process, which is the entry point of BOLT
> as a library. People could already leverage this to use BOLT in different
> ways (for example, I wrote some time ago a different utility that runs the
> driver for two different binaries and compares the two – this was named
> boltdiff later).
>
> My main reason for committing the project as a whole first, in the same
> way as flang did (as a project merged into the monorepo), is that BOLT
> has been open source for a while. It is a 6-year-old project with about
> 800 commits and 50K lines of code, and we know we have people who forked
> the project and would like to contribute to it. If I commit into LLVM a
> different BOLT (not just rebased), then I (a) break or make it hard for
> any work on top of it from other contributors, (b) lose the original
> history or make it harder to preserve it.  That’s why I was going for a
> smoother transition. I, as a developer, put value in the ability to blame
> and to understand why things were built a certain way, and not bringing
> BOLT’s history (in the same way as flang did) would mean we and the
> community lose a lot of context on the decisions of the project. And I
> guess that’s also the rationale for a monorepo, to have multiple projects
> merged together.
>
> Because of that, I initially put bolt under /bolt, following flang’s model
> of merging the history so every developer has the right context. But the
> original location was under llvm/tools.
>
>
> As with others, I’m not very aware of the internal architecture of bolt,
> so take this with a grain of salt:
>
> From what I understand, I have a slight preference for starting this out
> as a /bolt top level “subproject”, because the code currently sounds
> monolithic.  As the implementation logic is refactored into more reusable
> units, those libraries can be cleanly moved within the monorepo, e.g. under
> the llvm-project/llvm directory if appropriate.
>
> The advantage of doing this is that nothing in the llvm-project/llvm repo
> can come to depend on the bolt code until and if it gets refactored.  This
> is also how things like LLDB started out (and it would be great for more of
> the reusable libraries in LLDB to be merged into LLVM over time).
>
> Does anyone have any concerns about this approach?
>
>
>
> Unrelatedly, I’d also love to see the llvm repository exploded a bit into
> more top level repos, e.g. splitting support/adt out to their own thing.
> It is also worth considering splitting the MC layer out to its own thing as
> well, LLVM IR and the mid-level optimizer into its own thing, and CodeGen
> and the targets into its own thing.
>
> The major constraint we need is that we want the dependences between
> top-level subprojects to form a strong DAG that holds between the
> subprojects now and is defensible into the future, and we don’t want minor
> evolution of the codebase to cause libraries to have to be moved around.
> The benefits of splitting it up are easier enforcement of layering,
> encouraging LLVM developers to work across subprojects a bit more, and
> making it easier for subprojects to depend on slices of “the big llvm
> directory”.
>
> -Chris