[llvm] r230394 - LowerBitSets: Introduce global layout builder.
Peter Collingbourne
peter at pcc.me.uk
Tue Feb 24 19:38:40 PST 2015
r230458
On Tue, Feb 24, 2015 at 03:32:32PM -0800, Kostya Serebryany wrote:
> Cool. I think this deserves to be mentioned in
> http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
>
> On Tue, Feb 24, 2015 at 3:17 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
>
> > Author: pcc
> > Date: Tue Feb 24 17:17:02 2015
> > New Revision: 230394
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=230394&view=rev
> > Log:
> > LowerBitSets: Introduce global layout builder.
> >
> > The builder is based on a layout algorithm that tries to keep members of
> > small bit sets together. The new layout compresses Chromium's bit sets to
> > around 15% of their original size.
> >
> > Differential Revision: http://reviews.llvm.org/D7796
> >
> > Added:
> > llvm/trunk/test/Transforms/LowerBitSets/layout.ll
> > Modified:
> > llvm/trunk/docs/BitSets.rst
> > llvm/trunk/include/llvm/Transforms/IPO/LowerBitSets.h
> > llvm/trunk/lib/Transforms/IPO/LowerBitSets.cpp
> > llvm/trunk/unittests/Transforms/IPO/LowerBitSets.cpp
> >
> > Modified: llvm/trunk/docs/BitSets.rst
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/BitSets.rst?rev=230394&r1=230393&r2=230394&view=diff
> >
> > ==============================================================================
> > --- llvm/trunk/docs/BitSets.rst (original)
> > +++ llvm/trunk/docs/BitSets.rst Tue Feb 24 17:17:02 2015
> > @@ -17,8 +17,10 @@ global variable.
> > This will cause a link-time optimization pass to generate bitsets from the
> > memory addresses referenced from the elements of the bitset metadata. The
> > pass
> > will lay out the referenced globals consecutively, so their definitions
> > must
> > -be available at LTO time. An intrinsic, :ref:`llvm.bitset.test
> > <bitset.test>`,
> > -generates code to test whether a given pointer is a member of a bitset.
> > +be available at LTO time. The `GlobalLayoutBuilder`_ class is responsible
> > for
> > +laying out the globals efficiently to minimize the sizes of the underlying
> > +bitsets. An intrinsic, :ref:`llvm.bitset.test <bitset.test>`, generates
> > code
> > +to test whether a given pointer is a member of a bitset.
> >
> > :Example:
> >
> > @@ -64,3 +66,5 @@ generates code to test whether a given p
> > %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32
> > 1)) ; returns 1
> > ret void
> > }
> > +
> > +.. _GlobalLayoutBuilder:
> > http://llvm.org/klaus/llvm/blob/master/include/llvm/Transforms/IPO/LowerBitSets.h
> >
> > Modified: llvm/trunk/include/llvm/Transforms/IPO/LowerBitSets.h
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/LowerBitSets.h?rev=230394&r1=230393&r2=230394&view=diff
> >
> > ==============================================================================
> > --- llvm/trunk/include/llvm/Transforms/IPO/LowerBitSets.h (original)
> > +++ llvm/trunk/include/llvm/Transforms/IPO/LowerBitSets.h Tue Feb 24
> > 17:17:02 2015
> > @@ -20,6 +20,7 @@
> >
> > #include <stdint.h>
> > #include <limits>
> > +#include <set>
> > #include <vector>
> >
> > namespace llvm {
> > @@ -73,6 +74,69 @@ struct BitSetBuilder {
> > BitSetInfo build();
> > };
> >
> > +/// This class implements a layout algorithm for globals referenced by
> > bit sets
> > +/// that tries to keep members of small bit sets together. This can
> > +/// significantly reduce bit set sizes in many cases.
> > +///
> > +/// It works by assembling fragments of layout from sets of referenced
> > globals.
> > +/// Each set of referenced globals causes the algorithm to create a new
> > +/// fragment, which is assembled by appending each referenced global in
> > the set
> > +/// into the fragment. If a referenced global has already been referenced
> > by an
> > +/// fragment created earlier, we instead delete that fragment and append
> > its
> > +/// contents into the fragment we are assembling.
> > +///
> > +/// By starting with the smallest fragments, we minimize the size of the
> > +/// fragments that are copied into larger fragments. This is most
> > intuitively
> > +/// thought about when considering the case where the globals are virtual
> > tables
> > +/// and the bit sets represent their derived classes: in a single
> > inheritance
> > +/// hierarchy, the optimum layout would involve a depth-first search of
> > the
> > +/// class hierarchy (and in fact the computed layout ends up looking a
> > lot like
> > +/// a DFS), but a naive DFS would not work well in the presence of
> > multiple
> > +/// inheritance. This aspect of the algorithm ends up fitting smaller
> > +/// hierarchies inside larger ones where that would be beneficial.
> > +///
> > +/// For example, consider this class hierarchy:
> > +///
> > +/// A B
> > +/// \ / | \
> > +/// C D E
> > +///
> > +/// We have five bit sets: bsA (A, C), bsB (B, C, D, E), bsC (C), bsD (D)
> > and
> > +/// bsE (E). If we laid out our objects by DFS traversing B followed by
> > A, our
> > +/// layout would be {B, C, D, E, A}. This is optimal for bsB as it needs
> > to
> > +/// cover the only 4 objects in its hierarchy, but not for bsA as it
> > needs to
> > +/// cover 5 objects, i.e. the entire layout. Our algorithm proceeds as
> > follows:
> > +///
> > +/// Add bsC, fragments {{C}}
> > +/// Add bsD, fragments {{C}, {D}}
> > +/// Add bsE, fragments {{C}, {D}, {E}}
> > +/// Add bsA, fragments {{A, C}, {D}, {E}}
> > +/// Add bsB, fragments {{B, A, C, D, E}}
> > +///
> > +/// This layout is optimal for bsA, as it now only needs to cover two
> > (i.e. 3
> > +/// fewer) objects, at the cost of bsB needing to cover 1 more object.
> > +///
> > +/// The bit set lowering pass assigns an object index to each object that
> > needs
> > +/// to be laid out, and calls addFragment for each bit set passing the
> > object
> > +/// indices of its referenced globals. It then assembles a layout from the
> > +/// computed layout in the Fragments field.
> > +struct GlobalLayoutBuilder {
> > + /// The computed layout. Each element of this vector contains a
> > fragment of
> > + /// layout (which may be empty) consisting of object indices.
> > + std::vector<std::vector<uint64_t>> Fragments;
> > +
> > + /// Mapping from object index to fragment index.
> > + std::vector<uint64_t> FragmentMap;
> > +
> > + GlobalLayoutBuilder(uint64_t NumObjects)
> > + : Fragments(1), FragmentMap(NumObjects) {}
> > +
> > + /// Add \param F to the layout while trying to keep its indices
> > contiguous.
> > + /// If a previously seen fragment uses any of \param F's indices, that
> > + /// fragment will be laid out inside \param F.
> > + void addFragment(const std::set<uint64_t> &F);
> > +};
> > +
> > } // namespace llvm
> >
> > #endif
> >
> > Modified: llvm/trunk/lib/Transforms/IPO/LowerBitSets.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/LowerBitSets.cpp?rev=230394&r1=230393&r2=230394&view=diff
> >
> > ==============================================================================
> > --- llvm/trunk/lib/Transforms/IPO/LowerBitSets.cpp (original)
> > +++ llvm/trunk/lib/Transforms/IPO/LowerBitSets.cpp Tue Feb 24 17:17:02 2015
> > @@ -118,6 +118,35 @@ BitSetInfo BitSetBuilder::build() {
> > return BSI;
> > }
> >
> > +void GlobalLayoutBuilder::addFragment(const std::set<uint64_t> &F) {
> > + // Create a new fragment to hold the layout for F.
> > + Fragments.emplace_back();
> > + std::vector<uint64_t> &Fragment = Fragments.back();
> > + uint64_t FragmentIndex = Fragments.size() - 1;
> > +
> > + for (auto ObjIndex : F) {
> > + uint64_t OldFragmentIndex = FragmentMap[ObjIndex];
> > + if (OldFragmentIndex == 0) {
> > + // We haven't seen this object index before, so just add it to the
> > current
> > + // fragment.
> > + Fragment.push_back(ObjIndex);
> > + } else {
> > + // This index belongs to an existing fragment. Copy the elements of
> > the
> > + // old fragment into this one and clear the old fragment. We don't
> > update
> > + // the fragment map just yet, this ensures that any further
> > references to
> > + // indices from the old fragment in this fragment do not insert any
> > more
> > + // indices.
> > + std::vector<uint64_t> &OldFragment = Fragments[OldFragmentIndex];
> > + Fragment.insert(Fragment.end(), OldFragment.begin(),
> > OldFragment.end());
> > + OldFragment.clear();
> > + }
> > + }
> > +
> > + // Update the fragment map to point our object indices to this fragment.
> > + for (uint64_t ObjIndex : Fragment)
> > + FragmentMap[ObjIndex] = FragmentIndex;
> > +}
> > +
> > namespace {
> >
> > struct LowerBitSets : public ModulePass {
> > @@ -485,27 +514,66 @@ bool LowerBitSets::buildBitSets(Module &
> > // Build the list of bitsets and referenced globals in this disjoint
> > set.
> > std::vector<MDString *> BitSets;
> > std::vector<GlobalVariable *> Globals;
> > + llvm::DenseMap<MDString *, uint64_t> BitSetIndices;
> > + llvm::DenseMap<GlobalVariable *, uint64_t> GlobalIndices;
> > for (GlobalClassesTy::member_iterator MI =
> > GlobalClasses.member_begin(I);
> > MI != GlobalClasses.member_end(); ++MI) {
> > - if ((*MI).is<MDString *>())
> > + if ((*MI).is<MDString *>()) {
> > + BitSetIndices[MI->get<MDString *>()] = BitSets.size();
> > BitSets.push_back(MI->get<MDString *>());
> > - else
> > + } else {
> > + GlobalIndices[MI->get<GlobalVariable *>()] = Globals.size();
> > Globals.push_back(MI->get<GlobalVariable *>());
> > + }
> > + }
> > +
> > + // For each bitset, build a set of indices that refer to globals
> > referenced
> > + // by the bitset.
> > + std::vector<std::set<uint64_t>> BitSetMembers(BitSets.size());
> > + if (BitSetNM) {
> > + for (MDNode *Op : BitSetNM->operands()) {
> > + // Op = { bitset name, global, offset }
> > + if (!Op->getOperand(1))
> > + continue;
> > + auto I = BitSetIndices.find(cast<MDString>(Op->getOperand(0)));
> > + if (I == BitSetIndices.end())
> > + continue;
> > +
> > + auto OpGlobal = cast<GlobalVariable>(
> > + cast<ConstantAsMetadata>(Op->getOperand(1))->getValue());
> > + BitSetMembers[I->second].insert(GlobalIndices[OpGlobal]);
> > + }
> > }
> >
> > - // Order bitsets and globals by name for determinism. TODO: We may
> > later
> > - // want to use a more sophisticated ordering that lays out globals so
> > as to
> > - // minimize the sizes of the bitsets.
> > + // Order the sets of indices by size. The GlobalLayoutBuilder works
> > best
> > + // when given small index sets first.
> > + std::stable_sort(
> > + BitSetMembers.begin(), BitSetMembers.end(),
> > + [](const std::set<uint64_t> &O1, const std::set<uint64_t> &O2) {
> > + return O1.size() < O2.size();
> > + });
> > +
> > + // Create a GlobalLayoutBuilder and provide it with index sets as
> > layout
> > + // fragments. The GlobalLayoutBuilder tries to lay out members of
> > fragments
> > + // as close together as possible.
> > + GlobalLayoutBuilder GLB(Globals.size());
> > + for (auto &&MemSet : BitSetMembers)
> > + GLB.addFragment(MemSet);
> > +
> > + // Build a vector of globals with the computed layout.
> > + std::vector<GlobalVariable *> OrderedGlobals(Globals.size());
> > + auto OGI = OrderedGlobals.begin();
> > + for (auto &&F : GLB.Fragments)
> > + for (auto &&Offset : F)
> > + *OGI++ = Globals[Offset];
> > +
> > + // Order bitsets by name for determinism.
> > std::sort(BitSets.begin(), BitSets.end(), [](MDString *S1, MDString
> > *S2) {
> > return S1->getString() < S2->getString();
> > });
> > - std::sort(Globals.begin(), Globals.end(),
> > - [](GlobalVariable *GV1, GlobalVariable *GV2) {
> > - return GV1->getName() < GV2->getName();
> > - });
> >
> > // Build the bitsets from this disjoint set.
> > - buildBitSetsFromGlobals(M, BitSets, Globals);
> > + buildBitSetsFromGlobals(M, BitSets, OrderedGlobals);
> > }
> >
> > return true;
> >
> > Added: llvm/trunk/test/Transforms/LowerBitSets/layout.ll
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LowerBitSets/layout.ll?rev=230394&view=auto
> >
> > ==============================================================================
> > --- llvm/trunk/test/Transforms/LowerBitSets/layout.ll (added)
> > +++ llvm/trunk/test/Transforms/LowerBitSets/layout.ll Tue Feb 24 17:17:02
> > 2015
> > @@ -0,0 +1,35 @@
> > +; RUN: opt -S -lowerbitsets < %s | FileCheck %s
> > +
> > +target datalayout = "e-p:32:32"
> > +
> > +; Tests that this set of globals is laid out according to our layout
> > algorithm
> > +; (see GlobalLayoutBuilder in include/llvm/Transforms/IPO/LowerBitSets.h).
> > +; The chosen layout in this case is a, e, b, d, c.
> > +
> > +; CHECK: private constant { i32, i32, i32, i32, i32 } { i32 1, i32 5, i32
> > 2, i32 4, i32 3 }
> > + at a = constant i32 1
> > + at b = constant i32 2
> > + at c = constant i32 3
> > + at d = constant i32 4
> > + at e = constant i32 5
> > +
> > +!0 = !{!"bitset1", i32* @a, i32 0}
> > +!1 = !{!"bitset1", i32* @b, i32 0}
> > +!2 = !{!"bitset1", i32* @c, i32 0}
> > +
> > +!3 = !{!"bitset2", i32* @b, i32 0}
> > +!4 = !{!"bitset2", i32* @d, i32 0}
> > +
> > +!5 = !{!"bitset3", i32* @a, i32 0}
> > +!6 = !{!"bitset3", i32* @e, i32 0}
> > +
> > +!llvm.bitsets = !{ !0, !1, !2, !3, !4, !5, !6 }
> > +
> > +declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone
> > +
> > +define void @foo() {
> > + %x = call i1 @llvm.bitset.test(i8* undef, metadata !"bitset1")
> > + %y = call i1 @llvm.bitset.test(i8* undef, metadata !"bitset2")
> > + %z = call i1 @llvm.bitset.test(i8* undef, metadata !"bitset3")
> > + ret void
> > +}
> >
> > Modified: llvm/trunk/unittests/Transforms/IPO/LowerBitSets.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Transforms/IPO/LowerBitSets.cpp?rev=230394&r1=230393&r2=230394&view=diff
> >
> > ==============================================================================
> > --- llvm/trunk/unittests/Transforms/IPO/LowerBitSets.cpp (original)
> > +++ llvm/trunk/unittests/Transforms/IPO/LowerBitSets.cpp Tue Feb 24
> > 17:17:02 2015
> > @@ -62,3 +62,30 @@ TEST(LowerBitSets, BitSetBuilder) {
> > }
> > }
> > }
> > +
> > +TEST(LowerBitSets, GlobalLayoutBuilder) {
> > + struct {
> > + uint64_t NumObjects;
> > + std::vector<std::set<uint64_t>> Fragments;
> > + std::vector<uint64_t> WantLayout;
> > + } GLBTests[] = {
> > + {0, {}, {}},
> > + {4, {{0, 1}, {2, 3}}, {0, 1, 2, 3}},
> > + {3, {{0, 1}, {1, 2}}, {0, 1, 2}},
> > + {4, {{0, 1}, {1, 2}, {2, 3}}, {0, 1, 2, 3}},
> > + {4, {{0, 1}, {2, 3}, {1, 2}}, {0, 1, 2, 3}},
> > + {6, {{2, 5}, {0, 1, 2, 3, 4, 5}}, {0, 1, 2, 5, 3, 4}},
> > + };
> > +
> > + for (auto &&T : GLBTests) {
> > + GlobalLayoutBuilder GLB(T.NumObjects);
> > + for (auto &&F : T.Fragments)
> > + GLB.addFragment(F);
> > +
> > + std::vector<uint64_t> ComputedLayout;
> > + for (auto &&F : GLB.Fragments)
> > + ComputedLayout.insert(ComputedLayout.end(), F.begin(), F.end());
> > +
> > + EXPECT_EQ(T.WantLayout, ComputedLayout);
> > + }
> > +}
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
--
Peter
More information about the llvm-commits
mailing list