[LLVMdev] IR extension proposal: bitset constants

Peter Collingbourne peter at pcc.me.uk
Wed Jan 28 10:57:43 PST 2015


Introducing a new module-level metadata format is just another kind of IR
extension, with the disadvantage of being untyped. We should be cutting down
on these types of things, not introducing a new one [1].

I feel that a global is the right design for this, as it represents data
that is conceptually in the "object file".

Peter

[1] https://groups.google.com/forum/#!topic/llvm-dev/u4hZUWAXQuY

On Wed, Jan 28, 2015 at 09:41:14AM -0800, Kostya Serebryany wrote:
> I would start from using module-level metadata.
> An IR extension might be a good idea once we show that
>   - the proposed indirect call protection mechanism is efficient and
> useful, and
>   - there are other use cases for the IR extension.
> 
> --kcc
> 
> 
> On Wed, Jan 28, 2015 at 2:57 AM, Sean Silva <chisophugis at gmail.com> wrote:
> 
> > Is there any way to accomplish this that doesn't require changes in every
> > part of the toolchain?
> >
> > What you describe here sounds really efficient, but for a change this
> > pervasive throughout the toolchain, I would at least like to see an attempt
> > done that does not require changing the toolchain. Then we can circle back
> > to changing the toolchain if that proves inadequate and with concrete
> > experience informing what really is going to require a toolchain change
> > (Maybe you've already done this? Please share.).
> >
> > -- Sean Silva
> >
> >
> > On Tue, Jan 27, 2015 at 9:07 PM, Peter Collingbourne <peter at pcc.me.uk>
> > wrote:
> >
> >> Hi all,
> >>
> >> I would like to propose a mechanism that allows IR modules to
> >> co-operatively
> >> build a pointer set corresponding to addresses within a given set of
> >> globals. The specific use case I have in mind is to provide a mechanism
> >> for a C++ program to efficiently verify (at each call site) that a vtable
> >> pointer is in the set of valid vtable pointers for the class or its
> >> derived
> >> classes. One way of doing this is for a toolchain component to build, for
> >> each
> >> class, a bit set that maps to the memory region allocated for the vtables,
> >> such that each 1 bit in the bit set maps to a valid vtable for that class,
> >> and lay out the vtables next to each other, to minimize the total size of
> >> the bit sets. Call sites would perform a range check followed by a load of
> >> the appropriate bit from the bit set.
> >>
> >> To give a concrete example, suppose we have the following classes:
> >>
> >> struct A { virtual void f(); };
> >> struct B : A { virtual void f(), g(); };
> >> struct C : A { virtual void f(), h(); };
> >>
> >> with the following vtables:
> >>
> >> _ZTV1A = { &A::rtti, &A::f };
> >> _ZTV1B = { &B::rtti, &B::f, &B::g };
> >> _ZTV1C = { &C::rtti, &C::f, &C::h };
> >>
> >> The set of valid vtables for static class A is {&_ZTV1A[1], &_ZTV1B[1],
> >> &_ZTV1C[1]}, for B is {&_ZTV1B[1]} and for C is {&_ZTV1C[1]}. The
> >> toolchain
> >> would lay out _ZTV1A, _ZTV1B and _ZTV1C consecutively, and generate bit
> >> sets
> >> like this:
> >>
> >> A = {0,1,0,1,0,0,1,0}
> >> B = {0,0,0,1,0,0,0,0}
> >> C = {0,0,0,0,0,0,1,0}
> >>
> >> A call site (say the static type of the call is A) will check whether the
> >> object's vtable falls in the range [_ZTV1A, _ZTV1C], and is sizeof(void
> >> *)-aligned. It would then load the (vtable-_ZTV1A)/sizeof(void *)'th entry
> >> in A's bitset, which will be 1 if the vtable is valid for A.
> >>
> >> We are now faced with a number of implementation questions: how do we
> >> represent
> >> these (potentially partial) bit sets in LLVM, and how do we arrange to lay
> >> out the globals and build the complete bit sets? Remember that classes B
> >> and C may be in different translation units, so we do not necessarily have
> >> a complete picture of A's bit set at A's compile time.
> >>
> >> What I propose is that the parts of the bit sets be represented as arrays
> >> of pointers, which will be resolved through some mechanism (either at
> >> LTO time, or at regular (non-LTO) link time) to bit sets corresponding to
> >> those addresses. This mechanism would also be responsible for laying out
> >> the referenced globals (i.e. the vtables) as close together as possible.
> >> We
> >> introduce a new flag on globals, the so-called "bitset" flag, which causes
> >> the global to be transformed using this mechanism. Such a global cannot be
> >> accessed in the normal way. It may only be accessed using an intrinsic
> >> that
> >> tests whether a given pointer is in the set.
> >>
> >> To return to our concrete example, a translation unit defining the vtable
> >> for A may also define the following global:
> >>
> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1A, 0, 1)]
> >>
> >> A translation unit defining B would define:
> >>
> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]
> >> @_ZBS1B = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]
> >>
> >> And C:
> >>
> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]
> >> @_ZBS1C = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]
> >>
> >> A later mechanism would combine the bitset globals, lay out the referenced
> >> globals and build the bit sets. In the LTO case, the IR linker can already
> >> do the combination step if the globals are given appending linkage. A
> >> later compiler pass (which we will call the "bitset lowering pass") would
> >> lay out
> >> the globals (by combining them into a single large global, and RAUWing the
> >> original globals with aliases that gep into the large global) and build
> >> the
> >> bit sets. In the non-LTO case, we could also invent an
> >> object-file-specific
> >> mechanism for the backend to tell the linker to combine and build the bit
> >> sets,
> >> for example by placing the bitset globals in a specifically named section,
> >> and place the referenced globals in individual sections so that the linker
> >> can rearrange them.
> >>
> >> As previously mentioned, an intrinsic would be used to test entries in the
> >> bit set at call sites. For example, the relevant IR for the following C++
> >> code:
> >>
> >> A *some_object = ...;
> >> some_object->f();
> >>
> >> might look like this:
> >>
> >> %vtable = load i8** %some_object
> >> %p = call i1 @llvm.bitset.test(i8* %vtable, i8* @_ZBS1A) readnone
> >> br i1 %p, label %continue, label %trap
> >>
> >> In the LTO case, such intrinsic calls would be lowered by the bitset
> >> lowering pass
> >> into the appropriate range/bit set check code, which might look like this:
> >> (pseudocode for 64-bit platform)
> >>
> >> if %vtable & 7 != 0 || %vtable < global_start_addr(_ZBS1A) || %vtable >
> >> global_end_addr(_ZBS1A) {
> >>   %p = 0
> >> } else {
> >>   %p = bitset(_ZBS1A)[(%vtable - global_start_addr(_ZBS1A)) >> 3]
> >> }
> >>
> >> In the non-LTO case, we would generate similar code, but with appropriate
> >> relocations so that the required constants (global start/end address,
> >> bitset
> >> address, mask, shift amount) would receive appropriate linker-determined
> >> values.
> >>
> >> The first step I propose to make in this direction is a patch that adds
> >> support for the bitset flag on globals, and introduces the bitset
> >> lowering pass. A
> >> later step would involve building backend support, and linker support in
> >> lld.
> >>
> >> Comments appreciated.
> >>
> >> Thanks,
> >> --
> >> Peter
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
> >

-- 
Peter



More information about the llvm-dev mailing list