[LLVMdev] IR extension proposal: bitset constants
Kostya Serebryany
kcc at google.com
Wed Jan 28 09:41:14 PST 2015
I would start from using module-level metadata.
An IR extension might be a good idea once we show that
- the proposed indirect call protection mechanism is efficient and
useful, and
- there are other use cases for the IR extension.
--kcc
On Wed, Jan 28, 2015 at 2:57 AM, Sean Silva <chisophugis at gmail.com> wrote:
> Is there any way to accomplish this that doesn't require changes in every
> part of the toolchain?
>
> What you describe here sounds really efficient, but for a change this
> pervasive throughout the toolchain, I would at least like to see an attempt
> done that does not require changing the toolchain. Then we can circle back
> to changing the toolchain if that proves inadequate and with concrete
> experience informing what really is going to require a toolchain change
> (Maybe you've already done this? Please share.).
>
> -- Sean Silva
>
>
> On Tue, Jan 27, 2015 at 9:07 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
>
>> Hi all,
>>
>> I would like to propose a mechanism that allows IR modules to
>> co-operatively
>> build a pointer set corresponding to addresses within a given set of
>> globals. The specific use case I have in mind is to provide a mechanism
>> for a C++ program to efficiently verify (at each call site) that a vtable
>> pointer is in the set of valid vtable pointers for the class or its
>> derived
>> classes. One way of doing this is for a toolchain component to build, for
>> each
>> class, a bit set that maps to the memory region allocated for the vtables,
>> such that each 1 bit in the bit set maps to a valid vtable for that class,
>> and lay out the vtables next to each other, to minimize the total size of
>> the bit sets. Call sites would perform a range check followed by a load of
>> the appropriate bit from the bit set.
>>
>> To give a concrete example, suppose we have the following classes:
>>
>> struct A { virtual void f(); };
>> struct B : A { virtual void f(), g(); };
>> struct C : A { virtual void f(), h(); };
>>
>> with the following vtables:
>>
>> _ZTV1A = { &A::rtti, &A::f };
>> _ZTV1B = { &B::rtti, &B::f, &B::g };
>> _ZTV1C = { &C::rtti, &C::f, &C::h };
>>
>> The set of valid vtables for static class A is {&_ZTV1A[1], &_ZTV1B[1],
>> &_ZTV1C[1]}, for B is {&_ZTV1B[1]} and for C is {&_ZTV1C[1]}. The
>> toolchain
>> would lay out _ZTV1A, _ZTV1B and _ZTV1C consecutively, and generate bit
>> sets
>> like this:
>>
>> A = {0,1,0,1,0,0,1,0}
>> B = {0,0,0,1,0,0,0,0}
>> C = {0,0,0,0,0,0,1,0}
>>
>> A call site (say the static type of the call is A) will check whether the
>> object's vtable falls in the range [_ZTV1A, _ZTV1C], and is sizeof(void
>> *)-aligned. It would then load the (vtable-_ZTV1A)/sizeof(void *)'th entry
>> in A's bitset, which will be 1 if the vtable is valid for A.
>>
>> We are now faced with a number of implementation questions: how do we
>> represent
>> these (potentially partial) bit sets in LLVM, and how do we arrange to lay
>> out the globals and build the complete bit sets? Remember that classes B
>> and C may be in different translation units, so we do not necessarily have
>> a complete picture of A's bit set at A's compile time.
>>
>> What I propose is that the parts of the bit sets be represented as arrays
>> of pointers, which will be resolved through some mechanism (either at
>> LTO time, or at regular (non-LTO) link time) to bit sets corresponding to
>> those addresses. This mechanism would also be responsible for laying out
>> the referenced globals (i.e. the vtables) as close together as possible.
>> We
>> introduce a new flag on globals, the so-called "bitset" flag, which causes
>> the global to be transformed using this mechanism. Such a global cannot be
>> accessed in the normal way. It may only be accessed using an intrinsic
>> that
>> tests whether a given pointer is in the set.
>>
>> To return to our concrete example, a translation unit defining the vtable
>> for A may also define the following global:
>>
>> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1A, 0, 1)]
>>
>> A translation unit defining B would define:
>>
>> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]
>> @_ZBS1B = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]
>>
>> And C:
>>
>> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]
>> @_ZBS1C = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]
>>
>> A later mechanism would combine the bitset globals, lay out the referenced
>> globals and build the bit sets. In the LTO case, the IR linker can already
>> do the combination step if the globals are given appending linkage. A
>> later compiler pass (which we will call the "bitset lowering pass") would
>> lay out
>> the globals (by combining them into a single large global, and RAUWing the
>> original globals with aliases that gep into the large global) and build
>> the
>> bit sets. In the non-LTO case, we could also invent an
>> object-file-specific
>> mechanism for the backend to tell the linker to combine and build the bit
>> sets,
>> for example by placing the bitset globals in a specifically named section,
>> and place the referenced globals in individual sections so that the linker
>> can rearrange them.
>>
>> As previously mentioned, an intrinsic would be used to test entries in the
>> bit set at call sites. For example, the relevant IR for the following C++
>> code:
>>
>> A *some_object = ...;
>> some_object->f();
>>
>> might look like this:
>>
>> %vtable = load i8** %some_object
>> %p = call i1 @llvm.bitset.test(i8* %vtable, i8* @_ZBS1A) readnone
>> br i1 %p, label %continue, label %trap
>>
>> In the LTO case, such intrinsic calls would be lowered by the bitset
>> lowering pass
>> into the appropriate range/bit set check code, which might look like this:
>> (pseudocode for 64-bit platform)
>>
>> if %vtable & 7 != 0 || %vtable < global_start_addr(_ZBS1A) || %vtable >
>> global_end_addr(_ZBS1A) {
>> %p = 0
>> } else {
>> %p = bitset(_ZBS1A)[(%vtable - global_start_addr(_ZBS1A)) >> 3]
>> }
>>
>> In the non-LTO case, we would generate similar code, but with appropriate
>> relocations so that the required constants (global start/end address,
>> bitset
>> address, mask, shift amount) would receive appropriate linker-determined
>> values.
>>
>> The first step I propose to make in this direction is a patch that adds
>> support for the bitset flag on globals, and introduces the bitset
>> lowering pass. A
>> later step would involve building backend support, and linker support in
>> lld.
>>
>> Comments appreciated.
>>
>> Thanks,
>> --
>> Peter
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150128/2ed637b6/attachment.html>
More information about the llvm-dev
mailing list