<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 28, 2015 at 10:57 AM, Peter Collingbourne <span dir="ltr"><<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Introducing a new module-level metadata format is just another kind of IR<br>

extension, with the disadvantage of being untyped. We should be cutting down<br>

on these types of things, not introducing a new one [1].<br></blockquote><div><br></div><div>Hard to argue with that. </div><div>module-level metadata is a simple (though not the only) way to make a prototype and prove that the approach works. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I feel that a global is the right design for this, as it represents data<br>

that is conceptually in the "object file".<br>

<br>

Peter<br>

<br>

[1] <a href="https://groups.google.com/forum/#!topic/llvm-dev/u4hZUWAXQuY" target="_blank">https://groups.google.com/forum/#!topic/llvm-dev/u4hZUWAXQuY</a><br>

<div class="HOEnZb"><div class="h5"><br>

On Wed, Jan 28, 2015 at 09:41:14AM -0800, Kostya Serebryany wrote:<br>

> I would start from using module-level metadata.<br>

> An IR extension might be a good idea once we show that<br>

>   - the proposed indirect call protection mechanism is efficient and<br>

> useful, and<br>

>   - there are other use cases for the IR extension.<br>

><br>

> --kcc<br>

><br>

><br>

> On Wed, Jan 28, 2015 at 2:57 AM, Sean Silva <<a href="mailto:chisophugis@gmail.com">chisophugis@gmail.com</a>> wrote:<br>

><br>

> > Is there any way to accomplish this that doesn't require changes in every<br>

> > part of the toolchain?<br>

> ><br>

> > What you describe here sounds really efficient, but for a change this<br>

> > pervasive throughout the toolchain, I would at least like to see an attempt<br>

> > done that does not require changing the toolchain. Then we can circle back<br>

> > to changing the toolchain if that proves inadequate and with concrete<br>

> > experience informing what really is going to require a toolchain change<br>

> > (Maybe you've already done this? Please share.).<br>

> ><br>

> > -- Sean Silva<br>

> ><br>

> ><br>

> > On Tue, Jan 27, 2015 at 9:07 PM, Peter Collingbourne <<a href="mailto:peter@pcc.me.uk">peter@pcc.me.uk</a>><br>

> > wrote:<br>

> ><br>

> >> Hi all,<br>

> >><br>

> >> I would like to propose a mechanism that allows IR modules to<br>

> >> co-operatively<br>

> >> build a pointer set corresponding to addresses within a given set of<br>

> >> globals. The specific use case I have in mind is to provide a mechanism<br>

> >> for a C++ program to efficiently verify (at each call site) that a vtable<br>

> >> pointer is in the set of valid vtable pointers for the class or its<br>

> >> derived<br>

> >> classes. One way of doing this is for a toolchain component to build, for<br>

> >> each<br>

> >> class, a bit set that maps to the memory region allocated for the vtables,<br>

> >> such that each 1 bit in the bit set maps to a valid vtable for that class,<br>

> >> and lay out the vtables next to each other, to minimize the total size of<br>

> >> the bit sets. Call sites would perform a range check followed by a load of<br>

> >> the appropriate bit from the bit set.<br>

> >><br>

> >> To give a concrete example, suppose we have the following classes:<br>

> >><br>

> >> struct A { virtual void f(); };<br>

> >> struct B : A { virtual void f(), g(); };<br>

> >> struct C : A { virtual void f(), h(); };<br>

> >><br>

> >> with the following vtables:<br>

> >><br>

> >> _ZTV1A = { &A::rtti, &A::f };<br>

> >> _ZTV1B = { &B::rtti, &B::f, &B::g };<br>

> >> _ZTV1C = { &C::rtti, &C::f, &C::h };<br>

> >><br>

> >> The set of valid vtables for static class A is {&_ZTV1A[1], &_ZTV1B[1],<br>

> >> &_ZTV1C[1]}, for B is {&_ZTV1B[1]} and for C is {&_ZTV1C[1]}. The<br>

> >> toolchain<br>

> >> would lay out _ZTV1A, _ZTV1B and _ZTV1C consecutively, and generate bit<br>

> >> sets<br>

> >> like this:<br>

> >><br>

> >> A = {0,1,0,1,0,0,1,0}<br>

> >> B = {0,0,0,1,0,0,0,0}<br>

> >> C = {0,0,0,0,0,0,1,0}<br>

> >><br>

> >> A call site (say the static type of the call is A) will check whether the<br>

> >> object's vtable falls in the range [_ZTV1A, _ZTV1C], and is sizeof(void<br>

> >> *)-aligned. It would then load the (vtable-_ZTV1A)/sizeof(void *)'th entry<br>

> >> in A's bitset, which will be 1 if the vtable is valid for A.<br>

> >><br>

> >> We are now faced with a number of implementation questions: how do we<br>

> >> represent<br>

> >> these (potentially partial) bit sets in LLVM, and how do we arrange to lay<br>

> >> out the globals and build the complete bit sets? Remember that classes B<br>

> >> and C may be in different translation units, so we do not necessarily have<br>

> >> a complete picture of A's bit set at A's compile time.<br>

> >><br>

> >> What I propose is that the parts of the bit sets be represented as arrays<br>

> >> of pointers, which will be resolved through some mechanism (either at<br>

> >> LTO time, or at regular (non-LTO) link time) to bit sets corresponding to<br>

> >> those addresses. This mechanism would also be responsible for laying out<br>

> >> the referenced globals (i.e. the vtables) as close together as possible.<br>

> >> We<br>

> >> introduce a new flag on globals, the so-called "bitset" flag, which causes<br>

> >> the global to be transformed using this mechanism. Such a global cannot be<br>

> >> accessed in the normal way. It may only be accessed using an intrinsic<br>

> >> that<br>

> >> tests whether a given pointer is in the set.<br>

> >><br>

> >> To return to our concrete example, a translation unit defining the vtable<br>

> >> for A may also define the following global:<br>

> >><br>

> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1A, 0, 1)]<br>

> >><br>

> >> A translation unit defining B would define:<br>

> >><br>

> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]<br>

> >> @_ZBS1B = appending bitset constant [1 x i8*] [gep(@_ZTV1B, 0, 1)]<br>

> >><br>

> >> And C:<br>

> >><br>

> >> @_ZBS1A = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]<br>

> >> @_ZBS1C = appending bitset constant [1 x i8*] [gep(@_ZTV1C, 0, 1)]<br>

> >><br>

> >> A later mechanism would combine the bitset globals, lay out the referenced<br>

> >> globals and build the bit sets. In the LTO case, the IR linker can already<br>

> >> do the combination step if the globals are given appending linkage. A<br>

> >> later compiler pass (which we will call the "bitset lowering pass") would<br>

> >> lay out<br>

> >> the globals (by combining them into a single large global, and RAUWing the<br>

> >> original globals with aliases that gep into the large global) and build<br>

> >> the<br>

> >> bit sets. In the non-LTO case, we could also invent an<br>

> >> object-file-specific<br>

> >> mechanism for the backend to tell the linker to combine and build the bit<br>

> >> sets,<br>

> >> for example by placing the bitset globals in a specifically named section,<br>

> >> and place the referenced globals in individual sections so that the linker<br>

> >> can rearrange them.<br>

> >><br>

> >> As previously mentioned, an intrinsic would be used to test entries in the<br>

> >> bit set at call sites. For example, the relevant IR for the following C++<br>

> >> code:<br>

> >><br>

> >> A *some_object = ...;<br>

> >> some_object->f();<br>

> >><br>

> >> might look like this:<br>

> >><br>

> >> %vtable = load i8** %some_object<br>

> >> %p = call i1 @llvm.bitset.test(i8* %vtable, i8* @_ZBS1A) readnone<br>

> >> br i1 %p, label %continue, label %trap<br>

> >><br>

> >> In the LTO case, such intrinsic calls would be lowered by the bitset<br>

> >> lowering pass<br>

> >> into the appropriate range/bit set check code, which might look like this:<br>

> >> (pseudocode for 64-bit platform)<br>

> >><br>

> >> if %vtable & 7 != 0 || %vtable < global_start_addr(_ZBS1A) || %vtable ><br>

> >> global_end_addr(_ZBS1A) {<br>

> >>   %p = 0<br>

> >> } else {<br>

> >>   %p = bitset(_ZBS1A)[(%vtable - global_start_addr(_ZBS1A)) >> 3]<br>

> >> }<br>

> >><br>

> >> In the non-LTO case, we would generate similar code, but with appropriate<br>

> >> relocations so that the required constants (global start/end address,<br>

> >> bitset<br>

> >> address, mask, shift amount) would receive appropriate linker-determined<br>

> >> values.<br>

> >><br>

> >> The first step I propose to make in this direction is a patch that adds<br>

> >> support for the bitset flag on globals, and introduces the bitset<br>

> >> lowering pass. A<br>

> >> later step would involve building backend support, and linker support in<br>

> >> lld.<br>

> >><br>

> >> Comments appreciated.<br>

> >><br>

> >> Thanks,<br>

> >> --<br>

> >> Peter<br>

> >> _______________________________________________<br>

> >> LLVM Developers mailing list<br>

> >> <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

> >> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

> >><br>

> ><br>

> ><br>

> > _______________________________________________<br>

> > LLVM Developers mailing list<br>

> > <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

> ><br>

> ><br>

<br>

</div></div><span class="HOEnZb"><font color="#888888">--<br>

Peter<br>

</font></span></blockquote></div><br></div></div>