[LLVMdev] IR extension proposal: bitset constants

Peter Collingbourne peter at pcc.me.uk
Sat Jan 31 14:45:15 PST 2015


On Sat, Jan 31, 2015 at 02:17:22PM -0800, JF Bastien wrote:
> On Sat, Jan 31, 2015 at 2:07 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
> 
> > On Sat, Jan 31, 2015 at 11:35:01AM -0800, JF Bastien wrote:
> > > Trying to summarize all opinions expressed here: Peter is proposing an
> > > initial implementation that would only work with LTO. Folks seem put off
> > > by this implementation affecting IR without having proven itself, and having
> > > shortcomings (as Jim pointed out). Kostya proposed going through metadata
> > > (and Chris kind of did too by mentioning tbaa), but Peter points out that
> > > this will make the implementation trickier.
> > >
> > > It sounds like going through metadata for the LTO-only proof-of-concept
> > > would be preferable, even if more complex. Peter, how much more complex
> > > would that be?
> >
> > I was just thinking about how the metadata-based design for this might work.
> >
> > We could have a named metadata global containing the list of bitset entries.
> > Each entry would be a pair consisting of an identifier for the bitset (in
> > this case, the mangled name of the class) and a pointer into a global
> > (in this case, a valid vtable pointer).
> >
> > For example, this class hierarchy:
> >
> > class A { virtual void f(); };
> > class B : A { virtual void f(); };
> > class C : A { virtual void f(); };
> >
> > would have these bitsets:
> >
> > !llvm.bitsets = !{!0, !1, !2, !3, !4}
> >
> > !0 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1A, i32 0, i32 2)}
> > !1 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1B, i32 0, i32 2)}
> > !2 = !{!"1A", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1C, i32 0, i32 2)}
> > !3 = !{!"1B", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1B, i32 0, i32 2)}
> > !4 = !{!"1C", i8* getelementptr inbounds ([3 x i8*]* @_ZTV1C, i32 0, i32 2)}
> >
> > The LLVM linker can already merge the contents of named metadata globals.
> >
> > The intrinsic for reading from bitsets would have this definition:
> >
> > declare i1 @llvm.bitset.test(i8*, metadata)
> >
> > and would be called like this:
> >
> >   %x = call i1 @llvm.bitset.test(i8* %p, metadata !{!"1A"})
> >
> > The disadvantage of this design is that metadata strings always have global
> > scope, so identically named classes in anonymous namespaces in different
> > translation units would receive the same bitset. (We can't just use the
> > vtable globals as bitset identifiers, because they may be erased by globaldce.
> > There are probably various things we could do to force the globals to stick
> > around, if we wanted to go that way.)
> >
> 
> That's pretty unfortunate. Can you mangle anonymous namespace class names
> further for your purpose? With the module name?

Yes, I forgot that we could do that. Mangling with the source file name
would probably be enough for most purposes.
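
A minimal sketch of that idea, assuming a hypothetical helper that builds the
bitset identifier string. The function name, the "@" separator, and the file
names in the usage note are illustrative assumptions, not part of any existing
LLVM or Clang API:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an LLVM/Clang API): builds the metadata string used
// as a bitset identifier. Classes with external linkage keep their mangled
// name; classes in anonymous namespaces get the source file name appended so
// that same-named classes in different translation units do not collide.
std::string bitsetIdentifier(const std::string &mangledClassName,
                             bool inAnonymousNamespace,
                             const std::string &sourceFileName) {
  assert(!mangledClassName.empty() && "class name must not be empty");
  if (!inAnonymousNamespace)
    return mangledClassName;
  // The "@" separator is an arbitrary choice for this sketch.
  return mangledClassName + "@" + sourceFileName;
}
```

With this, a class A in an anonymous namespace in a.cpp and another in b.cpp
would get distinct identifiers, while an ordinary class A would still be "1A"
in every translation unit so that the linker can merge its entries.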

> > Does that seem more reasonable to people?
> >
> 
> Seems reasonable as a first step. It doesn't seem like anyone is opposed to
> trying it out, and adding to IR if the metadata approach shows this to be a
> desirable feature.
> 
> The one thing we need to ensure is that your metadata can be dropped by the
> optimizer and the code remains correct. I'm guessing no vtable would mean
> anything goes (no check)? That's bad security-wise, but it'll at least
> work. We may want to make sure nothing gets dropped through a debug flag,
> so that we can compile Chrome and be confident that all the checks we want
> are there.

So the flag would determine whether no bitset present means return 1 or
return 0? Sounds reasonable.
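
To make the intended semantics concrete, here is a rough model of the bitset
entries and the test, with the missing-bitset behaviour as an explicit policy.
This is only a sketch under my own assumptions: BitSetTable, BitSetEntry and
MissingBitSetPolicy are hypothetical names, plain integers stand in for vtable
addresses, and the actual lowering would be done by an LLVM pass rather than a
runtime table lookup.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One entry of the named metadata list: (bitset identifier, member address).
using BitSetEntry = std::pair<std::string, std::uintptr_t>;

// Hypothetical stand-in for the flag: what a test returns when no bitset
// with the requested identifier exists (e.g. the metadata was dropped).
enum class MissingBitSetPolicy { ReturnFalse, ReturnTrue };

class BitSetTable {
public:
  // Groups the flat entry list by identifier, much as merging the named
  // metadata from several modules would accumulate entries.
  explicit BitSetTable(const std::vector<BitSetEntry> &entries) {
    for (const auto &entry : entries)
      sets_[entry.first].insert(entry.second);
  }

  // Models the proposed bitset test: true if ptr is a member of bitset id.
  bool test(std::uintptr_t ptr, const std::string &id,
            MissingBitSetPolicy policy) const {
    assert(!id.empty() && "bitset identifier must not be empty");
    auto it = sets_.find(id);
    if (it == sets_.end())
      return policy == MissingBitSetPolicy::ReturnTrue;
    return it->second.count(ptr) != 0;
  }

private:
  std::map<std::string, std::set<std::uintptr_t>> sets_;
};
```

For the A/B/C hierarchy above, the table would hold three addresses under "1A"
and one each under "1B" and "1C". A test against an unknown identifier then
returns whatever the policy says, which is the difference between "anything
goes" and "fail closed".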

Thanks,
-- 
Peter


