[llvm-dev] RFC: Resolving TBAA issues

Fri Aug 18 07:48:13 PDT 2017

Hello Hal,

 > What exactly do you do, or plan to do, for bit fields? A single
 > load/store can access more than one field. This is true for
 > other kinds of things too, but, in particularly common for bit
 > fields. (see also my comment below about !tbaa.struct).

Given how Clang currently generates code for bit field accesses, it 
seems the right thing to do is to annotate corresponding loads and 
stores as ones accessing whole storage units these bit fields map to. So 
this is the plan. With the proposed approach to TBAA we can do better, 
but the rest of LLVM is obviously not ready to this yet.

 > Do you have a Clang patch as well?

Yes, we have both Clang and LLVM patches. Not everything of what can be 
supported is already there, but at least we know we pass all 'check-all' 
tests with the new implementation in place of the scalar and struct-path 
ones and we see how current issues can be resolved with this new approach.

 > As in the current scheme, this group is a peer to <may_alias>
 > above?

Yes, <vtable_pointer> and <may_alias> are peers. Just like in the 
current implementation. This reminds me another aspect of the 
discussion: the C/C++ TBAA rules are not completely settled yet and we 
may want to change some bits as time flows. Or make them configurable. 
The proposed approach is designed with this possibility in mind and 
provides greater flexibility in this regard.

 > Of all of these groups above, only <union> has special
 > semantics, correct?

Yes, currently. But we consider type groups as one of the natural ways 
to extend the implementation to support other input languages and their 
dialects.

 >> For two given access sequences we can determine if the
 >> accessed objects are allowed to overlap by the rules of the
 >> input language. Here's the list of rules complete enough to
 >> support C and C++. Ellipsis elements denote sequences of zero
 >> or more elements.
 >
 > Which is presumed to match in both accesses, correct?

Yes, same-named elements shall match. Thanks for the catch!

 >> If neither of the given sequences contains the leading access
 >> type of the other, then they are allowed to overlap if the
 >> leading access type of one sequence is a direct or indirect
 >> field of the final access type of the other sequence and then
 >> the most generic access sequence consists of a single element,
 >> which is that final access type.

 > I'm not following the above paragraph. Can you please explain
 > from where this comes?

Consider this:

   struct A { int i; } *a;
   struct B { struct A a; };

   struct M { struct B b; };
   struct N { struct M m; } *n;

Now, access sequences for accesses 'n->m' and 'a->i' do not have common 
elements, but are still allowed to overlap as 'a' may be pointing to 
'n->m.b.a'.

 >> For a field node the first element refers to its type. The
 >> purpose of other elements is to make the field node unique.
 >> Their meaning is unspecified. Currently the other members for
 >> C and C++ are the field name, bit offset and bit size of the
 >> member, but

 > I suspect that we'll want to define a meaning to the
 > size/offset fields. One thing to think about is that, today, we
 > also have !tbaa.struct metadata. This is put on memcpy, etc. to
 > indicate what fields are defined (the rest is padding which the
 > memcpy does not need to actually copy). This is somewhat akin
 > to the bitfield problem (i.e. a single load can access multiple
 > fields in the bitfield). If we can represent that, maybe we can
 > use the same thing to replace !tbaa.struct as well. We could
 > specify size/offset in bits to handle bitfields.

Yes, I was thinking about representing parts of objects when working on 
SROA where we have to annotate accesses to slices as if they are 
accesses to whole objects. This is still a task in progress.

And by the way, another thing to think about is that in some cases we 
may need an instruction to be associated with more than one access 
sequence. For example, memcpy() calls that initialize and copy 
aggregates read and write something at the same time. So we probably 
should consider this as a potential solution to the slices problem.

As to !tbaa.struct, the only place where we analyze this tag is during 
simplification of the memcpy/memmove/memset() calls for aggregates. 
Since in the mainline we can only represent accesses to scalars, this 
code generates TBAA info for replacing loads and stores only if the 
aggregate comprises a single scalar field, if I read it correctly. 
Furthermore, it seems there are no tests that would fail if this code is 
thrown away completely. This effectively looks like there's no any 
!tbaa.struct at the moment.

Anyway, in our local patch we replaced it with plain !tbaa.

 > Sounds good. Can you be more specific about your experiments.
 > What kinds of code did you try?

We are playing with sources from llvm-project/test-suite, like 
spirit.cpp, for example. Unfortunately, Clang does not generate TBAA 
info in so many cases that even from this 600Kb source code with lots of 
memory accesses we only get about 150 TBAA metadata nodes and this makes 
measuring more or less meaningless. I would expect we to pay more 
attention to how much memory TBAA information consumes as we support 
more features, but in the same time I wouldn't expect a huge overhead as 
we only encode what is positively necessary.

Thanks,

--