[llvm-dev] RFC: Resolving TBAA issues

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Sun Aug 20 09:17:28 PDT 2017


On 08/18/2017 09:48 AM, Ivan A. Kosarev via llvm-dev wrote:
> Hello Hal,
>
> > What exactly do you do, or plan to do, for bit fields? A single
> > load/store can access more than one field. This is true for
> > other kinds of things too, but, in particularly common for bit
> > fields. (see also my comment below about !tbaa.struct).
>
> Given how Clang currently generates code for bit field accesses, it 
> seems the right thing to do is to annotate corresponding loads and 
> stores as ones accessing whole storage units these bit fields map to. 
> So this is the plan.

In your proposed scheme, exactly how to do you this? How do you 
represent a single load accessing multiple fields?

> With the proposed approach to TBAA we can do better, but the rest of 
> LLVM is obviously not ready to this yet.

What does this mean. How could it do better?

>
> > Do you have a Clang patch as well?
>
> Yes, we have both Clang and LLVM patches.

Can you provide the Clang patch as well?

> Not everything of what can be supported is already there, but at least 
> we know we pass all 'check-all' tests with the new implementation in 
> place of the scalar and struct-path ones and we see how current issues 
> can be resolved with this new approach.
>
> > As in the current scheme, this group is a peer to <may_alias>
> > above?
>
> Yes, <vtable_pointer> and <may_alias> are peers. Just like in the 
> current implementation. This reminds me another aspect of the 
> discussion: the C/C++ TBAA rules are not completely settled yet and we 
> may want to change some bits as time flows. Or make them configurable. 
> The proposed approach is designed with this possibility in mind and 
> provides greater flexibility in this regard.

I understand. I'm not convinced that we want flexibility, however. I 
believe that we want a frontend-language-independent set of rules, and a 
straightforward representation necessary to implement those rules, just 
one that is more expressive than what we have today. As TBAA has been, 
and as the IR has been, TBAA should continue to be inspired by the needs 
of language frontends, however, I have no interest in directly importing 
the semantics of other languages into LLVM itself (if not absolutely 
necessary). This is how we develop the LLVM IR as well: it is an 
independently-defined system, and where we need additional capabilities 
to meet the needs of clients, we extend the IR while keeping it an 
independently-defined system.

>
> > Of all of these groups above, only <union> has special
> > semantics, correct?
>
> Yes, currently. But we consider type groups as one of the natural ways 
> to extend the implementation to support other input languages and 
> their dialects.
>
> >> For two given access sequences we can determine if the
> >> accessed objects are allowed to overlap by the rules of the
> >> input language. Here's the list of rules complete enough to
> >> support C and C++. Ellipsis elements denote sequences of zero
> >> or more elements.
> >
> > Which is presumed to match in both accesses, correct?
>
> Yes, same-named elements shall match. Thanks for the catch!
>
> >> If neither of the given sequences contains the leading access
> >> type of the other, then they are allowed to overlap if the
> >> leading access type of one sequence is a direct or indirect
> >> field of the final access type of the other sequence and then
> >> the most generic access sequence consists of a single element,
> >> which is that final access type.
>
> > I'm not following the above paragraph. Can you please explain
> > from where this comes?
>
> Consider this:
>
>   struct A { int i; } *a;
>   struct B { struct A a; };
>
>   struct M { struct B b; };
>   struct N { struct M m; } *n;
>
> Now, access sequences for accesses 'n->m' and 'a->i' do not have 
> common elements, but are still allowed to overlap as 'a' may be 
> pointing to 'n->m.b.a'.

Okay, now I understand what you mean.

>
> >> For a field node the first element refers to its type. The
> >> purpose of other elements is to make the field node unique.
> >> Their meaning is unspecified. Currently the other members for
> >> C and C++ are the field name, bit offset and bit size of the
> >> member, but
>
> > I suspect that we'll want to define a meaning to the
> > size/offset fields. One thing to think about is that, today, we
> > also have !tbaa.struct metadata. This is put on memcpy, etc. to
> > indicate what fields are defined (the rest is padding which the
> > memcpy does not need to actually copy). This is somewhat akin
> > to the bitfield problem (i.e. a single load can access multiple
> > fields in the bitfield). If we can represent that, maybe we can
> > use the same thing to replace !tbaa.struct as well. We could
> > specify size/offset in bits to handle bitfields.
>
> Yes, I was thinking about representing parts of objects when working 
> on SROA where we have to annotate accesses to slices as if they are 
> accesses to whole objects. This is still a task in progress.

Sounds good. In the backend (i.e., at the SDAG/MI levels) we 
specifically track offsets into scalars for a similar reason. Tracking 
offsets here may make sense too.

>
> And by the way, another thing to think about is that in some cases we 
> may need an instruction to be associated with more than one access 
> sequence. For example, memcpy() calls that initialize and copy 
> aggregates read and write something at the same time. So we probably 
> should consider this as a potential solution to the slices problem.

I agree.

In the memcpy case, or more generally for argmemonly functions, I'd like 
to be able to mark specific arguments with metadata specifically. This 
will require an infrastructure enhancement to support metadata on 
function arguments, or some scheme for otherwise encoding metadata 
arguments.

>
> As to !tbaa.struct, the only place where we analyze this tag is during 
> simplification of the memcpy/memmove/memset() calls for aggregates. 

I believe that's correct.

> Since in the mainline we can only represent accesses to scalars, this 
> code generates TBAA info for replacing loads and stores only if the 
> aggregate comprises a single scalar field, if I read it correctly.

That appears to be correct. That should certainly be improved.

> Furthermore, it seems there are no tests that would fail if this code 
> is thrown away completely. This effectively looks like there's no any 
> !tbaa.struct at the moment.

By that standard, we'd throw away all parts of the compiler without 100% 
regression-test coverage. In such cases, we should add regression tests. 
I agree, however, that it is not currently used for much. The intent was 
that, when expanding memcpy (and friends), we could avoid copying 
padding bytes in cases where we expanded the intrinsics into individual 
loads/stores. We should still consider that use case in this context.

>
> Anyway, in our local patch we replaced it with plain !tbaa.
>
> > Sounds good. Can you be more specific about your experiments.
> > What kinds of code did you try?
>
> We are playing with sources from llvm-project/test-suite, like 
> spirit.cpp, for example. Unfortunately, Clang does not generate TBAA 
> info in so many cases that even from this 600Kb source code with lots 
> of memory accesses we only get about 150 TBAA metadata nodes and this 
> makes measuring more or less meaningless. I would expect we to pay 
> more attention to how much memory TBAA information consumes as we 
> support more features, but in the same time I wouldn't expect a huge 
> overhead as we only encode what is positively necessary.

Okay.

Thanks again,
Hal

>
> Thanks,
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-dev mailing list