[llvm-dev] RFC: Representing unions in TBAA

Mon Aug 14 10:25:53 PDT 2017

On 08/14/2017 11:58 AM, Ivan A. Kosarev via llvm-dev wrote:
> Hello Steven, Hal and Daniel,
>
> Thanks a lot for your discussion; it really helps with summarizing 
> current TBAA issues and ways to resolve them.
>
> Do you guys know anything about the current status of the proposed 
> change? Steven, will you please let us know if the work is in progress 
> and if there is any ETA you can share?

I've been planning to get to it at some point, but I don't have an ETA 
for you.

>
> I'm asking because we are working on an alternative approach that not 
> only supports accesses to union members, bit fields, fields of 
> aggregate and union types, but also allows us to represent accesses to 
> aggregates and unions the same way we do it for scalars so that 
> !tbaa.struct is replaced with plain !tbaa, meaning TBAA information 
> can be propagated uniformly regardless of types of accessed objects. 
> As a consequence, it supports identification of user types defined in 
> different translation units, even if some of them are written in C and 
> others are in C++. It also defines a set of language-neutral formal 
> rules that LLVM codegen follows to determine whether a given pair of 
> accesses are allowed to overlap by rules of the input language. As of 
> today, we know this implementation covers all currently supported TBAA 
> functionality reflected in the test suites, and to test the new 
> functionality we have improved SROA to preserve TBAA information.
>
> The point is, our approach does not try to describe accesses as (type, 
> offset) pairs and instead represents access sequences explicitly 
> beginning from the base type followed by field descriptors, which is 
> what makes the approach so flexible. TypeBasedAAResult::Aliases() and 
> MDNode::getMostGenericTBAA() are a bit more complex than they used to 
> be (they actually use the same internal function), but rely 
> exclusively on linear scans of access sequences unless we have to 
> check whether one of the accessed types is the type of a member of 
> the other one, in which case it seems we just have to traverse the 
> fields recursively no matter what.
>
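
To make sure I am reading the "access sequence" idea correctly, here is a
minimal sketch of how I understand it. This is not Ivan's implementation
and not the existing TBAA code: AccessPath, FieldStep and mayAlias are
hypothetical names, and the recursive "is one type a member of the other"
check is left out and simply answered conservatively.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: one step selects a member of the enclosing type.
struct FieldStep {
  unsigned Index;  // which member of the enclosing type is selected
  bool InUnion;    // true if the enclosing type is a union
};

// Hypothetical model: an access is a base type followed by field steps.
struct AccessPath {
  std::string BaseType;
  std::vector<FieldStep> Steps;
};

// Returns true if the two accesses may overlap under this toy model.
bool mayAlias(const AccessPath &A, const AccessPath &B) {
  // Different base types would need the recursive member-type check;
  // this sketch omits it and answers conservatively.
  if (A.BaseType != B.BaseType)
    return true;

  // Linear scan over the common prefix of the two sequences.
  std::size_t N = std::min(A.Steps.size(), B.Steps.size());
  for (std::size_t I = 0; I < N; ++I) {
    if (A.Steps[I].Index == B.Steps[I].Index)
      continue;
    // The paths diverge here. Distinct members of a struct do not
    // overlap; distinct members of a union share storage and may.
    return A.Steps[I].InUnion;
  }

  // One sequence is a prefix of the other: an access to an aggregate
  // overlaps an access to any of its members.
  return true;
}
```

Under these assumptions, accesses to two different members of a struct S
are reported as not aliasing, accesses to two different members of a union
U are reported as possibly aliasing, and an access to a whole S may alias
an access to any of its members.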
> So, I wonder whether this or a similar approach has been considered 
> before and, if so, what drawbacks were raised. Do you think it is 
> worth considering now?

If you can describe the approach in detail, we'll certainly consider it.

  -Hal

>
> Thanks again,
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory