[LLVMdev] Request for comments: TBAA on call

Fri Oct 11 00:16:12 PDT 2013

On Thu, Oct 10, 2013 at 9:34 PM, Chris Lattner <clattner at apple.com> wrote:

> On Oct 10, 2013, at 8:53 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>> The data structures and algorithms we have are powerful enough to express
>> these sorts of things, and so long as a frontend abided by the rules, there
>> shouldn't be a problem.
>>
>
>
> My concerns are simply that whether designed this way or not, it ends up
> fairly inconsistent.
>
> For example, what would the plan be when a frontend does something like
> clang does now for C/C++ (generate type based TBAA), and also wants to do
> something like Filip is suggesting (which is also doable on C/C++ with
> simple frontend analysis)?
>
>
> There are two possible answers here, depending on the constraints:
>
> 1. The frontend author could unify them into one grand tree, like struct
> field TBAA does.
>

I would be impressed if you could unify the output of something like
andersens, and something like the current nested TBAA structure, without
massive loss of precision.

> 2. The frontend author could model them as two separate TBAA trees, which
> the TBAA machinery in LLVM handles conservatively.
>
> The conservative handling of different TBAA trees is critical, because you
> can LTO (for example) Javascript into C++ code in principle, each using
> their own TBAA structure.  LLVM is already well set for this, and it isn't
> an accident :-)
>
>
> You also run into issues with the existing metadata on loads/stores in
> this situation. It's, again, entirely possible for a load to conflict with
> both a tbaa type and a partitioned heap.  In Filip's scheme, there is no
> way to represent this.
>
>
> I'm not sure what you mean.
>

I mean it's not possible, in this scheme, to specify both to be used for
disambiguation.

Right now, you get *one* tbaa tag on the load.
So you have to choose which tree you use if you choose option
#2 above.

You always have to choose one or the other, or else you lose the
disambiguation.

>  The compiler will handle this conservatively.
>

Only if you drop the tags/disambiguation capability from most things, or
I'm seriously confused.

Let's take the following, which is not quite LLVM IR, but close enough.

!tbaa tree

0 -> everything

1 (parent 0) -> int
2 (parent 0) -> heap a

load <whatever> , !tbaa 1

call !tbaa.read 2

Note: You can't specify tbaa.read 2, 1 (at least as proposed).
The current machinery will say the call and the load never conflict.
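
To make the example concrete, here is a minimal sketch in Python of the
ancestor rule as I understand it. It is a toy model, not LLVM's actual
implementation: the PARENT table and the names ancestors and may_alias are
made up for illustration, and the node numbers simply mirror the tree above.

```python
# Hypothetical model of the example tree; node ids match the text above.
# 0 = everything, 1 = int, 2 = heap a
PARENT = {0: None, 1: 0, 2: 0}


def ancestors(node):
    """Return the node followed by all of its ancestors, up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def may_alias(a, b):
    """Within one tree, two tags may alias only if one is the other's
    ancestor (or they are the same tag)."""
    return a in ancestors(b) or b in ancestors(a)


# load <whatever>, !tbaa 1   vs.   call !tbaa.read 2
print(may_alias(1, 2))  # False: siblings under "everything"
print(may_alias(0, 2))  # True: "everything" is an ancestor of "heap a"
```

Under this rule the load and the call are reported as never conflicting,
even if the int being loaded actually lives in heap a.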

It would seem you have to either make heap tags also children of
appropriate type tags, or only ever use one type of tree.
What design for a split tree would work here?

I also can't think of a combined tree that would work without massive
precision loss.

>  If you have two different schemas existing in the same application (e.g.
> due to LTO or due to a language implementing two different non-unified
> models for some weird reason) then the compiler just doesn't draw any
> aliasing implications from references using two different schemas.
>

Which is, of course, bad.
You should, in fact, be able to draw implications from both.  If the design
essentially only allows one to draw from one, that seems like a pretty big
flaw.

> It is possible in principle to allow a load (for example) to have an
> arbitrary number of TBAA tags on it, which would solve this (allowing a
> single load to participate in multiple non-overlapping schemas) but I don't
> think it is worth the complexity at all.
>

Okay then, we'll agree to disagree. :)

This actually happens a lot of the time.  Restricting llvm to essentially
choosing one tree per load to use for disambiguation, as this scheme will
do, when it's not much more work to do better, seems shortsighted to me.
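
As a rough illustration of what "doing better" could look like, here is a
small Python sketch in which each access carries a set of tags, at most one
per tree. Everything in it is hypothetical: the tag names, the PARENT table,
and the functions chain, tags_may_alias and accesses_may_alias are invented
for this example and are not part of LLVM or of the proposal. The assumed
rule is that tags from different trees say nothing about each other, and two
accesses are disjoint as soon as any pair of tags from the same tree proves
it.

```python
# Two independent, hypothetical trees: one rooted at "types", one at "heaps".
PARENT = {
    "types": None, "int": "types", "float": "types",
    "heaps": None, "heap a": "heaps", "heap b": "heaps",
}


def chain(node):
    """Return the node followed by its ancestors; the last entry is the root."""
    out = []
    while node is not None:
        out.append(node)
        node = PARENT[node]
    return out


def tags_may_alias(a, b):
    """Compare two single tags."""
    ca, cb = chain(a), chain(b)
    if ca[-1] != cb[-1]:
        return True  # different trees: no information, stay conservative
    return a in cb or b in ca  # same tree: ancestor rule


def accesses_may_alias(tags_a, tags_b):
    """Accesses may alias unless some pair of tags proves otherwise."""
    return all(tags_may_alias(a, b) for a in tags_a for b in tags_b)


load = {"int", "heap a"}
print(accesses_may_alias(load, {"heap b"}))   # False: heap a vs. heap b
print(accesses_may_alias(load, {"heap a"}))   # True: same heap
print(accesses_may_alias({"int"}, {"heap a"}))  # True: only cross-tree tags
```

With one tag per load, the first query would have to fall back to the
conservative answer whenever the load had been tagged with its type rather
than its heap.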

> -Chris
>
>