[LLVMdev] PROPOSAL: struct-access-path aware TBAA
Daniel Berlin
dberlin at dberlin.org
Tue Mar 12 17:40:35 PDT 2013
On Tue, Mar 12, 2013 at 3:44 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
> On 3/11/13 1:17 PM, Daniel Berlin wrote:
>>
>> On Mon, Mar 11, 2013 at 11:41 AM, Manman Ren <mren at apple.com> wrote:
>>>
>>> Based on discussions with John McCall
>>>
>>> We currently focus on field accesses of structs, more specifically, on
>>> fields that are scalars or structs.
>>>
>>> Fundamental rules from C11
>>> --------------------------
>>> An object shall have its stored value accessed only by an lvalue
>>> expression that has one of the following types: [footnote: The intent of
>>> this list is to specify those circumstances in which an object may or may
>>> not be aliased.]
>>> 1. a type compatible with the effective type of the object,
>>> 2. a qualified version of a type compatible with the effective type of
>>> the object,
>>> 3. a type that is the signed or unsigned type corresponding to the
>>> effective type of the object,
>>> 4. a type that is the signed or unsigned type corresponding to a
>>> qualified version of the effective type of the object,
>>> 5. an aggregate or union type that includes one of the aforementioned
>>> types among its members (including, recursively, a member of a subaggregate
>>> or contained union), or
>>> 6. a character type.
>>>
>>> Example
>>> -------
>>> struct A {
>>> int x;
>>> int y;
>>> };
>>> struct B {
>>> A a;
>>> int z;
>>> };
>>> struct C {
>>> B b1;
>>> B b2;
>>> int *p;
>>> };
>>>
>>> Type DAG:
>>> int <- A::x <- A
>>> int <- A::y <- A <- B::a <- B <- C::b1 <- C
>>> int <----------------- B::z <- B <- C::b2 <- C
>>> any pointer <--------------------- C::p <- C
>>>
>>> The type DAG has two types of TBAA nodes:
>>> 1> the existing scalar nodes
>>> 2> the struct nodes (this is different from the current tbaa.struct)
>>> A struct node has a unique name plus a list of pairs (field name,
>>> field type).
>>> For example, struct node for "C" should look like
>>> !4 = metadata !{"C", "C::b1", metadata !3, "C::b2", metadata !3,
>>> "C::p", metadata !2}
>>> where !3 is the struct node for "B", !2 is pointer type.
>>>
>>> Given a field access
>>> struct B *bp = ...;
>>> bp->a.x = 5;
>>> we annotate it as B::a.x.
>>
>> In the case of multiple structures containing substructures, how are
>> you differentiating?
>>
>> IE given
>>
>> struct A {
>> struct B b;
>> }
>> struct C {
>> struct B b;
>> }
>>
>> How do you know the above struct B *bp =...; is B::b from C and not B::b
>> from A?
>
> If I understand correct, the proposed graph is DAG, not tree, and it should
> be able to tackling the case
> where a type is included more than once.
This was not clear from the proposal, whether the type hierarchy is
duplicated all the way down as necessary, or attempted-to-be-shared by
having types with multiple parents.
>
> On the other hand, the info which is annotated to the memory access is kind
> of "immediate enclosing aggregate type",
> which should be unique.
>
>
>>
>> (I agree you can know in the case of direct aggregates, but I argue
>> you have no way to know in the case of pointer arguments without
>> interprocedural analysis)
>> It gets worse the more levels you have.
>>
>> ie if you add
>> struct B {
>> struct E e;
>> }
>>
>> and have struct E *e = ...
>> how do you know it's the B::e contained in struct C, or the B::e
>> contained in struct A?
>>
>>
>> Again, i agree you can do both scalar and direct aggregates, but not
>> aggregates and scalars through pointers.
>>
>>
> That is exactly the power of TBAA.
I understand quite a bit about TBAA :)
My point was that his check will produce the wrong TBAA answer because
it will have an incomplete path in his representation. He will
declare no-alias, yet it it will be may-alias under the actual TBAA
rules.
> If the both memory accesses are direct
> load/store,
> it dose not even need TBAA, check their base/offset/size is sufficient for
> disambiguation.
False, in a lot of cases. For starters, roughly any union including
aggregate, you will have overlaps that can be disambiguated using TBAA
but not otherwise.
More information about the llvm-dev
mailing list