[llvm-dev] An ambiguity in TBAA info format
Ivan Kosarev via llvm-dev
llvm-dev at lists.llvm.org
Thu Nov 2 09:12:12 PDT 2017
On 02/11/17 05:41, Hal Finkel wrote:
>
> On 10/31/2017 03:37 AM, Ivan Kosarev wrote:
>
>>
>>>
>>>>
>>>> The tag !1 describes an access to an object of type "A" and !3
>>>> describes an access to object of type "B".
>>>>
>>>> Both the type descriptors, !5 and !7, refer to node !9 as their
>>>> type group. A definition of that node could look like this:
>>>>
>>>> !9 = !{"omnipotent char", ...}
>>>>
>>>> We know that these two accesses should be considered no-alias as
>>>> neither of them encloses the other; the least common type group for
>>>> them is !9. TypeBasedAAResult::Aliases() and
>>>> MDNode::getMostGenericTBAA() respond accordingly and all is good.
>>>>
>>>> Then, let's change the definition for the node !9:
>>>>
>>>> !9 = !{"int", ...}
>>>>
>>>> Now it doesn't look like a type group, but rather a structure
>>>> member. And nodes !5 and !7 now look as descriptors for structure
>>>> types, with their offset fields added during auto-upgrade:
>>>>
>>>> !5 = !{!"A", !9, i64 0}
>>>> !7 = !{!"B", !9, i64 0}
>>>>
>>>> We know that, being interpreted as structure accesses, they still
>>>> should be considered no-alias. However, the least common type group
>>>> for these types is likely to be the "omnipotent char" node, but
>>>> certainly not the type of the field, which is "int".
>>>
>>> I'm not sure what "likely" means in this context.
>>> MDNode::getMostGenericTBAA does not currently seem to have a special
>>> case for the first member of structure types. Instead, it does not
>>> deal with them at all. MDNode::getMostGenericTBAA looks as the
>>> access type, which should be a scalar, and collects the paths from
>>> those types up to the root. Then it returns a scalar access tag for
>>> the type which is common, and most distant from the root, along
>>> those paths.
>>>
>>> Are you trying to extend this to do something else with the
>>> struct/member information on the accesses?
>>
>> Yes, sorry I didn't mention that MDNode::getMostGenericTBAA()
>> currently only considers final access types.
>>
>> This is supposed to get us closer to the support for aggregate
>> accesses, but at this moment I'm trying to fix
>> MDNode::getMostGenericTBAA() to handle struct-path accesses. We do
>> this in TypeBasedAAResult::Aliases() and I see no reasons why we
>> shouldn't do the same on merging of access tags.
>
> Okay. Please describe the algorithm you'd like to implement.
>
> If you have:
>
> !0 = {!"root"}
> !1 = {!"char", !0, i64 0 }
> !2 = {!"Inner", !1, i64 0 } // is this a struct with one field or a type?
> !3 = {!"Outer", !2 ,i64 0 }
>
> // and two access tags:
> !8 = {!2, !1, i64 0}
> !9 = {!3, !1, i64 0}
>
> Currently, if we ask for the most-generic TBAA, we look at the access
> types only, and so in this case trivially return:
> !10 - {!1, !1, i64 0}
>
> But you want to return !8, because by walking up from !9 through the
> base type !3, adjusting for the offset (which is trivial in this
> case), we arrive at !2 @ offset 0. That's the same place in the graph
> as !8 (which is !2 at offset 0), and so that's the common access tag.
>
> This seems to work without ambiguity, so I'm not seeing the problem.
Yes, that is what I wanted. I just uploaded a patch that (hopefully)
does exactly what you described:
https://reviews.llvm.org/D39557
I agree we can do that as long as all final access types are scalar, but
it wouldn't work if we want to support aggregate accesses. As you
suggested, I will address this issue in the parallel thread.
Thanks,
>
> -Hal
>
>> Another goal is to use the same function for matching couples of tags
>> and finding most generic tags for them. Here's what it would look like:
>>
>> https://reviews.llvm.org/differential/diff/120871/
>>
>> But for this we need to able to distinct members from parent types.
>>
>> Thanks,
>>
>>
>>>
>>> -Hal
>>>
>>>>
>>>> The problem is that since the formats for the member-of-structure
>>>> and member-of-type-group relationships match,
>>>> MDNode::getMostGenericTBAA() cannot disambiguate between them and
>>>> always treat first members of structure types as type groups.
>>>>
>>>> To resolve this issue I'm thinking of changing the format of type
>>>> nodes so that all of them, except root ones, refer to their type
>>>> groups with their first operand. The scalar types "A" and "B"
>>>> mentioned above would then be rewritten as:
>>>>
>>>> !5 = !{!9, !"A"}
>>>> !7 = !{!9, !"B"}
>>>>
>>>> !9 = !{..., "omnipotent char"}
>>>>
>>>> and their structure versions would read:
>>>>
>>>> !5 = !{!9, !"A", !11, i64 0}
>>>> !7 = !{!9, !"B", !11, i64 0}
>>>>
>>>> !11 = !{!9, "int"}
>>>>
>>>> The new format can be easily recognized by considering the type of
>>>> the first operand: a string would mean the old format and a
>>>> metadata node would suggest the new convention.
>>>>
>>>> The question to the community is, are there any reasons that
>>>> wouldn't work or not desirable? Or, are there better alternatives
>>>> to the proposed solution?
>>>>
>>>> As usual, any comments are highly appreciated.
>>>>
>>>> Thanks,
>>>>
>>>
>>
>
--
More information about the llvm-dev
mailing list