[LLVMdev] PROPOSAL: struct-access-path aware TBAA

Wed Mar 13 11:37:08 PDT 2013

On Mar 13, 2013, at 1:07 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

>> 
>> The program I gave was well typed :)
> 
> Hi, Daniel:
>   Thank you for sharing your insight.  I didn't realize it is well-typed -- I'm basically a big nut of any std.
> I'd admit std/spec is one of the most boring materials on this planet:-).
> 
>   So, if I understand correctly, your point is:
>       if a std allows a type cast (even one which is in bad taste:-), TBAA has to respect the std.
> 
>  If that is strictly true, TBAA has to rely on points-to analysis. However, that would virtually disable
> TBAA, as most points-to sets have an "unknown" element.
> 
>   Going back to my previous mail,
>> In the below example, GCC assumes p and q point to anything because
>> they are incoming arguments.
>> 
>>> 
>>> ------------------------------
>>> typedef struct {
>>>     int x;
>>> }T1;
>>> 
>>> typedef struct {
>>>     int y;
>>> }T2;
>>> 
>>> int foo(T1 *p, T2 *q) {
>>>     p->x = 1;
>>>     q->y = 4;
>>>     return p->x;
>>> }
>>> --------------------------
> Yes, gcc should assume p and q point to anything; however, the result contradicts the assumption --
> it promotes the p->x expression.

Assuming the above is C11 code, I think the relevant section of the C spec is the following:

The paragraph quoted below is from a C11 draft (N1570, Committee Draft of April 12, 2011). Assuming my interpretation of it is correct, it seems to imply that a store through an lvalue can change the effective type of the object for subsequent accesses. This would preclude any purely type-based TBAA solution and would, in general, require taking access/points-to information into account.

---
6.5 Expressions

6: "The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access."
---

This paragraph comes just before 6.5 Expressions, paragraph 7, which is quoted in the current TBAA proposal. The key sentence is:

 "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that <<do not modify>> the stored value."

I read this as: a store sets the "effective type" for any subsequent read access to the same object. So, in the above example, assuming that p and q point to the same object, the effective type changes between the first and the second line of foo. This means that IF p and q pointed to the same object, the read of "p->x" using the old effective type would be undefined. Hence, we may assume that p and q don't point to the same object.
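
To make this reading concrete, here is a minimal C sketch. It only exercises cases that are well defined under the quoted paragraph: foo is the function from the quoted example, called with distinct objects, and reuse_storage is a hypothetical helper, introduced here for illustration, that reuses one malloc'ed block (an object with no declared type) and reads each value back only through the type that last stored it.

```c
#include <stdlib.h>

typedef struct { int x; } T1;
typedef struct { int y; } T2;

/* Same function as in the quoted example. */
int foo(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return p->x;
}

/* Hypothetical helper, for illustration only.
 * Returns first * 10 + second, or -1 if the allocation fails. */
int reuse_storage(void) {
    /* malloc'ed memory has no declared type. */
    void *mem = malloc(sizeof(T1) > sizeof(T2) ? sizeof(T1) : sizeof(T2));
    if (mem == NULL)
        return -1;

    T1 *p = mem;
    p->x = 1;            /* store through T1: sets the effective type */
    int first = p->x;    /* read through the type that stored the value */

    T2 *q = mem;
    q->y = 4;            /* a new store changes the effective type */
    int second = q->y;   /* read through the new type only */

    free(mem);
    return first * 10 + second;
}
```

Calling foo with two distinct objects (for example `T1 a; T2 b; foo(&a, &b);`) returns 1 under any reading. The disputed case is only the one where p and q refer to the same storage, which is where the final read of p->x would go through the old effective type.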

I don't know whether that reasoning underlies the decision that GCC makes, but it would be a justification (assuming my reasoning above is correct).

With respect to the current TBAA proposal, this means we have to be aware that if we decide on a purely type/access-path based solution, we might break a lot more code than we do now.

Best,
Arnold