[llvm-dev] RFC: Representing unions in TBAA

Steven Perron via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 28 16:44:06 PST 2017

Seems like the comments have stopped.  I'll try to get a patch together. 
Then we can continue the discussion from there. 

Hubert, as for the issue with the llvm optimizations losing the TBAA 
information, it should be the responsibility of those optimizations to make 
sure the aliasing information is updated in the correct way.  One function 
related to this has already been mentioned:  getMostGenericTBAA.
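
For readers unfamiliar with the idea, here is a minimal sketch of what a
"most generic" merge could look like when a transformation combines two
accesses.  It is a toy model, not LLVM's getMostGenericTBAA: ToyTBAANode
and mostGenericToyTBAA are hypothetical names, and the model assumes
plain scalar type nodes that each have a single parent.

```cpp
#include <string>
#include <unordered_set>

// Toy stand-in for a scalar TBAA type node: a name plus a parent link.
// This is NOT llvm::MDNode; it only models the parent chain (assumption).
struct ToyTBAANode {
  std::string Name;
  const ToyTBAANode *Parent; // nullptr marks the root
};

// Return the nearest node that is an ancestor-or-self of both A and B,
// or nullptr if the two nodes do not share a root.
const ToyTBAANode *mostGenericToyTBAA(const ToyTBAANode *A,
                                      const ToyTBAANode *B) {
  // Record every node on the path from A up to its root.
  std::unordered_set<const ToyTBAANode *> SeenFromA;
  for (const ToyTBAANode *N = A; N; N = N->Parent)
    SeenFromA.insert(N);
  // The first node on B's path that is also on A's path is the answer.
  for (const ToyTBAANode *N = B; N; N = N->Parent)
    if (SeenFromA.count(N))
      return N;
  return nullptr;
}
```

In this model, merging an "int" access with a "float" access whose common
parent is "omnipotent char" yields the char node, so the merged access is
treated as aliasing everything that either original access could alias.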

Steven Perron

From:   Hubert Tong <hubert.reinterpretcast at gmail.com>
To:     Steven Perron/Toronto/IBM at IBMCA
Cc:     Daniel Berlin <dberlin at dberlin.org>, llvm-dev 
<llvm-dev at lists.llvm.org>, Sanjoy Das <sanjoy at playingwithpointers.com>
Date:   2017/02/15 07:44 AM
Subject:        Re: [llvm-dev] RFC: Representing unions in TBAA

On Tue, Feb 14, 2017 at 11:22 PM, Steven Perron <perrons at ca.ibm.com> wrote:
3) How should we handle a reference directly through a union, and a 
reference that is not through the union?  

My solution was to look at each member of the union that overlaps the given 
offset, and see if any of those members aliases the other reference.  If 
no member aliases the other reference, then the answer is no alias.  
Otherwise the answer is may alias.  The run time for this would be 
proportional to  "distance to the root" * "number of overlapping 
members".  This could be slow if there are unions with many members or 
many unions of unions.
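
The member-by-member check described above can be sketched as follows.
This is a toy model under stated assumptions, not the proposed patch:
ToyType, ToyMember and unionAccessMayAlias are hypothetical names, the
access is given an explicit size (the text only mentions an offset), and
nested unions are not expanded.

```cpp
#include <cstdint>
#include <vector>

struct ToyType;

// One member of a toy union: where it sits and what type it has.
struct ToyMember {
  uint64_t Offset;
  uint64_t Size;
  const ToyType *Type;
};

// Toy type node. Scalars use only Parent; unions also list Members.
struct ToyType {
  const ToyType *Parent;          // nullptr marks the root
  std::vector<ToyMember> Members; // empty for scalar types
};

// True if Ancestor is reached by walking parent links from T (T counts).
bool isAncestorOrSelf(const ToyType *Ancestor, const ToyType *T) {
  for (; T; T = T->Parent)
    if (T == Ancestor)
      return true;
  return false;
}

// Simplified scalar rule: may alias if one type is an ancestor of the other.
bool scalarMayAlias(const ToyType *A, const ToyType *B) {
  return isAncestorOrSelf(A, B) || isAncestorOrSelf(B, A);
}

// Access through Union at [Offset, Offset + Size) versus an access of type
// Other that is not through the union. Cost is roughly
// (number of overlapping members) * (distance to the root).
bool unionAccessMayAlias(const ToyType &Union, uint64_t Offset, uint64_t Size,
                         const ToyType *Other) {
  for (const ToyMember &M : Union.Members) {
    bool Overlaps = M.Offset < Offset + Size && Offset < M.Offset + M.Size;
    if (Overlaps && scalarMayAlias(M.Type, Other))
      return true; // some overlapping member may alias: "may alias"
  }
  return false; // no overlapping member aliases: "no alias"
}
```

For a union of an int and a float at offset 0, an access through the
union at offset 0 may alias an int or float access in this model, but
not a short access.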

Another option is to say that they do not alias.  This would mean that all 
references to unions must be explicitly through the union.
From what I gather from the thread so far, the access through the union 
may be lost because of LLVM transformations. I am not sure how, in the 
face of that, TBAA could indicate NoAlias safely (without the risk of 
functional-correctness issues in correct programs) between types which 
overlap within any union (within some portion of the program).

As for the standards, it is definitely not true that all references to 
unions must be explicitly through the union. However, if you are trying to 
perform union-based type punning (under C11), then it appears that it is 
intended that the read must be through the union.
This option would give the least restrictive aliasing, allowing the most 
optimization.  The implementation would be simple.  I believe we could make 
the parent of the TBAA node for the union be "omnipotent char".  This might 
be similar to treating the union TBAA node more like a scalar node instead 
of a struct-path.  Then the traversal of the TBAA nodes will be quick.  
I'll have to work this out a bit more, but, if this is good enough to meet 
the requirements of the standard, I can try to think this through a little 
more.  I'll need Hubert and Daniel to comment on that since I am no expert 
on the C and C++ standards.
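
To illustrate why the traversal stays quick under this option, here is a
minimal sketch that treats the union as a scalar node whose parent is
"omnipotent char".  It is a toy model: ToyScalarNode and
toyScalarMayAlias are hypothetical names, and the rule "may alias only
if one node is an ancestor of the other" is a simplification of scalar
TBAA.

```cpp
// Toy scalar type node: only a parent link matters here (assumption).
struct ToyScalarNode {
  const ToyScalarNode *Parent; // nullptr marks the root
};

// True if Target is found by walking parent links starting at From.
bool reachesByParents(const ToyScalarNode *From, const ToyScalarNode *Target) {
  for (; From; From = From->Parent)
    if (From == Target)
      return true;
  return false;
}

// Simplified scalar rule: may alias if either node is an ancestor of
// (or the same as) the other. Each query walks at most two parent chains.
bool toyScalarMayAlias(const ToyScalarNode *A, const ToyScalarNode *B) {
  return reachesByParents(A, B) || reachesByParents(B, A);
}
```

With the union node and "int" both hanging directly off the char node,
an access through the union aliases other accesses through the same
union and char accesses, but not a plain int access.  That is the "must
be explicitly through the union" behaviour described for this option.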

The third option is to be pessimistic and say "may alias" all of the time 
(conservatively correct), and rely on other alias analysis to improve it.  
This will have good compile time, but could hinder optimization.  
Personally I do not like this option.  Most of the time it will not have a 
negative effect, but there will be a reasonable number of programs where 
this will hurt optimization more than it needs to.
