<div dir="ltr">Ah.<div>IMHO, yes we should disable it until it's correct.<div><div><br></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 7, 2017 at 1:28 PM, Krzysztof Parzyszek <span dir="ltr"><<a href="mailto:kparzysz@codeaurora.org" target="_blank">kparzysz@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I'm asking if people are ok with it. :)<br>
We've had customer reports that can be tracked down to this issue, so this is something we'd really like to working (at least in terms of correctness).<br>
<br>
I can come up with a patch.<br>
<br>
-Krzysztof<span class=""><br>
<br>
<br>
On 4/7/2017 3:25 PM, Daniel Berlin wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
Not familiar with clang enough to know.<br>
Staring at it, it looks a bit annoying to do.<br>
You'd basically just want to stop decorating loads/stores with tbaa info<br>
if it's a union.<br>
<br>
<br>
On Fri, Apr 7, 2017 at 12:09 PM, Krzysztof Parzyszek via llvm-dev<br></span><div><div class="h5">
<<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a>>> wrote:<br>
<br>
Can we turn off TBAA info for union member accesses in clang before<br>
this gets fixed?<br>
<br>
-Krzysztof<br>
<br>
<br>
On 3/1/2017 5:30 PM, Daniel Berlin via llvm-dev wrote:<br>
<br>
So, <a href="https://bugs.llvm.org/show_bug.cgi?id=32056" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug<wbr>.cgi?id=32056</a><br>
<<a href="https://bugs.llvm.org/show_bug.cgi?id=32056" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bu<wbr>g.cgi?id=32056</a>> is an example showing<br>
our current TBAA tree for union generation is definitely<br>
irretrievably<br>
broken.<br>
I'll be honest here. I'm pretty sure your proposal doesn't go<br>
far enough.<br>
But truthfully, I would rather see us come closer to a<br>
representation<br>
we know works, which is GCC's.<br>
Let me try to simplify what you are suggesting, and what we have.<br>
Our current representation is basically inverted from GCC, but<br>
we don't<br>
allow things that would enable it to work.<br>
<br>
Given<br>
union {int a, short b};<br>
<br>
GCC's will be:<br>
<br>
union<br>
/ \<br>
short int<br>
<br>
<br>
Everything is implicitly a subset of alias set 0 (which for C/C++ is<br>
char) just to avoid representing it.<br>
<br>
Our metadata has no child links, and only a single parent link.<br>
<br>
You can't represent the above form because you can't make a<br>
single short<br>
node a child of every union/struct it needs to be (lack of multiple<br>
parents means you can't just frob them all to offset zero).<br>
<br>
Your suggestion is to invert this as a struct<br>
<br>
short int<br>
| /<br>
union<br>
<br>
We don't allow multiple parents, however.<br>
Because of that, you suggest we follow all nodes for unions, special<br>
casing union-type nodes somehow<br>
<br>
Let me suggest something different:<br>
<br>
We know the current structure fails us in a number of ways.<br>
Instead of trying to shoehorn this into our current structure, I<br>
suggest: we stop messing around and just have a GCC style tree, and<br>
represent the children instead of the parents.<br>
We make contained types descendants instead of ancestors.<br>
We allow multiple children at each offset and for scalar type<br>
nodes.x`<br>
<br>
We could do that without changing to children, but honestly, i<br>
strongly<br>
believe nobody who ever looks at this really understands it<br>
right now.<br>
(If someone really does like the ancestor form, i'd love to<br>
understand<br>
why :P)<br>
<br>
So if we are going to change it, we should change it to<br>
something that<br>
is easier to understand.<br>
<br>
something like:<br>
<br>
scalar type node -> {name, children nodes}<br>
struct node -> {name, array of {offset, child node} }<br>
<br>
Paths are separate from the tbaa tree itself, and are just:<br>
path node -> {struct node, offset} or whatever.<br>
<br>
unions are just scalar type nodes with multiple children, not struct<br>
nodes with special-cased offset zero.<br>
<br>
This also has a well-defined upgrade path.<br>
<br>
For "old-style" DAGs, like exist now, we know we have to regen the<br>
metadata, and that it is wrong (So we could, for example, simply<br>
disable<br>
it for correctness reasons)<br>
.<br>
Anything with a "new-style" DAG, we know is usable.<br>
<br>
In this representation, the most generic tbaa node is just the<br>
one that<br>
contains the other.<br>
If neither contains the other, you can just make one that has<br>
both as<br>
children and use that.<br>
(instead of now, where you'd have to have multiple parents again).<br>
<br>
(The above also may be better for other languages)<br>
<br>
--Dan<br>
<br>
<br>
<br>
<br>
On Tue, Feb 28, 2017 at 4:44 PM, Steven Perron<br>
<<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a> <mailto:<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a>><br></div></div><span class="">
<mailto:<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a> <mailto:<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a>>>> wrote:<br>
<br>
Seems like the comments have stopped. I'll try to get a patch<br>
together. Then we can continue the discussion from there.<br>
<br>
Hubert, as for the issue with the llvm optimizations losing<br>
the TBAA<br>
information, it should be the responsibility to make sure the<br>
aliasing is changed in the correct way. One function related to<br>
this has already been mentioned: getMostGenericTBAA.<br>
<br>
Later,<br>
Steven Perron<br>
<br>
<br>
<br>
From: Hubert Tong <<a href="mailto:hubert.reinterpretcast@gmail.com" target="_blank">hubert.reinterpretcast@gmail.<wbr>com</a><br>
<mailto:<a href="mailto:hubert.reinterpretcast@gmail.com" target="_blank">hubert.reinterpretcast<wbr>@gmail.com</a>><br></span>
<mailto:<a href="mailto:hubert.reinterpretcast@gmail.com" target="_blank">hubert.reinterpretcast<wbr>@gmail.com</a><span class=""><br>
<mailto:<a href="mailto:hubert.reinterpretcast@gmail.com" target="_blank">hubert.reinterpretcast<wbr>@gmail.com</a>>>><br>
To: Steven Perron/Toronto/IBM@IBMCA<br>
Cc: Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a><br>
<mailto:<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br></span>
<mailto:<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a> <mailto:<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>>>><wbr>,<br>
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a>><br>
<mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a><span class=""><br>
<mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a>>>>, Sanjoy Das<br>
<<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.co<wbr>m</a><br>
<mailto:<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpoin<wbr>ters.com</a>><br></span>
<mailto:<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpoin<wbr>ters.com</a><div><div class="h5"><br>
<mailto:<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpoin<wbr>ters.com</a>>>><br>
Date: 2017/02/15 07:44 AM<br>
Subject: Re: [llvm-dev] RFC: Representing unions in TBAA<br>
<br>
------------------------------<wbr>------------------------------<wbr>------------<br>
<br>
<br>
<br>
On Tue, Feb 14, 2017 at 11:22 PM, Steven Perron<br>
<_perrons@ca.ibm.com_ <mailto:<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a><br>
<mailto:<a href="mailto:perrons@ca.ibm.com" target="_blank">perrons@ca.ibm.com</a>>>> wrote:<br>
3) How should we handle a reference directly through a<br>
union, and a<br>
reference that is not through the union?<br>
<br>
My solution was to look for each member of the union<br>
overlaps the<br>
given offset, and see if any of those members aliased the other<br>
reference. If no member aliases the other reference, then the<br>
answer is no alias. Otherwise the answer is may alias. The run<br>
time for this would be proportional to "distance to the root" *<br>
"number of overlapping members". This could be slow if<br>
there are<br>
unions with many members or many unions of unions.<br>
<br>
Another option is to say that they do not alias. This would<br>
mean<br>
that all references to unions must be explicitly through the<br>
union.<br>
From what I gather from the thread so far, the access<br>
through the<br>
union may be lost because of LLVM transformations. I am not sure<br>
how, in the face of that, TBAA could indicate NoAlias safely<br>
(without the risk of functional-correctness issues in correct<br>
programs) between types which overlap within any union<br>
(within some<br>
portion of the program).<br>
<br>
As for the standards, it is definitely not true that all<br>
references<br>
to unions must be explicitly through the union. However, if<br>
you are<br>
trying to perform union-based type punning (under C11), then it<br>
appears that it is intended that the read must be through<br>
the union.<br>
<br>
This would be the least restrictive aliasing allowing the most<br>
optimization. The implementation would be simple. I believe we<br>
make the parent of the TBAA node for the union to be "omnipotent<br>
char". This might be similar to treating the union TBAA<br>
node more<br>
like a scalar node instead of a struct-path. Then the<br>
traversal of<br>
the TBAA nodes will be quick. I'll have to work this out a bit<br>
more, but, if this is good enough to meet the requirements<br>
of the<br>
standard, I can try to think this through a little more.<br>
I'll need<br>
Hubert and Daniel to comment on that since I am no expert on<br>
the C<br>
and C++ standards.<br>
<br>
The third option is to be pessimistic and say "may alias"<br>
all of the<br>
time (conservatively correct), and rely on other alias<br>
analysis to<br>
improve it. This will have good compile time, but could hinder<br>
optimization. Personally I do not like this option. Most<br>
of the<br>
time it will not have a negative effect, but there will be a<br>
reasonable number of programs where this will hurt<br>
optimization more<br>
that it needs to.<br>
<br>
<br>
<br>
<br>
<br>
<br>
______________________________<wbr>_________________<br>
LLVM Developers mailing list<br></div></div>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a>><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><span class=""><br>
<<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin<wbr>/mailman/listinfo/llvm-dev</a>><br>
<br>
<br>
--<br>
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,<br>
hosted by The Linux Foundation<br>
<br>
______________________________<wbr>_________________<br>
LLVM Developers mailing list<br></span>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.or<wbr>g</a>><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
<<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin<wbr>/mailman/listinfo/llvm-dev</a>><br>
<br>
<br>
</blockquote><div class="HOEnZb"><div class="h5">
<br>
-- <br>
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation<br>
</div></div></blockquote></div><br></div>