<div dir="ltr"><div><div><div><div>Hi Daniel,<br><br>Sorry for taking so long to
respond. I spoke with a colleague more familiar with llvm who thought
he could clear up my confusion, but we both came out of the conversation
confused. I will try my best to explain the ambiguity.<br>
<br>In an DAG, alias queries would be completely unambiguous.
Every instruction would only be executed once, and every SSA value
really would have a single static assignment. Any instruction would
have a single value per execution, and when you ask alias(x,y) it would
be a question of whether those single memory addresses overlapped.<br>
<br>However, once backedges are introduced, and instruction can be
executed multiple times. Every time the execution passes over a
backedge, any instructions that are executed again, may have completely
different values. Now, when you ask alias(x,y), it's ambiguous which
pairs of addresses that query considers. <br>
<br>In some cases, it seems clear. For example, in the loops we have
been talking about, it seems like you are saying that it only compares
the values in the same iteration, but does not say anything about
aliasing between iterations. These are perfectly reasonable semantics
for loops, but it does not define what the alias queries mean in general. For
example, consider the following CFG:<br><img src="cid:ii_hz3dhld90_147f60b11c33257c" height="563" width="432"><br></div><div><br></div>If we ask a query alias(x,y) about instructions in basic blocks B and C respectively, what will those answers mean? There is always a backedge between their executions so are they considered to be different iterations and therefore they never alias?<br>
<br></div><div>I get that there may be some way of understanding the alias queries in which "dynamic instructions" do not matter to the answer, but whatever that way is, it must also be possible to use that way to explain things in terms of the "dynamic instructions" that the CPU ends up executing. And at least for me that would resolve any confusion I have about the AliasAnalysis.<br>
<br></div><div>Thank you again for your time,<br></div><div>Jeremy<br>
</div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 14, 2014 at 10:46 AM, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Let me take your previous example in an attempt to hopefully explain why "dynamic instructions" do not matter to the answer:<br>
You asked about:<div class=""><br><div><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">> std::vector<int> A(100);</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">
<span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> int* x,y;</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> x=&A[0];</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">
<span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>></span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> for(int i=0; i<100; i++) {</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">
<span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> y=&A[i];</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> *y=*x;</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">
<span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> x=&A[i+1];</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">>> }</span><br style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">
<span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px">></span><br></div><div><br></div></div><div>This is not what it looks like in LLVM.</div><div><br></div><div>In LLVM, it looks like this:<br>
</div>
<div><br></div><div><div class=""><blockquote class="gmail_quote" style="font-family:arial,sans-serif;font-size:13px;margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div><div>std::vector<int> A(100); <br></div><div>int* x,y;</div></div></blockquote><blockquote class="gmail_quote" style="font-family:arial,sans-serif;font-size:13px;margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div><div><span style="font-family:arial;font-size:small;color:rgb(34,34,34)"> </span></div></div></blockquote></div><blockquote class="gmail_quote" style="font-family:arial,sans-serif;font-size:13px;margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div><div></div></div><div>x_1=GEP A, 0, 0</div></blockquote><div class=""><blockquote class="gmail_quote" style="font-family:arial,sans-serif;font-size:13px;margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div><span style="color:rgb(80,0,80)">for(int i=0; i<100; i++) {</span></div></blockquote></div><div> i_2 = phi (0, i_3) </div><div> x_2 = phi(x_1, x_3)</div><div> y_1 = GEP A, 0, i_2</div><div> temp = load x_2</div>
<div> store y_1, temp</div><div> temp2 = add i_2, 1</div><div> x_3 = GEP A, 0, temp2</div><div> i_3 = add i_2, 1</div><div>}</div></div><div><br></div><div>As you can see, every time you redefine the value of the pointer x to a new value, it gets a new name.</div>
<div><br></div><div>alias(x_1, y_1) = mayalias</div><div>alias(x_2, y_1) = mustalias</div><div>alias(x_3, y_1) = mayalias</div><div><br></div><div><br></div><div>The only interesting question, IMHO, is whether alias(x_3,y_1) is MayAlias or NoAlias. </div>
<div>x_2 clearly has the same value as y_1 at all times throughout the program.</div><div><br></div><div> </div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 14, 2014 at 6:54 AM, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>On Thu, Aug 14, 2014 at 6:37 AM, Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>> wrote:<br>
> On Wed, Aug 13, 2014 at 8:35 PM, Jeremy Salwen <<a href="mailto:jeremysalwen@gmail.com" target="_blank">jeremysalwen@gmail.com</a>> wrote:<br>
>> Hey Daniel,<br>
>><br>
>> Thanks again for the help. I'm still a bit confused about the interface to<br>
>> the alias analysis. It seems like we are talking about different<br>
>> interfaces.<br>
> Sorry, yes, i was thinking of a different interface.<br>
><br>
>> Has it changed from what the documentation says? As far as I<br>
>> can tell, the documentation takes a specific Value*, and no information<br>
>> about which dynamic execution it is talking about.<br>
><br>
> The actual standard low level interface is:<br>
><br>
> virtual AliasResult alias(const Location &LocA, const Location &LocB);<br>
> (There is a value wrapper for this)<br>
><br>
> I think part of the confusion here is that you keep redefining<br>
> pointers and expecting the results of queries to change.<br>
><br>
> Outside of globals, LLVM is in SSA form. Thus, these are queries will<br>
> end up being about specific pointers that will only be defined once.<br>
> In all of your examples, you have asked about local variables. These<br>
> local variables will be put in SSA. Every time you redefine the<br>
> pointer value, it will generate a new definition that is separate from<br>
> the old ones.<br>
> If you transform your examples above into globals, you may get<br>
> different answers.<br>
><br>
><br>
><br>
>><br>
>> When you say "Right. It's a query about specific loads and stores." it<br>
>> sounds like you are saying that it does consider dynamic executions, and<br>
>> that you can make queries about specific dynamic executions. I can't seem<br>
>> to find this option anywhere in the API..<br>
> Yes, sorry, I was thinking of the MemoryDependence interface, which<br>
> does give you that information.<br>
> AliasAnalysis does not.<br>
><br>
>><br>
>> Perhaps someone else could clear the confusion, because I feel like there is<br>
>> a gap of communication.<br>
><br>
><br>
>><br>
>> For example, in this code what would the result be for the first store and<br>
>> the first load?<br>
><br>
>><br>
>>> #include <cstdio><br>
>>> #include <cstdlib><br>
>>> int main(int c, char** argv) {<br>
>>> int* A=(int*)malloc(100*sizeof(int));<br>
>>><br>
>>> if(c>1) {<br>
>>> for(int i=0; i<5; i++) {<br>
>>> A[i]=5;<br>
>>> }<br>
>>> } else {<br>
>>> A[1]=5;<br>
>>> }<br>
>>> printf("%d\n",A[4]);<br>
>>> free(A);<br>
>>> }<br>
>><br>
>><br>
><br>
>> MustAlias? Because whenever they are executed, they do alias. Or MayAlias?<br>
> Since you seem to be having trouble going all the way down the rabbit<br>
> hole here, let me take a different approach.<br>
><br>
> In this example, A[i] is not an LLVM location or value *.<br>
> You will never see A[i] in the LLVM IR to query about.<br>
> You will see two pointers as this:<br>
> %i.0 = phi i32 [ 0, %4 ], [ %11, %10 ]<br>
> %8 = sext i32 %i.0 to i64<br>
> %9 = getelementptr inbounds i32* %2, i64 %8<br>
> store i32 5, i32* %9, align 4<br>
><br>
> and<br>
> %16 = getelementptr inbounds i32* %2, i64 4<br>
> %17 = load i32* %16, align 4<br>
><br>
> The query will be alias(%9, %16)<br>
><br>
<br>
<br>
> In this case, it cannot prove it is MustAlias, but it is actually MustAlias.<br>
> It will, however, return MayAlias because it is not powerful enough to prove it.<br>
<br>
</div></div>Sigh, too early in the morning, pasted the right output, wrote the<br>
answer while looking at the wrong pass output.<br>
These two pointers will always be MayAlias.<br>
LLVM later actually deletes the loop, and transforms the above into a<br>
memset + a load from memset, and ends up with two mustalias pointers<br>
in that case.<br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>