<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Ping.<br>
<br>
(Adding test cases; I forgot to attach them to my previous
mail.)<br>
<div class="moz-forward-container">Thanks<br>
Shuxin<br>
<br>
<br>
-------- Original Message --------
<table class="moz-email-headers-table" border="0" cellpadding="0"
cellspacing="0">
<tbody>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Subject:
</th>
<td>[Patch] Use address-taken to disambiguate global var and
indirect accesses</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">Date: </th>
<td>Tue, 15 Oct 2013 14:39:51 -0700</td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">From: </th>
<td>Shuxin Yang <a class="moz-txt-link-rfc2396E" href="mailto:shuxin.llvm@gmail.com">&lt;shuxin.llvm@gmail.com&gt;</a></td>
</tr>
<tr>
<th align="RIGHT" nowrap="nowrap" valign="BASELINE">To: </th>
<td>Commit Messages and Patches for LLVM
<a class="moz-txt-link-rfc2396E" href="mailto:llvm-commits@cs.uiuc.edu">&lt;llvm-commits@cs.uiuc.edu&gt;</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>Hi,

The attached patch takes advantage of address-taken information to
disambiguate global variable accesses from indirect memory accesses.

The motivation
===========
I was asked to investigate a case where the load of a static variable
is not hoisted out of a loop as loop-invariant:
---------------
static int xyz;
void foo(int *p) {
    for (int i = 0; i < xyz; i++)
        *p++ = ....
}
-----------------
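
To make the missed optimization concrete, here is a minimal sketch of the
loop as written next to a hand-hoisted version. The stored value `i` is an
assumption standing in for the elided right-hand side above, and
foo_hoisted() and set_xyz() are hypothetical names that are not part of
the patch.

```c
/* Illustrative only: the stored value `i` stands in for the elided
   right-hand side of the original example. */
static int xyz;

/* As written: without alias information the compiler must assume the
   store through p may change xyz, so xyz is reloaded every iteration. */
void foo(int *p) {
    for (int i = 0; i < xyz; i++)
        *p++ = i;
}

/* What hoisting the load amounts to: read xyz once, before the loop. */
void foo_hoisted(int *p) {
    int n = xyz;
    for (int i = 0; i < n; i++)
        *p++ = i;
}

/* Hypothetical helper so callers can set the file-local bound. */
void set_xyz(int v) { xyz = v; }
```

The two versions behave the same only because no pointer can refer to
xyz, which is exactly the fact the compiler needs to prove.
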
The compiler does have a concept called "addr-capture". However, I
don't think it can be used to disambiguate global variables from
indirect accesses: the fact that a variable does not have its address
*CAPTURED* does not necessarily mean that the variable cannot be
accessed indirectly.
So I rely on "address taken" instead.
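
A small sketch of why "not captured" is weaker than "not address-taken",
assuming the usual escape-analysis meaning of capture (the callee keeps a
copy of the pointer). bump() and demo() are hypothetical names:

```c
static int xyz;

/* The callee only dereferences q; it does not store q anywhere, so the
   pointer is not captured in the escape-analysis sense. */
static void bump(int *q) {
    *q += 1;
}

int demo(void) {
    xyz = 41;
    bump(&xyz);   /* address of xyz is taken here, yet never captured */
    return xyz;   /* 42: xyz was modified through an indirect access */
}
```
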
How it works
========
1. In globalopt, when a global var is identified as not-addr-taken,
   cache the result in GlobalVariable::notAddrTaken.
2. In the alias analyzer, suppose the memory operations involved are m1
   and m2. Let o1 and o2 be the "objects" (obtained via
   get_underlying_object()) of m1 and m2 respectively.
   If o1 != o2 and one of them is a global variable whose address is
   not taken, then m1 and m2 are disjoint accesses.
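
The rule in step 2 can be sketched as a toy model. This is not LLVM's
alias-analysis interface; struct object, its fields, and disambiguate()
are hypothetical names chosen for illustration:

```c
/* Toy model, not LLVM's API: all names here are hypothetical. */
enum alias_result { MAY_ALIAS, NO_ALIAS };

struct object {
    int is_global;        /* underlying object is a global variable */
    int not_addr_taken;   /* cached by an earlier analysis (step 1) */
};

/* o1 and o2 are the underlying objects of the two memory operations. */
enum alias_result disambiguate(const struct object *o1,
                               const struct object *o2) {
    if (o1 == o2)
        return MAY_ALIAS;   /* same object: this rule says nothing */
    if ((o1->is_global && o1->not_addr_taken) ||
        (o2->is_global && o2->not_addr_taken))
        return NO_ALIAS;    /* nothing can reach that global indirectly */
    return MAY_ALIAS;       /* fall back to the conservative answer */
}
```

In the motivating loop, o1 would be xyz and o2 the object behind p, so
the load of xyz and the store through p would come out as NO_ALIAS.
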
Misc:
=========
Note that I *cache* the result of not-addr-taken. Unlike for a
function, it is far more expensive to figure out whether a globalvar
has its address taken, so it is not appropriate to analyze
address-taken on the fly.

On the other hand, I personally think the not-addr-taken flag is almost
maintenance free. (FIXME) Only a few optimizations could make a
not-addr-taken variable become addr-taken (say, outlining), and I don't
think we currently have such passes (FIXME again!). If such a rare case
does take place, it is up to the pass to reset the not-addr-taken flag.

Of course, a variable previously considered addr-taken may later be
proved to be not-addr-taken. In that case the compiler does not have to
update the flag; leaving it unset is conservatively correct.
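
The maintenance contract described above might look roughly like this.
The struct and both functions are hypothetical and only model the flag;
they are not the patch's code:

```c
struct global_var {
    int not_addr_taken;   /* 1 only when proven; 0 means unknown or taken */
};

/* To be called by any (hypothetical) pass that introduces a use of the
   variable's address, e.g. outlining. */
void note_address_taken(struct global_var *gv) {
    gv->not_addr_taken = 0;
}

/* Clients may only rely on the flag when it is set. */
int can_assume_no_indirect_access(const struct global_var *gv) {
    return gv->not_addr_taken;
}
```

A stale 0 only costs a missed optimization, while a stale 1 would be a
correctness bug, which is why only the first direction needs updating.
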
Performance impact
=============
Measured on an oldish Mac Tower with 2x 2.26GHz Quad-Core Xeon. Both
the baseline and the change were measured a couple of times. I did take
a look at why Olden/power is sped up: the loads of static variables "P"
and "Q" are promoted in many places. I have not yet had a chance to
investigate why the speedup of pairlocalalign at O3 disappears at
O3+LTO.

o. test-suite w/ O3:
-------------------
Benchmark                          baseline   patched   change (%)
Benchmarks/Olden/power/power         1.6129    1.354    -16.0518321036642
Benchmarks/mafft/pairlocalalign     31.4211   26.5794   -15.4090722476298
Benchmarks/Ptrdist/yacr2/yacr2       0.881     0.804     -8.74006810442678

o. test-suite w/ O3 + LTO
-------------------------
Benchmark                          baseline   patched   change (%)
Benchmarks/Olden/power/power         1.6143    1.3419   -16.8741869540978
Applications/spiff/spiff             2.9203    2.849     -2.44152997979659

o. spec2kint w/ O3+LTO
----------------------
Benchmark                          baseline   patched   change (%)
bzip2                               75.02     73.92      -1.4
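
The last number in each test-suite row is consistent with the relative
change of the second number against the first, in percent. A sketch of
that arithmetic, with percent_change() as a hypothetical helper:

```c
/* Relative change of the patched time against the baseline, in percent.
   Negative values mean the patched build ran faster. */
double percent_change(double baseline, double patched) {
    return (patched - baseline) / baseline * 100.0;
}
```

For Olden/power at O3, (1.354 - 1.6129) / 1.6129 * 100 is roughly -16.05,
matching the row above.
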
Thanks
Shuxin
</pre>
<br>
</div>
<br>
</body>
</html>