[llvm-dev] Aliasing rules difference between GCC and Clang
Jonas Paulsson via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 20 08:55:38 PDT 2018
Hi,
I found a difference between Clang and GCC in alias handling. It showed
up in a benchmark where Clang's code was considerably slower: a hot
function performs many more loads from the same addresses because of
stores between the uses. In other words, a value is loaded and used,
another value is stored, and then the first value is loaded again before
its second use. This happens many times, with three loads instead of one
for each value. GCC emits only one load.
The values are loaded through the pointer arguments of this function:
void su3_projector( su3_vector *a, su3_vector *b, su3_matrix *c ){
    register int i,j;
    register double tmp,tmp2;
    for(i=0;i<3;i++)for(j=0;j<3;j++){
        tmp2 = a->c[i].real * b->c[j].real;
        tmp = a->c[i].imag * b->c[j].imag;
        c->e[i][j].real = tmp + tmp2;
        tmp2 = a->c[i].real * b->c[j].imag;
        tmp = a->c[i].imag * b->c[j].real;
        c->e[i][j].imag = tmp - tmp2;
    }
}
The types are:
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;
So the question is whether the su3_vector and su3_matrix pointers may
alias. If they may, Clang is right to reload after each store. If the
standard says they cannot, GCC is right to load each value only once.
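To make "loading only once" concrete at the source level, here is a minimal sketch with the loads hoisted into locals by hand. The name su3_projector_hoisted is hypothetical, and the definition of complex as two doubles is an assumption, since the benchmark's own definition is not shown in this mail. Note that this version behaves differently from the original if c really overlaps a or b, which is exactly what a compiler has to rule out before making the same transformation itself.

```c
/* Assumption: `complex` is a plain struct of two doubles. */
typedef struct { double real, imag; } complex;
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;

/* Hypothetical variant of su3_projector: every input value is read
   exactly once per iteration, before any store through c. */
void su3_projector_hoisted(const su3_vector *a, const su3_vector *b,
                           su3_matrix *c)
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            const double ar = a->c[i].real, ai = a->c[i].imag;
            const double br = b->c[j].real, bi = b->c[j].imag;
            /* Same arithmetic as the original: tmp + tmp2, tmp - tmp2. */
            c->e[i][j].real = ai * bi + ar * br;
            c->e[i][j].imag = ai * br - ar * bi;
        }
    }
}
```

With this form the number of loads no longer depends on what the alias analysis can prove, at the cost of changing the result in the overlapping case.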
It seems to me that either GCC is too aggressive or LLVM is too
conservative, but I don't know which. As far as I understand, two things
are in play: the arguments have different struct types (which would mean
they cannot alias), but there is also the question of whether su3_vector
is contained in su3_matrix (which would mean they may alias).
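As a point of comparison, here is a small sketch of the uncontroversial "contained in" case, where a reload really is required: a pointer to an element type that is a member of the struct being stored to. The helper read_store_read is hypothetical and complex is again assumed to be two doubles. Whether the su3_vector / su3_matrix pair falls under the same reasoning is the open question above; this only shows what "may alias" means in practice.

```c
/* Assumption: `complex` is a plain struct of two doubles. */
typedef struct { double real, imag; } complex;
typedef struct { complex e[3][3]; } su3_matrix;

/* Hypothetical helper: load, store through m, load again.
   x may legally point at an element of *m, so the second load
   cannot be replaced by the first one. */
double read_store_read(const complex *x, su3_matrix *m)
{
    double first = x->real;           /* first load */
    m->e[0][0].real = first + 1.0;    /* store that may modify *x */
    double second = x->real;          /* reload after the store */
    return second - first;
}
```

Called with x == &m->e[0][0] the function returns 1.0, and with x pointing at a separate complex object it returns 0.0, so a compiler that reused the first load here would be wrong.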
I made a reduced test case in which the same difference seems to be
present. It has just one struct type, which contains a matrix of
doubles. A store to an element of the struct via a pointer is
surrounded by two loads of a global double variable. Only Clang emits
two loads.
typedef struct {
    double c[3][3];
} STRUCT_TY;
double e = 0.0;
STRUCT_TY *f;
int g = 0;
void h() {
    int i = e;
    f->c[0][i] = g;
    g = e;
}
clang -O3 -march=z13:
h: # @h
# %bb.0: # %entry
larl %r1, e
ld %f0, 0(%r1) // LOAD E
lrl %r2, g
cfdbr %r0, 5, %f0 // CONVERT E
lgfr %r0, %r0 // EXTEND E
cdfbr %f0, %r2
lgrl %r2, f
sllg %r3, %r0, 3
std %f0, 0(%r3,%r2) // STORE F ELEMENT
ld %f0, 0(%r1) // 2nd LOAD E <<<<<<<
cfdbr %r0, 5, %f0 // CONVERT
strl %r0, g // 2nd USE
br %r14
gcc -O3 -march=z13:
h:
.LFB0:
.cfi_startproc
larl %r1,e
ld %f0,0(%r1) // LOAD E
lrl %r2,g
lgrl %r3,f
cfdbr %r1,5,%f0 // CONVERT E
cdfbr %f0,%r2
lgfr %r2,%r1 // EXTEND E
sllg %r2,%r2,3
std %f0,0(%r2,%r3) // STORE F ELEMENT
strl %r1,g // 2nd USE
br %r14
I hope somebody with enough experience and knowledge can guide the way
here as this seems to be quite important.
/Jonas