[llvm-dev] Aliasing rules difference between GCC and Clang

Jonas Paulsson via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 20 08:55:38 PDT 2018


Hi,

I found a difference between Clang and GCC in alias handling. It showed 
up in a benchmark where Clang was considerably slower: in a hot 
function, Clang emits many more loads from the same addresses because 
of stores between the uses. In other words, a value is loaded and used, 
another value is stored, and then the first value is loaded again 
before its second use. This happens many times, with three loads 
instead of one for each value. GCC only emits one load.

The values are read through the pointer arguments of this function:

void su3_projector( su3_vector *a, su3_vector *b, su3_matrix *c ){
     register int i,j;
     register double tmp,tmp2;
     for(i=0;i<3;i++)
         for(j=0;j<3;j++){
             tmp2 = a->c[i].real * b->c[j].real;
             tmp = a->c[i].imag * b->c[j].imag;
             c->e[i][j].real = tmp + tmp2;
             tmp2 = a->c[i].real * b->c[j].imag;
             tmp = a->c[i].imag * b->c[j].real;
             c->e[i][j].imag = tmp - tmp2;
         }
}

The types are:
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;

So the question is whether the su3_vector and su3_matrix pointers may 
alias. If they may alias, then Clang is right to reload after each 
store. If the standard says they cannot alias, then GCC is right to 
load each value only once.

It seems to me that either GCC is too aggressive or LLVM is too 
conservative, but I don't know which one it is. As far as I 
understand, the arguments have different struct types, which would 
mean they cannot alias. On the other hand, there is the question of 
whether su3_vector is included in su3_matrix, which would mean they 
may alias.
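
To make the single-load behaviour explicit at the source level, 
independent of what either compiler concludes about aliasing, one option 
is to copy each operand into a local before any store to *c. The 
following is only a sketch and not code from the benchmark: the name 
su3_projector_hoisted is hypothetical, and the definition of complex as 
a pair of doubles is an assumption, since that type is not shown here. 
It gives the same result as su3_projector only when c does not overlap 
a or b.

```c
#include <assert.h>

/* Assumed layout: the definition of `complex` is not shown in the post. */
typedef struct { double real, imag; } complex;
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;

/* Hypothetical variant of su3_projector: each operand is copied into a
 * local before any store to *c, so the source itself asks for one load
 * per value. Equivalent to the original only if c overlaps neither a
 * nor b. */
void su3_projector_hoisted(const su3_vector *a, const su3_vector *b,
                           su3_matrix *c)
{
    assert(a && b && c);
    for (int i = 0; i < 3; i++) {
        const complex ai = a->c[i];          /* read a->c[i] once */
        for (int j = 0; j < 3; j++) {
            const complex bj = b->c[j];      /* read b->c[j] once */
            c->e[i][j].real = ai.imag * bj.imag + ai.real * bj.real;
            c->e[i][j].imag = ai.imag * bj.real - ai.real * bj.imag;
        }
    }
}
```

If the pointers really could overlap, this variant and the original 
would be allowed to produce different results, which is exactly why a 
compiler that cannot rule out aliasing has to reload.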

I made a reduced test case where the same difference seems to be 
present. It has just one struct type, which contains a matrix of 
doubles. A store to an element of the struct via a pointer is 
surrounded by two loads of a global double variable. Clang emits both 
loads, while GCC emits only one.

typedef struct {
   double c[3][3];
} STRUCT_TY;

double e = 0.0;
STRUCT_TY *f;
int g = 0;
void h() {
   int i = e;
   f->c[0][i] = g;
   g = e;
}

clang -O3 -march=z13:

h:                                      # @h
# %bb.0:                                # %entry
         larl    %r1, e
         ld      %f0, 0(%r1)        // LOAD E
         lrl     %r2, g
         cfdbr   %r0, 5, %f0        // CONVERT E
         lgfr    %r0, %r0           // EXTEND E
         cdfbr   %f0, %r2
         lgrl    %r2, f
         sllg    %r3, %r0, 3
         std     %f0, 0(%r3,%r2)    // STORE F ELEMENT
         ld      %f0, 0(%r1)        // 2nd LOAD E <<<<<<<
         cfdbr   %r0, 5, %f0        // CONVERT
         strl    %r0, g             // 2nd USE
         br      %r14

gcc -O3 -march=z13:

h:
.LFB0:
         .cfi_startproc
         larl    %r1,e
         ld      %f0,0(%r1)        // LOAD E
         lrl     %r2,g
         lgrl    %r3,f
         cfdbr   %r1,5,%f0         // CONVERT E
         cdfbr   %f0,%r2
         lgfr    %r2,%r1           // EXTEND E
         sllg    %r2,%r2,3
         std     %f0,0(%r2,%r3)    // STORE F ELEMENT
         strl    %r1,g             // 2nd USE
         br      %r14
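
For comparison, the single-load form can also be written out by hand in 
the reduced test case. This is only a sketch under the assumption that 
the store through f never modifies e; the function name h_single_load 
and the local e_val are hypothetical and not part of the test case. The 
bounds check on i is likewise an addition for the sketch.

```c
#include <assert.h>

typedef struct {
    double c[3][3];
} STRUCT_TY;

double e = 0.0;
STRUCT_TY *f;
int g = 0;

/* Hypothetical variant of h(): e is read once into a local, so the
 * store through f cannot force a second load. This matches h() only
 * if f->c[0][i] never refers to the same storage as e. */
void h_single_load(void)
{
    assert(f != 0);
    const double e_val = e;     /* the only load of e */
    int i = (int)e_val;
    assert(i >= 0 && i < 3);    /* keep the index inside one row */
    f->c[0][i] = g;             /* store to the struct element */
    g = (int)e_val;             /* second use, no reload */
}
```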

I hope somebody with enough experience and knowledge can guide the way 
here as this seems to be quite important.

/Jonas


