<div dir="ltr">Oh indeed, sorry: you are computing an outer product of a and b and storing the result in c, and these cannot alias (or at least they should not; if you do make them alias, it's your responsibility, and I think you can get UB).<div>So Clang should indeed do better.</div><div><br></div><div>Regards,<br><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Sep 21, 2018 at 08:21, Jonas Paulsson <<a href="mailto:paulsson@linux.vnet.ibm.com">paulsson@linux.vnet.ibm.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
I would say that GCC is wrong and should also have a version
where a could be equal to b. There is no restrict keyword, so
they could be equal.
<div><br>
</div>
</div>
</blockquote>
This was between a/b and c, not between a and b. Could you explain
your opinion in a bit more detail, please?<br>
<br>
/Jonas<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>Cheers,</div>
<div><br>
</div>
<div>Matthieu</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Thu, Sep 20, 2018 at 16:56, Jonas Paulsson via
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hi,</p>
<p>I found a difference between Clang and GCC in alias
handling. This was with a benchmark where Clang was
considerably slower, in a hot function where Clang emits many
more loads from the same address because of stores between the
uses. In other words, a value is loaded and used, another
value is stored, and then the first value is loaded
again before its second use. This happens many times, with
three loads instead of one for each value. GCC only emits
one load.</p>
<p>The values are the arguments to this function:<br>
</p>
<tt>void su3_projector( su3_vector *a, su3_vector *b,
su3_matrix *c ){</tt><br>
<tt>register int i,j;</tt><br>
<tt>register double tmp,tmp2;</tt><br>
<tt> for(i=0;i<3;i++)for(j=0;j<3;j++){</tt><br>
<tt> tmp2 = a->c[i].real * b->c[j].real;</tt><br>
<tt> tmp = a->c[i].imag * b->c[j].imag;</tt><br>
<tt> c->e[i][j].real = tmp + tmp2;</tt><br>
<tt> tmp2 = a->c[i].real * b->c[j].imag;</tt><br>
<tt> tmp = a->c[i].imag * b->c[j].real;</tt><br>
<tt> c->e[i][j].imag = tmp - tmp2;</tt><br>
<tt> }</tt><br>
<tt>}</tt><br>
<tt><br>
The types are:</tt><br>
<tt>typedef struct { complex e[3][3]; } su3_matrix;</tt><br>
<tt>typedef struct { complex c[3]; } su3_vector;<br>
</tt><br>
So the question here is whether the su3_vector and su3_matrix
pointers may alias. If they may alias, then Clang is right
to reload after each store. If the standard says they
cannot alias, then GCC is right to load each value only
once.<br>
<br>
It seems to me that either GCC is too aggressive or LLVM is
too conservative, but I don't know which it is. As far
as I understand, the arguments have different struct
types (which would mean they cannot alias), but
there is also the question of whether su3_vector is included
in su3_matrix (which would mean they may alias).<br>
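One way to sidestep the question in the source, whatever either compiler infers from the types, might be to state the no-alias assumption explicitly with <tt>restrict</tt>. The sketch below is only an illustration: the name <tt>su3_projector_restrict</tt> is hypothetical, and the definition of <tt>complex</tt> is an assumption (a plain pair of doubles), since it is not shown in this message. Because a and b are only read, they may still point at the same vector.<br>

```c
/* Assumption: `complex` is not defined in the message; a plain pair of
   doubles is assumed here for illustration. */
typedef struct { double real, imag; } complex;
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;

/* Hypothetical variant of su3_projector. `restrict` on c is a promise by the
   caller that nothing written through c is also read through a or b, so the
   compiler does not have to reload a->c[i] or b->c[j] after each store.
   a and b are never written, so passing the same vector for both is fine. */
void su3_projector_restrict(const su3_vector *restrict a,
                            const su3_vector *restrict b,
                            su3_matrix *restrict c)
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double tmp2 = a->c[i].real * b->c[j].real;
            double tmp  = a->c[i].imag * b->c[j].imag;
            c->e[i][j].real = tmp + tmp2;
            tmp2 = a->c[i].real * b->c[j].imag;
            tmp  = a->c[i].imag * b->c[j].real;
            c->e[i][j].imag = tmp - tmp2;
        }
    }
}
```

If a caller breaks the promise and lets c overlap a or b, the behaviour is undefined, so this only helps where the no-overlap guarantee really holds.<br>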
<br>
I made a reduced test case, where the same difference seems
to be present. It has just one struct type, which contains a
matrix of doubles. A store to an element of the struct via
a pointer sits between two loads of a global double
variable. Only Clang emits two loads.<br>
<br>
<tt>typedef struct {</tt><tt><br>
</tt><tt> double c[3][3];</tt><tt><br>
</tt><tt>} STRUCT_TY;</tt><tt><br>
</tt><tt><br>
</tt><tt>double e = 0.0;</tt><tt><br>
</tt><tt>STRUCT_TY *f;</tt><tt><br>
</tt><tt>int g = 0;</tt><tt><br>
</tt><tt>void h() {</tt><tt><br>
</tt><tt> int i = e;</tt><tt><br>
</tt><tt> f->c[0][i] = g;</tt><tt><br>
</tt><tt> g = e;</tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt><tt><br>
</tt><tt>clang -O3 -march=z13:</tt><tt><br>
</tt><tt><br>
</tt><tt>h: # @h </tt><tt><br>
</tt><tt># %bb.0: # %entry </tt><tt><br>
</tt><tt> larl %r1, e </tt><tt><br>
</tt><tt> ld %f0, 0(%r1) // LOAD E</tt><tt><br>
</tt><tt> lrl %r2, g </tt><tt><br>
</tt><tt> cfdbr %r0, 5, %f0 // CONVERT E </tt><tt><br>
</tt><tt> lgfr %r0, %r0 // EXTEND E </tt><tt><br>
</tt><tt> cdfbr %f0, %r2 </tt><tt><br>
</tt><tt> lgrl %r2, f </tt><tt><br>
</tt><tt> sllg %r3, %r0, 3 </tt><tt><br>
</tt><tt> std %f0, 0(%r3,%r2) // STORE F ELEMENT</tt><tt><br>
</tt><tt> ld %f0, 0(%r1) // 2nd LOAD E &lt;&lt;&lt;&lt;&lt;&lt;&lt;<br>
</tt><tt> cfdbr %r0, 5, %f0 // CONVERT </tt><tt><br>
</tt><tt> strl %r0, g // 2nd USE </tt><tt><br>
</tt><tt> br %r14 </tt><tt><br>
</tt><tt><br>
</tt><tt>gcc -O3 -march=z13:</tt><tt><br>
</tt><tt><br>
</tt><tt>h:</tt><tt><br>
</tt><tt>.LFB0:</tt><tt><br>
</tt><tt> .cfi_startproc</tt><tt><br>
</tt><tt> larl %r1,e</tt><tt><br>
</tt><tt> ld %f0,0(%r1)</tt><tt> // LOAD
E</tt><tt><br>
</tt><tt> lrl %r2,g</tt><tt><br>
</tt><tt> lgrl %r3,f</tt><tt><br>
</tt><tt> cfdbr %r1,5,%f0</tt><tt> //
CONVERT E<br>
</tt><tt> cdfbr %f0,%r2</tt><tt><br>
</tt><tt> lgfr %r2,%r1 // EXTEND E</tt><tt><br>
</tt><tt> sllg %r2,%r2,3</tt><tt><br>
</tt><tt> std %f0,0(%r2,%r3) // STORE F
ELEMENT</tt><tt><br>
</tt><tt> strl %r1,g // 2nd USE</tt><tt><br>
</tt><tt> br %r14</tt><tt><br>
<br>
</tt>I hope somebody with enough experience and knowledge
can guide the way here as this seems to be quite important.<br>
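Until the aliasing question is settled, a source-level workaround for the reduced test case might look like the sketch below: e is read once into a local, so no second load can be emitted regardless of what the compiler concludes about f. The name <tt>h_single_load</tt> is hypothetical, and the rewrite is only equivalent to h() under the assumption that f never overlaps e.<br>

```c
typedef struct {
    double c[3][3];
} STRUCT_TY;

double e = 0.0;
STRUCT_TY *f;
int g = 0;

/* Hypothetical variant of h(): the value of e is copied into a local before
   the store through f, so the second use reads the local instead of memory.
   This matches h() only if the store through f cannot modify e. */
void h_single_load(void)
{
    const double e_val = e;   /* the only load of e */
    int i = (int)e_val;       /* first use: index */
    f->c[0][i] = g;           /* store to an element of *f */
    g = (int)e_val;           /* second use: no reload of e */
}
```

This trades the compiler's alias analysis for an explicit assumption in the code, which may be acceptable in a hot function but does not answer which compiler is right.<br>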
<br>
/Jonas<br>
<br>
<br>
<font size="2"><br>
</font> </div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="m_-3343089462936654494gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Quantitative analyst, Ph.D.<br>
Blog: <a href="http://blog.audio-tk.com/" target="_blank">http://blog.audio-tk.com/</a><br>
LinkedIn: <a href="http://www.linkedin.com/in/matthieubrucher" target="_blank">http://www.linkedin.com/in/matthieubrucher</a></div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote></div><br clear="all"><div><br></div>