<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi Matthieu,</p>
    <p><br>
    </p>
    <br>
    <blockquote type="cite"
cite="mid:CAHCaCkK8fb+hdzMh7Zr_5ock+QVyJ3t+EfMEdBJRkfqD5Ld1ag@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=utf-8">
      <div dir="ltr">Oh indeed, sorry, you are doing an outer product of
        a and b and storing the result in c, and these cannot alias (or
        at least they should not; if you do make them alias, that is your
        responsibility, and I think you can get UB).
        <div>So clang should do better indeed.</div>
        <div><br>
        </div>
      </div>
    </blockquote>
    It would be very nice if you could justify this in detail, with
    references to the standard. Is it enough that the two struct types
    have different tags? It could be argued that the matrix
    contains the vector struct type, and that they may therefore alias.<br>
     Why do you think this is not the case in this example?<br>
    <br>
    /Jonas<br>
    <br>
    <br>
    <br>
    <blockquote type="cite"
cite="mid:CAHCaCkK8fb+hdzMh7Zr_5ock+QVyJ3t+EfMEdBJRkfqD5Ld1ag@mail.gmail.com">
      <div dir="ltr">
        <div>Regards,<br>
          <div><br>
          </div>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr">On Fri, Sep 21, 2018 at 08:21, Jonas Paulsson
          <<a href="mailto:paulsson@linux.vnet.ibm.com"
            moz-do-not-send="true">paulsson@linux.vnet.ibm.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div text="#000000" bgcolor="#FFFFFF"> <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div><br>
                </div>
                I would say that GCC is wrong and should also have a
                version where a could be equal to b. There is no
                restrict keyword, so they could be equal.
                <div><br>
                </div>
              </div>
            </blockquote>
            The question was about aliasing between a/b and c, not
            between a and b. Could you explain your opinion in a bit
            more detail, please?<br>
            <br>
            /Jonas<br>
            <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Cheers,</div>
                <div><br>
                </div>
                <div>Matthieu</div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr">On Thu, Sep 20, 2018 at 16:56, Jonas
                  Paulsson via llvm-dev <<a
                    href="mailto:llvm-dev@lists.llvm.org"
                    target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div text="#000000" bgcolor="#FFFFFF">
                    <p>Hi,</p>
                    <p>I found a difference between Clang and GCC in
                      alias handling. It showed up in a benchmark where
                      Clang was considerably slower: a hot function
                      performs many more loads from the same addresses
                      because of stores between the uses. In other
                      words, a value is loaded and used, another value
                      is stored, and then the first value is loaded once
                      again before its second use. This happens many
                      times, with three loads instead of one for each
                      value. GCC emits only one load.</p>
                    <p>The values are the arguments to this function:<br>
                    </p>
                    <tt>void su3_projector( su3_vector *a, su3_vector
                      *b, su3_matrix *c ){</tt><br>
                    <tt>register int i,j;</tt><br>
                    <tt>register double tmp,tmp2;</tt><br>
                    <tt>    for(i=0;i<3;i++)for(j=0;j<3;j++){</tt><br>
                    <tt>        tmp2 = a->c[i].real *
                      b->c[j].real;</tt><br>
                    <tt>        tmp = a->c[i].imag * b->c[j].imag;</tt><br>
                    <tt>        c->e[i][j].real = tmp + tmp2;</tt><br>
                    <tt>        tmp2 = a->c[i].real *
                      b->c[j].imag;</tt><br>
                    <tt>        tmp = a->c[i].imag * b->c[j].real;</tt><br>
                    <tt>        c->e[i][j].imag = tmp - tmp2;</tt><br>
                    <tt>    }</tt><br>
                    <tt>}</tt><br>
                    <tt><br>
                      The types are:</tt><br>
                    <tt>typedef struct { complex e[3][3]; } su3_matrix;</tt><br>
                    <tt>typedef struct { complex c[3]; } su3_vector;<br>
                    </tt><br>
                    So the question here is whether the su3_vector and
                    su3_matrix pointers may alias. If they may alias,
                    then Clang is right to reload after each store.
                    If the standard says they cannot alias, then GCC is
                    right to load each value only once.<br>
                    <br>
                    It seems to me that either GCC is too aggressive or
                    LLVM is too conservative, but I don't know which one
                    it is... As far as I understand, there is on the one
                    hand the fact that the arguments have different
                    struct types (which would mean they cannot alias),
                    and on the other the question of whether su3_vector
                    is included in su3_matrix (which would mean they may
                    alias).<br>
                    <br>
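                    Independently of how either compiler reads the
                    type-based aliasing rules, the no-alias assumption
                    can be stated explicitly in the source with C99
                    restrict. This is only a minimal sketch: it assumes
                    that complex is a struct of two doubles (its
                    definition is not shown here), and the name
                    su3_projector_restrict and the const qualifiers are
                    hypothetical additions, not part of the benchmark.<br>

```c
/* Assumed layout: the definition of `complex` is not shown in this thread. */
typedef struct { double real, imag; } complex;
typedef struct { complex e[3][3]; } su3_matrix;
typedef struct { complex c[3]; } su3_vector;

/* Hypothetical variant of su3_projector. `restrict` on c is a promise by
   the caller that the object modified through c is not accessed through
   a or b during the call; breaking that promise is undefined behavior. */
void su3_projector_restrict(const su3_vector *a, const su3_vector *b,
                            su3_matrix *restrict c) {
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double tmp2 = a->c[i].real * b->c[j].real;
            double tmp  = a->c[i].imag * b->c[j].imag;
            c->e[i][j].real = tmp + tmp2;
            tmp2 = a->c[i].real * b->c[j].imag;
            tmp  = a->c[i].imag * b->c[j].real;
            c->e[i][j].imag = tmp - tmp2;
        }
    }
}
```

                    Whether a given compiler then keeps the a and b
                    values in registers across the stores would still
                    have to be checked in the generated assembly.<br>
                    <br>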
                    I made a reduced test case in which the same
                    difference seems to be present. It has just one
                    struct type, which contains a matrix of doubles. A
                    store to an element of the struct via a pointer is
                    surrounded by two loads of a global double
                    variable. Only Clang emits two loads.<br>
                    <br>
                    <tt>typedef struct {</tt><tt><br>
                    </tt><tt>  double c[3][3];</tt><tt><br>
                    </tt><tt>} STRUCT_TY;</tt><tt><br>
                    </tt><tt><br>
                    </tt><tt>double e = 0.0;</tt><tt><br>
                    </tt><tt>STRUCT_TY *f;</tt><tt><br>
                    </tt><tt>int g = 0;</tt><tt><br>
                    </tt><tt>void h() {</tt><tt><br>
                    </tt><tt>  int i = e;</tt><tt><br>
                    </tt><tt>  f->c[0][i] = g;</tt><tt><br>
                    </tt><tt>  g = e;</tt><tt><br>
                    </tt><tt>}</tt><tt><br>
                    </tt><tt><br>
                    </tt><tt>clang -O3 -march=z13:</tt><tt><br>
                    </tt><tt><br>
                    </tt><tt>h:                                      #
                      @h </tt><tt><br>
                    </tt><tt># %bb.0:                                #
                      %entry </tt><tt><br>
                    </tt><tt>        larl    %r1, e </tt><tt><br>
                    </tt><tt>        ld      %f0, 0(%r1)        // LOAD
                      E</tt><tt><br>
                    </tt><tt>        lrl     %r2, g   </tt><tt><br>
                    </tt><tt>        cfdbr   %r0, 5, %f0        //
                      CONVERT E </tt><tt><br>
                    </tt><tt>        lgfr    %r0, %r0           //
                      EXTEND E  </tt><tt><br>
                    </tt><tt>        cdfbr   %f0, %r2  </tt><tt><br>
                    </tt><tt>        lgrl    %r2, f   </tt><tt><br>
                    </tt><tt>        sllg    %r3, %r0, 3 </tt><tt><br>
                    </tt><tt>        std     %f0, 0(%r3,%r2)    // STORE
                      F EL</tt><tt>EMENT</tt><tt> </tt><tt><br>
                    </tt><tt>        ld      %f0, 0(%r1)        // 2nd
                      LOAD E        </tt><tt><<<<<<<<br>
                    </tt><tt>        cfdbr   %r0, 5, %f0        //
                      CONVERT </tt><tt><br>
                    </tt><tt>        strl    %r0, g             // 2nd
                      USE </tt><tt><br>
                    </tt><tt>        br      %r14  </tt><tt><br>
                    </tt><tt><br>
                    </tt><tt>gcc -O3 -march=z13:</tt><tt><br>
                    </tt><tt><br>
                    </tt><tt>h:</tt><tt><br>
                    </tt><tt>.LFB0:</tt><tt><br>
                    </tt><tt>        .cfi_startproc</tt><tt><br>
                    </tt><tt>        larl    %r1,e</tt><tt><br>
                    </tt><tt>        ld      %f0,0(%r1)</tt><tt>       
                      // LOAD E</tt><tt><br>
                    </tt><tt>        lrl     %r2,g</tt><tt><br>
                    </tt><tt>        lgrl    %r3,f</tt><tt><br>
                    </tt><tt>        cfdbr   %r1,5,%f0</tt><tt>        
                      // CONVERT E<br>
                    </tt><tt>        cdfbr   %f0,%r2</tt><tt><br>
                    </tt><tt>        lgfr    %r2,%r1           // EXTEND
                      E</tt><tt><br>
                    </tt><tt>        sllg    %r2,%r2,3</tt><tt><br>
                    </tt><tt>        std     %f0,0(%r2,%r3)    // STORE
                      F ELEMENT</tt><tt><br>
                    </tt><tt>        strl    %r1,g             // 2nd
                      USE</tt><tt><br>
                    </tt><tt>        br      %r14</tt><tt><br>
                      <br>
                    </tt>I hope somebody with enough experience and
                    knowledge can guide the way here as this seems to be
                    quite important.<br>
                    <br>
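                    Until the alias question is settled, the second load
                    in the reduced test case can be avoided at the
                    source level by reading e once into a local. The
                    sketch below reuses the declarations of the test
                    case; h_single_load and e_val are hypothetical names
                    introduced only for illustration.<br>

```c
typedef struct {
    double c[3][3];
} STRUCT_TY;

double e = 0.0;
STRUCT_TY *f;
int g = 0;

/* Hypothetical variant of h(): e is read exactly once, so the store
   through f cannot force a reload regardless of alias analysis. */
void h_single_load(void) {
    const double e_val = e;   /* the only load of e */
    int i = (int)e_val;
    f->c[0][i] = g;           /* store to an element of *f */
    g = (int)e_val;           /* second use, taken from the local copy */
}
```

                    This is only equivalent to h() if the store through
                    f never modifies e, which is exactly the assumption
                    in question, so it is a workaround and not an answer
                    to which compiler is right.<br>
                    <br>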
                    /Jonas<br>
                    <br>
                    <br>
                    <font size="2"><br>
                    </font> </div>
                </blockquote>
              </div>
              <br clear="all">
              <div><br>
              </div>
              -- <br>
              <div dir="ltr"
                class="m_-3343089462936654494gmail_signature"
                data-smartmail="gmail_signature">
                <div dir="ltr">
                  <div>
                    <div dir="ltr">
                      <div>Quantitative analyst, Ph.D.<br>
                        Blog: <a href="http://blog.audio-tk.com/"
                          target="_blank" moz-do-not-send="true">http://blog.audio-tk.com/</a><br>
                        LinkedIn: <a
                          href="http://www.linkedin.com/in/matthieubrucher"
                          target="_blank" moz-do-not-send="true">http://www.linkedin.com/in/matthieubrucher</a></div>
                    </div>
                  </div>
                </div>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>