<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Incorrect generated code with AVX"
   href="https://llvm.org/bugs/show_bug.cgi?id=27908">27908</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Incorrect generated code with AVX
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>new-bugs
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>3.8
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>new bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>gael.guennebaud@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The following piece of code:

#include <iostream>
// <a href="http://bitbucket.org/eigen/eigen/get/default.tar.gz">http://bitbucket.org/eigen/eigen/get/default.tar.gz</a>
#include <Eigen/Dense>
using namespace Eigen;
int main()
{
  Projective3d t4;
  Vector3d v3;
  do {
    v3 = Vector3d::Ones();
  } while (v3.cwiseAbs().minCoeff()<1e-16);
  t4.matrix().setIdentity();
  t4.matrix().col(3).head<3>() = v3;
  std::cout << t4.matrix() << "\n\n";
  t4.translate(v3);
//     t4.translationExt() += t4.linearExt() * v3;
}

compiled with clang 3.7, 3.8, or 3.9 using "-mavx -O2", generates the following
output:

           1            0            0            1
           0            1            0 5.29981e-315
           0            0            1            1
           0            0            0            1

where the "5.29981e-315" number should be "1". t4.matrix() is essentially a
4x4 column-major matrix stored as a static array of 16 doubles. The incorrect
asm responsible for filling the last 4 entries is as follows:

    movq    %rax, 120(%rsp)
    movl    $1072693248, %ecx       ## imm = 0x3FF00000
    vmovq    %rcx, %xmm0
    vpslldq    $8, %xmm0, %xmm0        ## xmm0 = zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3,4,5,6,7]
    vpshufd    $226, %xmm0, %xmm0      ## xmm0 = xmm0[2,0,2,3]
    vmovdqa    LCPI0_0(%rip), %xmm1    ## xmm1 = [0,1072693248,0,1072693248]
    vpunpcklqdq    %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0],xmm0[0]
    vmovdqa    %xmm0, 96(%rsp)
    movq    %rax, 112(%rsp)

where %rax contains a representation of 1.0, and 96(%rsp) references the first
element of the last column.

Replacing the last statement, "t4.translate(v3);", with the body of the
translate method (the commented-out line) hides the issue, and in this case we
get much cleaner asm:

    movq    %rax, 120(%rsp)
    movq    %rax, 96(%rsp)
    vmovaps    LCPI0_2(%rip), %xmm0    ## xmm0 = [4607182418800017408,4607182418800017408]
    vmovups    %xmm0, 104(%rsp)

I guess that the weird shifting and shuffling we see in the broken part comes
from the do-while condition, which is only partly optimized away.</pre>
        </div>
      </p>
    </body>
</html>