[llvm-bugs] [Bug 27908] New: Incorrect generated code with AVX
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri May 27 04:46:20 PDT 2016
https://llvm.org/bugs/show_bug.cgi?id=27908
Bug ID: 27908
Summary: Incorrect generated code with AVX
Product: new-bugs
Version: 3.8
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: gael.guennebaud at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
The following piece of code:
#include <iostream>
// http://bitbucket.org/eigen/eigen/get/default.tar.gz
#include <Eigen/Dense>
using namespace Eigen;
int main()
{
Projective3d t4;
Vector3d v3;
do {
v3 = Vector3d::Ones();
} while (v3.cwiseAbs().minCoeff()<1e-16);
t4.matrix().setIdentity();
t4.matrix().col(3).head<3>() = v3;
std::cout << t4.matrix() << "\n\n";
t4.translate(v3);
// t4.translationExt() += t4.linearExt() * v3;
}
compiled with clang 3.7, 3.8, or 3.9 using "-mavx -O2" generates the following
output:
1 0 0 1
0 1 0 5.29981e-315
0 0 1 1
0 0 0 1
where the "5.29981e-315" number should be "1". t4.matrix() is essentially a
4x4 column-major matrix stored as a static array of 16 doubles. The incorrect
asm responsible for filling the last 4 entries is as follows:
movq %rax, 120(%rsp)
movl $1072693248, %ecx ## imm = 0x3FF00000
vmovq %rcx, %xmm0
vpslldq $8, %xmm0, %xmm0 ## xmm0 = zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3,4,5,6,7]
vpshufd $226, %xmm0, %xmm0 ## xmm0 = xmm0[2,0,2,3]
vmovdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [0,1072693248,0,1072693248]
vpunpcklqdq %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0],xmm0[0]
vmovdqa %xmm0, 96(%rsp)
movq %rax, 112(%rsp)
where %rax contains a representation of 1.0, and 96(%rsp) references the first
element of the last column.
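To make the listing above easier to follow, here is a small model of the
sequence in plain C++ (no intrinsics). It is only a sketch: the names Vec4x32,
lanes_to_double, pshufd and emulate_broken_sequence are made up for this
illustration, and the lane contents are taken from the comments in the asm. It
suggests that the shuffle leaves 0x3FF00000 in the low 32 bits of the second
double instead of the high 32 bits, which would give a denormal of about
5.29981e-315 rather than 1.0.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// One 128-bit register modeled as four 32-bit lanes, lane 0 being the lowest.
using Vec4x32 = std::array<std::uint32_t, 4>;

// Build an IEEE-754 double from its low and high 32-bit halves.
double lanes_to_double(std::uint32_t lo, std::uint32_t hi) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Model of vpshufd: each 2-bit field of imm picks the source lane
// for one destination lane.
Vec4x32 pshufd(const Vec4x32& src, unsigned imm) {
    assert(imm < 256);
    Vec4x32 dst{};
    for (int i = 0; i < 4; ++i) dst[i] = src[(imm >> (2 * i)) & 3];
    return dst;
}

// Hypothetical replay of the listing; returns the two doubles that
// vmovdqa stores at 96(%rsp) and 104(%rsp).
std::array<double, 2> emulate_broken_sequence() {
    const std::uint32_t one_hi = 0x3FF00000u;     // high half of 1.0
    Vec4x32 xmm0 = {one_hi, 0, 0, 0};             // movl + vmovq: lands in lane 0
    xmm0 = {0, 0, xmm0[0], xmm0[1]};              // vpslldq $8: shift left by two lanes
    xmm0 = pshufd(xmm0, 226);                     // vpshufd $226: xmm0[2,0,2,3]
    const Vec4x32 xmm1 = {0, one_hi, 0, one_hi};  // LCPI0_0
    // vpunpcklqdq: low qword of xmm1 followed by low qword of xmm0
    return {lanes_to_double(xmm1[0], xmm1[1]),
            lanes_to_double(xmm0[0], xmm0[1])};
}
```

In this model the first returned value is 1.0 and the second is the denormal
seen in the output; a shuffle selecting lanes [0,2,...] instead of [2,0,...]
would have produced 1.0 in both slots.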
Replacing the last line, "t4.translate(v3);", with the body of the translate
method (the commented-out line) hides the issue, and in this case we get much
cleaner asm:
movq %rax, 120(%rsp)
movq %rax, 96(%rsp)
vmovaps LCPI0_2(%rip), %xmm0 ## xmm0 = [4607182418800017408,4607182418800017408]
vmovups %xmm0, 104(%rsp)
I guess that the weird shifting and shuffling we see in the broken part come
from the do-while condition, which is only partly optimized away.
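For readers without Eigen at hand, the following is a rough plain-array sketch
of what the test case computes. It assumes that, for a Projective3d,
linearExt() is the left 4x3 block and translationExt() is the whole last
column of the column-major 4x4 matrix; Mat4, at, identity4 and
translate_sketch are hypothetical names, not Eigen API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Column-major 4x4 matrix: element (r, c) lives at index c * 4 + r,
// so the last column occupies indices 12..15 (byte offsets 96..120).
using Mat4 = std::array<double, 16>;

double& at(Mat4& m, std::size_t r, std::size_t c) {
    assert(r < 4 && c < 4);
    return m[c * 4 + r];
}

Mat4 identity4() {
    Mat4 m{};
    for (std::size_t i = 0; i < 4; ++i) at(m, i, i) = 1.0;
    return m;
}

// Sketch of "translationExt() += linearExt() * v": the last column
// accumulates the product of the left 4x3 block with v.
void translate_sketch(Mat4& m, const std::array<double, 3>& v) {
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 3; ++c)
            at(m, r, 3) += at(m, r, c) * v[c];
}
```

Starting from identity4() and setting the first three entries of the last
column to 1 gives a last column of all ones, which is what the program should
print; applying translate_sketch with v = (1, 1, 1) afterwards turns it into
(2, 2, 2, 1) under the assumptions above.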