[llvm-bugs] [Bug 37402] New: [inline asm, AVX] Issue with the SSE register allocation in inline assembly
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed May 9 16:20:44 PDT 2018
https://bugs.llvm.org/show_bug.cgi?id=37402
Bug ID: 37402
Summary: [inline asm, AVX] Issue with the SSE register
allocation in inline assembly
Product: clang
Version: 6.0
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: raphael_bost at alumni.brown.edu
CC: llvm-bugs at lists.llvm.org
Created attachment 20280
--> https://bugs.llvm.org/attachment.cgi?id=20280&action=edit
Test code
There seems to be a bug in how clang allocates SSE registers for x86
inline assembly. Suppose we want to implement the following pseudo-code,
where all variables except cond are of type __m128:
// va = *a;
// vb = *b;
// mask = (cond)?0x00:0xFF;
//
// vc = (va ^ vb) & mask;
// va = va ^ vc;
// vb = vb ^ vc;
//
// *a = va;
// *b = vb;
The goal is branch-free code whose memory accesses do not depend on cond,
which is why inline assembly seems advisable for such cases.
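For reference, the pseudo-code above can be sketched in portable C. This is only an illustration of the intended semantics, not the code from the attachment: the type block128 and the function cond_swap are hypothetical names, and a pair of 64-bit words stands in for __m128. The mask follows the convention of the pseudo-code (all zeros when cond is set, all ones otherwise), so the swap happens when cond is zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128: two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } block128;

/* Branch-free conditional swap following the pseudo-code:
   mask is all ones when cond == 0 (swap), all zeros otherwise (no swap). */
static void cond_swap(block128 *a, block128 *b, int cond)
{
    block128 va = *a;
    block128 vb = *b;

    /* (cond != 0) is 0 or 1; subtracting 1 yields all ones or zero. */
    uint64_t mask = (uint64_t)(cond != 0) - 1u;

    /* vc = (va ^ vb) & mask */
    block128 vc;
    vc.lo = (va.lo ^ vb.lo) & mask;
    vc.hi = (va.hi ^ vb.hi) & mask;

    /* va = va ^ vc; vb = vb ^ vc */
    va.lo ^= vc.lo;  va.hi ^= vc.hi;
    vb.lo ^= vc.lo;  vb.hi ^= vc.hi;

    /* Both locations are always written, whatever cond is. */
    *a = va;
    *b = vb;
}
```

A compiler is free to turn such C code back into a branch, which is the reason the report uses inline assembly instead.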
In this bug report, we will focus on the three middle lines from
above. Let us consider two implementations, and the generated assembly,
with -mavx -O3.
// Implementation 1:
asm volatile("vpxor %0, %1, %2\n\t"
             "vpand %2, %3, %2\n\t"
             "vpxor %0, %2, %0\n\t"
             "vpxor %1, %2, %1\n\t"
             : "+x"(va), "+x"(vb), "+x"(vc) /* output */
             : "x"(mask)                    /* input */
             :                              /* clobbered register */
             );
// Generated ASM 1
vpxor %xmm0, %xmm1, %xmm2
vpand %xmm2, %xmm2, %xmm2 <- Idempotent and useless instruction
vpxor %xmm0, %xmm2, %xmm0
vpxor %xmm1, %xmm2, %xmm1
// Implementation 2:
asm volatile("vpxor %0, %1, %2\n\t"
             "vpand %2, %3, %2\n\t"
             "vpxor %0, %2, %0\n\t"
             "vpxor %1, %2, %1\n\t"
             : "+x"(va), "+x"(vb), "=x"(vc), "+x"(mask) /* output */
             :                                          /* input */
             :                                          /* clobbered register */
             );
// Generated ASM 2
vpxor %xmm0, %xmm1, %xmm3
vpand %xmm3, %xmm2, %xmm3
vpxor %xmm0, %xmm3, %xmm0
vpxor %xmm1, %xmm3, %xmm1
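Clearly, Implementation 1 does not work and produces invalid assembly:
in Generated ASM 1, operands %2 (vc) and %3 (mask) are both assigned
%xmm2, so the first vpxor overwrites the mask, the vpand becomes a no-op,
and va and vb end up swapped regardless of the mask.
I got the same generated assembly with clang 3.7, 5.0, 6.0 and trunk.
gcc and icc generate identical asm (cf. https://godbolt.org/g/gvkRph and the
attachments), equivalent to ASM 2.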
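The effect of the two register assignments can be checked with a toy model. The sketch below is an assumption-laden illustration, not part of the attached test code: run_template is a hypothetical helper, each "register" is a single 64-bit lane rather than a full xmm register, and ops[i] is the register index substituted for operand %i. The assignment {0, 1, 2, 2} corresponds to Generated ASM 1 and {0, 1, 3, 2} to Generated ASM 2.

```c
#include <stdint.h>

/* Toy model of the four-instruction template (AT&T operand order:
   the last operand is the destination). reg[] plays the role of
   xmm0..xmm3, ops[i] is the register chosen for operand %i. */
static void run_template(uint64_t reg[4], const int ops[4])
{
    reg[ops[2]] = reg[ops[1]] ^ reg[ops[0]];  /* vpxor %0, %1, %2 */
    reg[ops[2]] = reg[ops[3]] & reg[ops[2]];  /* vpand %2, %3, %2 */
    reg[ops[0]] = reg[ops[2]] ^ reg[ops[0]];  /* vpxor %0, %2, %0 */
    reg[ops[1]] = reg[ops[2]] ^ reg[ops[1]];  /* vpxor %1, %2, %1 */
}
```

With a zero mask in register 2, the second assignment leaves registers 0 and 1 untouched, while the first one swaps them, because the mask has been overwritten by va ^ vb before the vpand reads it.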