[llvm-bugs] [Bug 37402] New: [inline asm, AVX] Issue with the SSE register allocation in inline assembly

via llvm-bugs llvm-bugs at lists.llvm.org
Wed May 9 16:20:44 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=37402

            Bug ID: 37402
           Summary: [inline asm, AVX] Issue with the SSE register
                    allocation in inline assembly
           Product: clang
           Version: 6.0
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: -New Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: raphael_bost at alumni.brown.edu
                CC: llvm-bugs at lists.llvm.org

Created attachment 20280
  --> https://bugs.llvm.org/attachment.cgi?id=20280&action=edit
Test code

It seems there is a bug when using inline assembly for x86.
Say we want to implement the following pseudo-code, where all
variables (except cond) are __m128:

// va = *a;
// vb = *b;
// mask = (cond)?0x00:0xFF;
//
// vc = (va ^ vb) & mask;
// va = va ^ vc;
// vb = vb ^ vc;
//
// *a = va;
// *b = vb;
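
To make the intended semantics explicit, here is a minimal intrinsics-only
sketch of this pseudo-code (my reconstruction for readability, not the
attached test code; the __m128i type, the cond_swap128 name and the _mm_*
calls are assumptions about how the values are declared and loaded):

    /* Reference version of the pseudo-code above.  Conditionally swaps
       *a and *b: with mask = 0x00 the stores write the original values
       back, with mask = 0xFF.. they write the swapped values.
       Built like the report: clang -mavx -O3 */
    #include <emmintrin.h>

    void cond_swap128(__m128i *a, __m128i *b, int cond)
    {
        __m128i va = _mm_loadu_si128(a);
        __m128i vb = _mm_loadu_si128(b);
        /* mask = (cond) ? 0x00 : 0xFF, replicated over all 16 bytes */
        __m128i mask = _mm_set1_epi8(cond ? 0 : -1);

        __m128i vc = _mm_and_si128(_mm_xor_si128(va, vb), mask);
        va = _mm_xor_si128(va, vc);
        vb = _mm_xor_si128(vb, vc);

        _mm_storeu_si128(a, va);
        _mm_storeu_si128(b, vb);
    }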

The goal is branch-free, memory-access-oblivious code, and inline
assembly seems to be the recommended way to get it.
In this bug report, we will focus on the three middle lines from
above. Let us consider two implementations, and the assembly they
generate with -mavx -O3.

// Implementation 1: 
    asm volatile("vpxor %0, %1, %2\n\t"
                 "vpand %2, %3, %2\n\t"
                 "vpxor %0, %2, %0\n\t"
                 "vpxor %1, %2, %1\n\t"
                 : "+x"(va), "+x"(vb), "+x"(vc) /* outputs: %0, %1, %2 */
                 : "x"(mask) /* input: %3 */
                 : /* clobbered register */
    );

// Generated ASM 1
    vpxor %xmm0, %xmm1, %xmm2
    vpand %xmm2, %xmm2, %xmm2 <- idempotent and useless: mask (%3) got the
                                 same register as vc (%2), so the mask is lost
    vpxor %xmm0, %xmm2, %xmm0
    vpxor %xmm1, %xmm2, %xmm1




// Implementation 2: 
    asm volatile("vpxor %0, %1, %2\n\t"
                 "vpand %2, %3, %2\n\t"
                 "vpxor %0, %2, %0\n\t"
                 "vpxor %1, %2, %1\n\t"
                 : "+x"(va), "+x"(vb), "=x"(vc), "+x"(mask) /* outputs: %0, %1, %2, %3 */
                 : /* input */
                 : /* clobbered register */
    );

// Generated ASM 2
    vpxor %xmm0, %xmm1, %xmm3
    vpand %xmm3, %xmm2, %xmm3
    vpxor %xmm0, %xmm3, %xmm0
    vpxor %xmm1, %xmm3, %xmm1



Clearly, Implementation 1 does not work: the generated assembly is invalid,
because the input operand mask (%3) was allocated to the same register as
the output operand vc (%2).
I get the same generated assembly with clang 3.7, 5.0, 6.0 and trunk.
gcc and icc generate identical asm (cf. https://godbolt.org/g/gvkRph and the
attachments), equivalent to ASM 2.
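
As a side note, here is a possible workaround sketch (my assumption about
the cause, not a verified fix): %2 (vc) is written by the first vpxor
before %3 (mask) is read by the vpand, so marking vc as an early-clobber
output with the "&" constraint modifier should tell the allocator that no
input operand may share vc's register:

    asm volatile("vpxor %0, %1, %2\n\t"
                 "vpand %2, %3, %2\n\t"
                 "vpxor %0, %2, %0\n\t"
                 "vpxor %1, %2, %1\n\t"
                 : "+x"(va), "+x"(vb), "=&x"(vc) /* vc marked early-clobber */
                 : "x"(mask) /* input */
                 : /* clobbered register */
    );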

-- 
You are receiving this mail because:
You are on the CC list for the bug.