[LLVMbugs] [Bug 11862] New: [AVX] incorrect code in attempted callee-save of ymm registers on Windows

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Thu Jan 26 11:16:01 PST 2012


             Bug #: 11862
           Summary: [AVX] incorrect code in attempted callee-save of ymm
                    registers on Windows
           Product: new-bugs
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: matt at pharr.org
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 7950
  --> http://llvm.org/bugs/attachment.cgi?id=7950
test case

The attached test case shows a simple recursive function that computes a vector
result, using a mask to determine which vector elements to update (and
continuing recursively until all elements have turned off their respective mask).

On Windows, the first four vector values are computed correctly, but the last
four are incorrect:

> llc -mattr=+avx a.ll -o foo.obj
> cl a.cpp foo.obj
> a.exe
result[0] = 1.000000, expected 1.000000 
result[1] = 3.000000, expected 3.000000 
result[2] = 6.000000, expected 6.000000 
result[3] = 10.000000, expected 10.000000 
result[4] = 0.000000, expected 15.000000  ERROR
result[5] = 0.000000, expected 21.000000  ERROR
result[6] = 0.000000, expected 28.000000  ERROR
result[7] = 0.000000, expected 36.000000  ERROR

(If I comment out the target triple info in the bitcode, the correct results
are computed on OSX.)

Looking at the assembly on Windows, the issue seems to be that it's trying to
callee save ymm6 and ymm7 before using them, but it is only storing the xmm
part of the registers:

    vmovaps    %xmm7, -32(%rbp)        # 16-byte Spill
    vmovaps    %xmm6, -16(%rbp)        # 16-byte Spill

(So that's obviously a problem when it just restores the xmm part before
returning.)

However, my weak understanding of the Windows ABI is that only the xmm part of
xmm6-xmm15 is callee-saved.  So perhaps the real issue is that the instructions
after the recursive call treat all of ymm6 and ymm7 as having been callee-saved:

    callq    f___vyf
    vaddps    %ymm6, %ymm0, %ymm0
    vblendvps    %ymm7, %ymm0, %ymm6, %ymm6
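
To make the second hypothesis concrete, here is a minimal C++ sketch that
models it. It is not taken from the attached test case: the Ymm type and all
three function names are hypothetical, and the register is modeled simply as
eight float lanes. It assumes the callee preserves only the low four lanes
(the xmm part), and that the 16-byte reload leaves the upper four lanes zero,
as VEX-encoded 128-bit loads do.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of one 256-bit ymm register as eight float lanes.
using Ymm = std::array<float, 8>;

// Models a callee that uses the register as scratch but saves and restores
// only the low 128 bits (lanes 0..3), like the 16-byte spills shown above.
void callee_preserving_xmm_only(Ymm& reg) {
    std::array<float, 4> saved{};
    for (std::size_t i = 0; i < saved.size(); ++i) {
        saved[i] = reg[i];              // 16-byte spill of the xmm part
    }
    reg.fill(0.0f);                     // scratch use; upper lanes are lost
    for (std::size_t i = 0; i < saved.size(); ++i) {
        reg[i] = saved[i];              // 16-byte reload of the xmm part
    }
    assert(reg[4] == 0.0f);             // upper half was not restored
}

// Caller that assumes the whole ymm register survives the call.
Ymm caller_trusting_callee(Ymm live) {
    callee_preserving_xmm_only(live);
    return live;
}

// Caller that keeps its own full 32-byte copy across the call instead.
Ymm caller_spilling_full_register(Ymm live) {
    const Ymm spill = live;             // 32-byte spill before the call
    callee_preserving_xmm_only(live);
    live = spill;                       // 32-byte reload after the call
    return live;
}
```

Feeding the expected values from the output above through
caller_trusting_callee keeps lanes 0-3 and zeroes lanes 4-7, which is the same
pattern as the failing run; caller_spilling_full_register returns all eight
lanes intact. This only illustrates the suspected mechanism and says nothing
about how the fix should be implemented in the backend.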
