<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - X86-32: Clang passes fewer vectors in SSE registers than GCC"
   href="http://llvm.org/bugs/show_bug.cgi?id=21510">21510</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>X86-32: Clang passes fewer vectors in SSE registers than GCC
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>rnk@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>grosbach@apple.com, llvmbugs@cs.uiuc.edu, paul_robinson@playstation.sony.com, rafael.espindola@gmail.com, rjmccall@apple.com
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Clang's 32-bit x86 calling convention rules pass only the first four vector
arguments in registers:

def CC_X86_32_Common : CallingConv&lt;[
...
  // The first 4 SSE vector arguments are passed in XMM registers.
  CCIfNotVarArg&lt;CCIfType&lt;[v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
                CCAssignToReg&lt;[XMM0, XMM1, XMM2, XMM3]&gt;&gt;&gt;,

  // The first 4 AVX 256-bit vector arguments are passed in YMM registers.
  CCIfNotVarArg&lt;CCIfType&lt;[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                CCIfSubtarget&lt;"hasFp256()",
                CCAssignToReg&lt;[YMM0, YMM1, YMM2, YMM3]&gt;&gt;&gt;&gt;,
...

GCC uses all of XMM0-XMM7 to pass vector arguments, so Clang is ABI-incompatible
with GCC here.

It looks like Clang was trying to match GCC's behavior with -msseregparm, which
is documented to use only the first four XMM registers for floating-point
arguments. However, for true vector arguments (not float / double), GCC
currently uses up to eight registers.
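
A minimal reproducer sketch for the mismatch described above, assuming the
GCC/Clang vector_size extension. The names v4f, sum5 and call_sum5 are
hypothetical and not taken from either compiler's sources. The fifth vector
parameter is the first one that falls outside the XMM0-XMM3 list quoted above:

```c
/* 16-byte float vector (one XMM register wide), using the GCC/Clang
   vector_size extension. The type name is hypothetical. */
typedef float v4f __attribute__((vector_size(16)));

/* Five vector parameters: one more than the four XMM registers listed in
   CC_X86_32_Common. noinline keeps the call visible in the assembly. */
__attribute__((noinline))
v4f sum5(v4f a, v4f b, v4f c, v4f d, v4f e)
{
    return a + b + c + d + e;
}

/* Hypothetical caller: builds five vectors and returns lane 0 of the sum. */
float call_sum5(void)
{
    v4f a = {1.0f, 1.0f, 1.0f, 1.0f};
    v4f b = {2.0f, 2.0f, 2.0f, 2.0f};
    v4f c = {3.0f, 3.0f, 3.0f, 3.0f};
    v4f d = {4.0f, 4.0f, 4.0f, 4.0f};
    v4f e = {5.0f, 5.0f, 5.0f, 5.0f};
    v4f r = sum5(a, b, c, d, e);
    return r[0];
}
```

Compiling this for 32-bit x86 with SSE enabled (for example with
-m32 -msse2 -O2 -S) in both compilers and comparing how the fifth argument
reaches sum5 should show the difference: per the rules above, Clang would be
expected to place it on the stack, while GCC would be expected to use XMM4.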

So, are we broken here? If so, can we fix it, or are users relying heavily on
our x86_32 ABI stability? Can we at least fix the problem for YMM and ZMM
registers? Should we make this conditional on the OS, so that Clang stays
backwards compatible with itself on Mac, BSD, etc., and becomes compatible with
the dominant system compiler (GCC) on Linux?</pre>
        </div>
      </p>
    </body>
</html>