<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Function multi-versioning miscompiles subtarget-dependent calling conventions"
href="https://llvm.org/bugs/show_bug.cgi?id=23083">23083</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Function multi-versioning miscompiles subtarget-dependent calling conventions
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Common Code Generator Code
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>michael.m.kuperstein@intel.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Given this IR input:
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"

define <8 x float> @foo(<8 x float> %a, <8 x float> %b) #0 {
entry:
  %ret = call x86_vectorcallcc <8 x float> @bar(<8 x float> %a, <8 x float> %b)
  ret <8 x float> %ret
}

define x86_vectorcallcc <8 x float> @bar(<8 x float> %a, <8 x float> %b) #1 {
  %add = fadd <8 x float> %a, %b
  ret <8 x float> %add
}

attributes #0 = { nounwind "target-features"="-avx" }
attributes #1 = { nounwind "target-features"="+avx" }
We currently get:
foo:                                    # @foo
# BB#0:                                 # %entry
        subq    $40, %rsp
        movaps  (%rcx), %xmm0
        movaps  (%rdx), %xmm1
        movaps  (%r8), %xmm2
        movaps  (%r9), %xmm3
        callq   bar@@64
        addq    $40, %rsp
        retq

        .def    bar@@64;
        .scl    2;
        .type   32;
        .endef
        .globl  bar@@64
        .align  16, 0x90
bar@@64:                                # @bar
# BB#0:
        vaddps  %ymm1, %ymm0, %ymm0
        retq
The caller passes each <8 x float> argument in two xmm registers, while the
callee expects each argument in a single ymm register.
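
The register-count mismatch can be illustrated with a toy model. This is only
a sketch, not LLVM's lowering code: the helper names widest_vector_bits and
regs_per_arg are hypothetical, and the model assumes that the only relevant
difference between the two functions is whether "+avx" is enabled.

```python
# Toy model (not LLVM code): how many vector registers one vector
# argument occupies once it is split to the widest register that the
# function's own subtarget features allow.

def widest_vector_bits(features):
    """Return the widest vector register width for a feature set.

    Assumption: only the SSE/AVX distinction from this report is modeled.
    """
    return 256 if "+avx" in features else 128


def regs_per_arg(arg_bits, features):
    """Number of registers one vector argument of arg_bits is split into."""
    width = widest_vector_bits(features)
    # Ceiling division: a 256-bit value needs two 128-bit registers.
    return -(-arg_bits // width)


ARG_BITS = 8 * 32  # one 8 x float value

caller_regs = regs_per_arg(ARG_BITS, {"-avx"})  # foo, attributes #0
callee_regs = regs_per_arg(ARG_BITS, {"+avx"})  # bar, attributes #1
mismatch = caller_regs != callee_regs

print(caller_regs, callee_regs, mismatch)
```

Because each side computes the split from its own feature set, the two sides
of the call disagree on where the arguments live.
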
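Note that this is not restricted to passing in registers. When passing on the
stack (removing the vectorcall CC) we get a similar, albeit possibly more
"fixable", miscompile, where the caller passes four pointers to 16-byte slots
while the callee loads 32 bytes through each of the first two pointers: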
foo:                                    # @foo
# BB#0:                                 # %entry
        subq    $104, %rsp
        movaps  (%r9), %xmm0
        movaps  (%r8), %xmm1
        movaps  (%rdx), %xmm2
        movaps  (%rcx), %xmm3
        movaps  %xmm3, 80(%rsp)
        movaps  %xmm2, 64(%rsp)
        movaps  %xmm1, 48(%rsp)
        movaps  %xmm0, 32(%rsp)
        leaq    80(%rsp), %rcx
        leaq    64(%rsp), %rdx
        leaq    48(%rsp), %r8
        leaq    32(%rsp), %r9
        callq   bar
        addq    $104, %rsp
        retq

        .def    bar;
        .scl    2;
        .type   32;
        .endef
        .globl  bar
        .align  16, 0x90
bar:                                    # @bar
# BB#0:
        vmovaps (%rcx), %ymm0
        vaddps  (%rdx), %ymm0, %ymm0
        retq</pre>
</div>
</p>
</body>
</html>