<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><span class="vcard"><a class="email" href="mailto:wenzel.jakob@epfl.ch" title="Wenzel Jakob <wenzel.jakob@epfl.ch>"> <span class="fn">Wenzel Jakob</span></a>
</span> changed
<a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED --- - AVX512: Update __vectorcall calling conventions"
href="https://llvm.org/bugs/show_bug.cgi?id=28963">bug 28963</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>RESOLVED
</td>
<td>REOPENED
</td>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>INVALID
</td>
<td>---
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED --- - AVX512: Update __vectorcall calling conventions"
href="https://llvm.org/bugs/show_bug.cgi?id=28963#c2">Comment # 2</a>
on <a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED --- - AVX512: Update __vectorcall calling conventions"
href="https://llvm.org/bugs/show_bug.cgi?id=28963">bug 28963</a>
from <span class="vcard"><a class="email" href="mailto:wenzel.jakob@epfl.ch" title="Wenzel Jakob <wenzel.jakob@epfl.ch>"> <span class="fn">Wenzel Jakob</span></a>
</span></b>
<pre>Dear David,
I've reopened the ticket since I think the issue is a bit trickier. I could
also open another ticket with a broader scope and close this one if you would
prefer that.
The problem is basically as follows: unlike the other two major compilers with
AVX512 support (Intel and GCC), Clang does not provide a calling convention
that allows passing AVX512 SIMD vectors in registers. Such arguments must
instead go through memory, which introduces unnecessary performance penalties
in functions that take vector arguments.
In comparison:
1. GCC does this by default, even when no specific calling convention is
requested (!). This is ideal from a performance perspective, albeit perhaps a
bit unorthodox.
2. The Intel compiler has a __regcall directive that allows ZMM registers to
be used for function arguments.
3. Visual Studio has the __vectorcall directive, which does this for SSE and
AVX.
Although it seems fairly obvious to me that this approach will eventually be
carried over to AVX512, it's uncertain when MSVC will actually support this
instruction set. Note that even if nothing is done, Clang's __vectorcall will
eventually be incompatible with whatever Microsoft deploys at that point.
I wonder: is it worth waiting for Microsoft here? IMHO it would make a lot more
sense to unlock __vectorcall for AVX512 arguments and revise the calling
conventions if/when Microsoft officially supports AVX512.
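
To make the discussion concrete, here is a minimal sketch of the kind of
function affected. It assumes the GCC/Clang vector_size extension; the names
v16f, VECCALL and fused_scale are made up for illustration, and whether the
arguments really travel in ZMM registers depends on the compiler, target and
flags (e.g. -mavx512f):

```c
/* 16 packed floats = 512 bits. With AVX512F enabled this fits in one ZMM
   register. vector_size is a GCC/Clang extension, not standard C. */
typedef float v16f __attribute__((vector_size(64)));

/* VECCALL is a hypothetical helper macro, not provided by any compiler. */
#if defined(__INTEL_COMPILER)
#  define VECCALL __regcall   /* Intel: ZMM registers may carry arguments */
#else
#  define VECCALL             /* elsewhere: platform default convention */
#endif

/* Takes three 512-bit vectors by value and returns one by value. */
v16f VECCALL fused_scale(v16f a, v16f b, v16f c)
{
    return a * b + c;         /* element-wise over all 16 lanes */
}
```

With a register-based convention, a, b and c could each arrive in a single
ZMM register; without one, they have to be passed through memory.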
Thoughts?
Best,
Wenzel</pre>
</div>
</p>
</body>
</html>