<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 7/23/19 8:42 PM, John McCall via
llvm-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:C38CF203-CD9F-4938-9196-790681C800F9@apple.com">
<div style="font-family:sans-serif">
<div style="white-space:normal">
<p dir="auto">On 21 Jul 2019, at 12:29, James Y Knight via
llvm-dev wrote:</p>
</div>
<div style="white-space:normal">
<blockquote style="border-left:2px solid #777; color:#777;
margin:0 0 5px; padding-left:5px">
<p dir="auto">Yes, indeed!<br>
<br>
The SBCL lisp compiler (not llvm based) used to emit
functions which would<br>
return either via ret to the usual instruction after the
call, or else load<br>
the return-address from the stack, then jump 2 bytes later
(which would<br>
skip over either a nop or a short jmp at original target
location). Which<br>
one it used depended upon whether the function was doing a
multi-valued<br>
return (in which case it used ret) or a single-valued
return (in which case<br>
it did the jmp retpc+2).<br>
<br>
While this seems like a clever and efficient hack, it
actually has an<br>
absolutely awful effect on performance, due to the
unpaired call vs return,<br>
and the unexpected return address.<br>
<br>
SBCL stopped doing this in 2006, a decade later than it
should've -- the<br>
Pentium1 MMX from 1997 already had a hardware return stack
which made this<br>
a really bad idea!<br>
<br>
What it does now is have the called function set or clear
the carry flag<br>
(using STC and CLC) immediately before the return. If the
caller cares,<br>
then the caller emits JNC as the first instruction after
the call. (but<br>
callers typically do not care -- most calls only consume a
single value,<br>
and any extra return-values are silently ignored).</p>
</blockquote>
</div>
<div style="white-space:normal">
<p dir="auto">On Swift, we've occasionally considered whether
it would be useful to be<br>
able to return values in flags. For example, you could
imagine returning<br>
a trinary comparison result on x86_64 based on whether ZF
and CF are set.<br>
A function which compares two pairs of unsigned numbers
could be compiled<br>
to something like:</p>
<pre style="background-color:#F7F7F7; border-radius:5px 5px 5px 5px; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px" bgcolor="#F7F7F7"><code style="background-color:#F7F7F7; border-radius:3px; margin:0; padding:0" bgcolor="#F7F7F7"> cmpq %rdi, %rdx
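 jnz end               # first elements differ: flags already hold the result
 cmpq %rsi, %rcx       # first elements equal: compare the second elements
end:
 ret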
</code></pre>
<p dir="auto">And the caller can switch over the values just
by testing the flags.</p>
<p dir="auto">The main problem is that this is really elegant
if you have an<br>
instruction that sets the flags exactly right and really
terrible<br>
if you don't. For example, if we want this function to
compare two<br>
pairs of <em>signed</em> numbers, we need to move OF to CF
without disturbing<br>
ZF, which I don't think is possible without some really ugly<br>
instruction sequences. (Or we could add
0x8000_0000_0000_0000 to both<br>
operands before the comparison, but that's terrible in its
own right.)</p>
<p dir="auto">That problem isn't as bad if it's just a single
boolean in ZF or CF, but<br>
it's still not great, at least on x86.</p>
<p dir="auto">Now, specialized purposes like SBCL's can
definitely still benefit from<br>
being able to return in a flag. If LLVM had had the ability
to return<br>
values in flags, we might've used it in Swift's coroutines
ABI, where<br>
(similar to SBCL) any particular return site does know
exactly which<br>
value it wants to return. So it'd be nice if someone was
interested in<br>
adding it.</p>
<p dir="auto">But we did ultimately decide that it wasn't even
worth prototyping it<br>
for the generic Swift CC.</p>
</div>
</div>
</blockquote>
<p>We've also got some cases where returning a value in a flag might
be useful. Our typical use case is a "rare, but not *that* rare"*
slow path which sometimes needs to run after a call to a runtime
function. Our other compiler(s) - which use hand-rolled assembly
for all of these bits - return the "take-rare" bit in ZF, and
branch on that after the call. For our LLVM-based system, we just
materialize the value into %rax and branch on that. That naive
scheme has been surprisingly not bad performance-wise.</p>
<p>* The "not *that* rare" part is needed to avoid having
exceptional unwinding be the right answer. <br>
</p>
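<p>A minimal C sketch of the register-based scheme described above,
for illustration only. The names <code>runtime_helper</code>,
<code>slow_path</code>, and <code>call_site</code> are hypothetical,
and the "rare" condition is made up; this is not actual runtime code.</p>
<pre>
```c
/* Counts slow-path invocations so the behaviour is observable. */
static int slow_path_runs = 0;

/* Hypothetical slow path; a real one would do the rare extra work. */
static void slow_path(void) { slow_path_runs += 1; }

/* Hypothetical runtime helper. It returns nonzero when the caller must
   take the rare path. As a plain C int, the result comes back in an
   ordinary return register (%eax on x86-64), not in ZF. */
static int runtime_helper(long x) {
    return x % 16 == 0;   /* invented condition: rare, but not that rare */
}

/* Call site: test the materialized value and branch on it. When the
   call is not inlined, this typically costs one extra test instruction
   compared with branching directly on a flag set by the callee. */
static void call_site(long x) {
    if (runtime_helper(x))
        slow_path();
}
```
</pre>
<p>The flag-returning variant would drop the explicit test at the call
site and branch on ZF directly, which plain C cannot express.</p>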
<p>If we were to support something like this, you'd really want to
be able to define individual flags in the callee's calling
convention clobber/preserve lists. It's really common to have a
helper routine which sets, say, ZF but leaves the others unchanged,
or a function which sets ZF, clobbers OF, and preserves all
others. But if we were going to do that, we'd quickly realize
that the x86 backend doesn't track individual flags at all, and
thus conclude it probably wasn't worth it to begin with. :)</p>
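<p>As a rough sketch of what per-flag clobber/preserve information
could look like, here is a small C model. The
<code>flag_convention</code> struct and <code>preserved_flags</code>
helper are hypothetical and do not correspond to anything in LLVM;
only the bit positions are taken from the x86 EFLAGS layout.</p>
<pre>
```c
/* x86 status flags as bit masks, using their EFLAGS bit positions. */
enum {
    FLAG_CF = 0x001,
    FLAG_PF = 0x004,
    FLAG_AF = 0x010,
    FLAG_ZF = 0x040,
    FLAG_SF = 0x080,
    FLAG_OF = 0x800,
    FLAG_ALL = FLAG_CF | FLAG_PF | FLAG_AF | FLAG_ZF | FLAG_SF | FLAG_OF
};

/* Hypothetical per-callee description of how status flags are treated. */
struct flag_convention {
    unsigned returned;    /* flags that carry a return value */
    unsigned clobbered;   /* flags left in an unspecified state */
};

/* Every status flag that is neither returned nor clobbered is preserved. */
static unsigned preserved_flags(struct flag_convention c) {
    return FLAG_ALL & ~(c.returned | c.clobbered);
}
```
</pre>
<p>For the "sets ZF, clobbers OF, preserves all others" case,
<code>returned</code> would be <code>FLAG_ZF</code> and
<code>clobbered</code> would be <code>FLAG_OF</code>, leaving CF, PF,
AF, and SF preserved.</p>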
<p>Philip<br>
</p>
</body>
</html>