<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">


    <div class="moz-cite-prefix">On 7/23/19 8:42 PM, John McCall via

      llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:C38CF203-CD9F-4938-9196-790681C800F9@apple.com">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <div style="font-family:sans-serif">

        <div style="white-space:normal">

          <p dir="auto">On 21 Jul 2019, at 12:29, James Y Knight via

            llvm-dev wrote:</p>

        </div>

        <div style="white-space:normal">

          <blockquote style="border-left:2px solid #777; color:#777;

            margin:0 0 5px; padding-left:5px">

            <p dir="auto">Yes, indeed!<br>

              <br>

              The SBCL lisp compiler (not llvm based) used to emit

              functions which would<br>

              return either via ret to the usual instruction after the

              call, or else load<br>

              the return-address from the stack, then jump 2 bytes later

              (which would<br>

              skip over either a nop or a short jmp at original target

              location). Which<br>

              one it used depended upon whether the function was doing a

              multi-valued<br>

              return (in which case it used ret) or a single-valued

              return (in which case<br>

              it did the jmp retpc+2).<br>

              <br>

              While this seems like a clever and efficient hack, it

              actually has an<br>

              absolutely awful effect on performance, due to the

              unpaired call vs return,<br>

              and the unexpected return address.<br>

              <br>

              SBCL stopped doing this in 2006, a decade later than it

              should've -- the<br>

              Pentium1 MMX from 1997 already had a hardware return stack

              which made this<br>

              a really bad idea!<br>

              <br>

              What it does now is have the called function set or clear

              the carry flag<br>

              (using STC and CLC) immediately before the return. If the

              caller cares,<br>

              then the caller emits JNC as the first instruction after

              the call. (but<br>

              callers typically do not care -- most calls only consume a

              single value,<br>

              and any extra return-values are silently ignored).</p>

          </blockquote>

        </div>

        <div style="white-space:normal">

          <p dir="auto">On Swift, we've occasionally considered whether

            it would be useful to be<br>

            able to return values in flags. For example, you could

            imagine returning<br>

            a trinary comparison result on x86_64 based on whether ZF

            and CF are set.<br>

            A function which compares two pairs of unsigned numbers

            could be compiled<br>

            to something like:</p>

          <pre style="background-color:#F7F7F7; border-radius:5px 5px 5px 5px; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px" bgcolor="#F7F7F7"><code style="background-color:#F7F7F7; border-radius:3px; margin:0; padding:0" bgcolor="#F7F7F7">  cmpq %rdi, %rdx   # compare the first elements of the pairs
  jnz end           # if they differ, the flags already hold the result
  cmpq %rsi, %rcx   # otherwise the second elements decide
end:
  ret
</code></pre>

          <p dir="auto">And the caller can switch over the values just

            by testing the flags.</p>

          <p dir="auto">The main problem is that this is really elegant

            if you have an<br>

            instruction that sets the flags exactly right and really

            terrible<br>

            if you don't. For example, if we want this function to

            compare two<br>

            pairs of <em>signed</em> numbers, we need to move OF to CF

            without disturbing<br>

            ZF, which I don't think is possible without some really ugly<br>

            instruction sequences. (Or we could add

            0x8000_0000_0000_0000 to both<br>

            operands before the comparison, but that's terrible in its

            own right.)</p>

          <p dir="auto">That problem isn't as bad if it's just a single

            boolean in ZF or CF, but<br>

            it's still not great, at least on x86.</p>

          <p dir="auto">Now, specialized purposes like SBCL's can

            definitely still benefit from<br>

            being able to return in a flag. If LLVM had had the ability

            to return<br>

            values in flags, we might've used it in Swift's coroutines

            ABI, where<br>

            (similar to SBCL) any particular return site does know

            exactly which<br>

            value it wants to return. So it'd be nice if someone was

            interested in<br>

            adding it.</p>

          <p dir="auto">But we did ultimately decide that it wasn't even

            worth prototyping it<br>

            for the generic Swift CC.</p>

        </div>

      </div>

    </blockquote>

    <p>We've also got some cases where returning a value in a flag might
      be useful.  Our typical use case is a "rare, but not *that*
      rare"* slowpath which sometimes needs to run after a call to a
      runtime function.  Our other compiler(s) - which use hand-rolled
      assembly for all of these bits - return the "take-rare" bit in
      ZF, and branch on that after the call.  For our LLVM-based
      system, we just materialize the value into %rax and branch on
      that.  That naive scheme has been surprisingly not bad
      performance-wise.</p>

    <p>* The "not *that* rare" part is needed to avoid having

      exceptional unwinding be the right answer.  <br>

    </p>
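    <p>To make the comparison concrete, here is a rough sketch of the
      two call sequences in GNU assembler syntax.  The names
      runtime_helper and slow_path are made up for illustration, and
      the polarity (ZF set, or a nonzero %rax, meaning "take the
      slowpath") is an assumption; this is not the code either
      compiler actually emits.</p>
    <pre><code>  # Hand-rolled scheme: the callee reports "take-rare" in ZF.
  callq runtime_helper    # hypothetical helper; assumed to set ZF when the slowpath is needed
  jz    slow_path         # branch directly on the returned flag

  # LLVM-based scheme: the callee returns the bit as an ordinary value.
  callq runtime_helper    # hypothetical helper; assumed to return 0 or nonzero in %rax
  testq %rax, %rax        # re-derive ZF from the returned value
  jnz   slow_path         # nonzero means the slowpath is needed
</code></pre>
    <p>In this sketch the only difference at the call site is the extra
      test instruction.</p>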

    <p>If we were to support something like this, you'd really want to
      be able to define individual flags in the callee's calling
      convention clobber/preserve lists.  It's really common to have a
      helper routine which sets, say, ZF, but leaves the others
      unchanged.  Or to have a function which sets ZF, clobbers OF, and
      preserves all the others.  But if we were going to do that, we'd
      quickly realize that the x86 backend doesn't track individual
      flags at all, and thus conclude it probably wasn't worth it to
      begin with.  :)</p>

    <p>Philip<br>

    </p>


  </body>

</html>