[LLVMdev] [Proposal] Speculative execution of function calls

Nick Lewycky nlewycky at google.com
Thu Aug 8 19:55:00 PDT 2013


On 31 July 2013 05:32, Renato Golin <renato.golin at linaro.org> wrote:

> On 31 July 2013 11:56, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
>
>> The slightly orthogonal question to safety is the cost of execution.  For
>> most intrinsics that represent CPU instructions, executing them
>> speculatively is cheaper than a conditional jump, but this is not the case
>> for all (for example, some forms of divide instructions on in-order RISC
>> processors).  For other functions, it's even worse because the cost may be
>> dependent on the input.  Consider as a trivial example the well-loved
>> recursive Fibonacci function.  This is always safe to call speculatively,
>> because it only touches local variables.  It is, however, probably never a
>> good idea to do so.  It's also likely that the cost of a real function call
>> is far more expensive than the elided jump, although this may not be the
>> case on GPUs where divergent flow control is more expensive than redundant
>> execution.  Making this decision requires knowledge of both the target
>> architecture and the complexity of the function, which may be dependent on
>> its inputs.  Even in your examples, some of the functions are only safe to
>> speculatively execute for some subset of their inputs, and you haven't
>> proposed a way of determining this.
>>
>
> David,
>
> If I got it right, this is a proposal for a framework to annotate
> speculative-safe functions, not a pass that will identify all cases. So,
> yes, different back-ends can annotate their safe intrinsics, front-ends can
> annotate their safe calls, and it'll always be a small subset, as with most
> other optimizations.
>
> As for letting optimization passes use that info, well, it could in theory
> be possible to count the number of instructions on the callee, and make
> sure it has no other calls, side-effects or undefined behaviour, and again,
> that would have to be very conservative.
>

Correct. It's important to remember that safety and cost are two different
things.
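
To make the distinction concrete, here is a minimal C++ sketch built on
David's Fibonacci example. The names (fib, guarded_fib, speculated_fib,
g_calls) are hypothetical, and the call counter is instrumentation only; a
real readnone function would not touch a global. Both forms return the same
value, so hoisting the call is safe, but the hoisted form pays for the whole
recursion even when the result is thrown away.

```cpp
#include <cassert>
#include <cstdint>

// Instrumentation only (an assumption for this sketch): counts calls so the
// cost of speculation is observable. It is not part of the function's logic.
static std::uint64_t g_calls = 0;

// Otherwise pure: depends only on its argument and cannot trap or throw.
std::uint64_t fib(unsigned n) {
    ++g_calls;
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Original form: the call executes only when cond is true.
std::uint64_t guarded_fib(bool cond, unsigned n) {
    std::uint64_t r = 0;
    if (cond)
        r = fib(n);
    return r;
}

// Speculated form: the call is hoisted above the branch and the result is
// selected afterwards. Same return value, but fib always runs.
std::uint64_t speculated_fib(bool cond, unsigned n) {
    std::uint64_t t = fib(n);
    return cond ? t : 0;
}
```

With cond false, guarded_fib makes no calls to fib at all, while
speculated_fib makes all of them. Whether that trade is worth an elided
branch is a target-dependent cost question, separate from whether the hoist
is legal.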

The issue Michael raised is a great one: can we safely assume no undefined
behaviour? Do we have intrinsics today which will exhibit undefined
behaviour if called with certain arguments? Absolutely, @llvm.memcpy for
one. But is there one which is "nounwind readnone" yet can't be safely
speculated? It may be that all nounwind+readnone intrinsics are also
speculatable, but it would help justify this patch if we had a
counterexample to point to.

That aside, we have a situation where nounwind+readnone intrinsics may be
speculated, but nounwind+readnone calls may not. This doesn't quite solve
the problem for all users. LLVM may be used in an environment which
provides intrinsic-like functions as part of the language, but which for
whatever reason don't belong as intrinsics (either because they aren't core
LLVM and can't be upstreamed -- a mythical get_gpu_context -- or because
they aren't part of a single target). I don't have this problem myself
though. Michael,
could you give concrete examples from OpenCL?
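
As a rough illustration of why an ordinary nounwind+readnone call cannot be
speculated by default, consider this C++ sketch. The function names are
hypothetical and this is not taken from the patch. quotient reads no memory
and never throws, but it has undefined behaviour when b is zero, so hoisting
it above the guard changes the meaning of the program.

```cpp
#include <cassert>

// Reads and writes no memory and cannot throw, so it would qualify as
// "nounwind readnone". It is still undefined for b == 0.
unsigned quotient(unsigned a, unsigned b) {
    return a / b;
}

// Original form: the guard ensures quotient is never called with b == 0.
unsigned guarded_div(unsigned a, unsigned b) {
    return b != 0 ? quotient(a, b) : 0;
}

// What hoisting the call would produce. This is NOT equivalent: quotient now
// runs unconditionally, so hoisted_div(a, 0) has undefined behaviour even
// though its result would be discarded.
unsigned hoisted_div(unsigned a, unsigned b) {
    unsigned t = quotient(a, b);
    return b != 0 ? t : 0;
}
```

The two forms agree whenever b is non-zero. They differ only on the inputs
the guard was there to exclude, which is why some explicit marker of
speculation safety seems necessary for calls.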

Nick