[LLVMdev] Add a 'notrap' function attribute?
Pekka Jääskeläinen
pekka.jaaskelainen at tut.fi
Thu Oct 31 07:38:42 PDT 2013
Hello,
OpenCL C specifies that instructions should not trap (it is "discouraged"
in the specs). If they do, it's vendor-specific how the hardware
exceptions are handled.
It might also be the case with some other (future) languages targeting
"streamlined" parallel accelerators in a heterogeneous setting.
At least CUDA comes to mind. What about OpenACC and the new OpenMP,
does someone know offhand?
It would help several optimizations if they could assume certain
instructions do not trap. E.g., I was looking at the if-conversion of
the loop vectorizer, and it seems not to support speculating stores,
divs, etc., which could be done if we knew it was safe to speculatively
execute them.
[In this particular if-conversion case, proper predicated execution
(not speculative) would require predicates to be added for all LLVM
instructions so they could be squashed. I think this was discussed
several years ago in the context of a generic IR-level if-conversion
pass, but it seems such a pass never materialized.]
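To make the if-conversion case concrete, here is a minimal C-level sketch of
what the transformation amounts to. The function names and the guard array
are hypothetical, chosen only for illustration; the vectorizer works on
LLVM IR, not on C source. The second function is equivalent to the first
only under the assumption that the division cannot trap when the guard is
false and that an unconditional store to out[i] is safe.

```c
#include <stddef.h>

/* Original form: the division and the store execute only when the
 * guard is nonzero, so b[i] == 0 is harmless wherever guard[i] == 0. */
void guarded_div(int *out, const int *a, const int *b,
                 const int *guard, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (guard[i])
            out[i] = a[i] / b[i];
    }
}

/* If-converted form (illustrative): the division is speculated and
 * the result is selected afterwards. The store is also executed on
 * every iteration. This is only a valid rewrite if a[i] / b[i] is
 * known not to trap for the lanes where guard[i] == 0. */
void speculated_div(int *out, const int *a, const int *b,
                    const int *guard, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int q = a[i] / b[i];            /* executed unconditionally */
        out[i] = guard[i] ? q : out[i]; /* select keeps the old value */
    }
}
```

On a host where integer division by zero traps, the two functions agree
only when every b[i] is nonzero (and no INT_MIN / -1 case occurs), which
is exactly the guarantee the optimizer lacks today.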
Anyway, "speculative" if-conversion is just one example where knowing
that traps need not be considered in the function at hand
would help the optimizations. Other speculative code motion
optimizations, e.g., LICM, could also benefit from it.
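The LICM case can be sketched the same way. The functions below are
hypothetical and only illustrate the idea at the C level: the division is
loop-invariant, but in the original it runs only when some flag is set, so
hoisting it executes a division the program might never have performed.
That hoist is safe only if the division is assumed not to trap.

```c
#include <stddef.h>

/* Original: x / y is loop-invariant, but it is evaluated only on
 * iterations where flag[i] is nonzero (possibly never). */
int sum_guarded(const int *flag, size_t n, int x, int y)
{
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (flag[i])
            sum += x / y;
    }
    return sum;
}

/* After hoisting (illustrative): the division runs exactly once,
 * before the loop, even if n == 0 or no flag is set. This is only a
 * valid rewrite if x / y is known not to trap. */
int sum_hoisted(const int *flag, size_t n, int x, int y)
{
    int q = x / y; /* speculated: executed unconditionally */
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (flag[i])
            sum += q;
    }
    return sum;
}
```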
One way would be to introduce a new function attribute. Functions (e.g.,
OpenCL C or CUDA kernels) could be marked with an attribute that states
that the instructions can be assumed not to trap -- it's a programmer's or
the runtime's mistake if they do. The runtime should change the fp
computation mode to the non-trapping one before calling such
a function (this is actually stated in the OpenCL specs). If such
handling is not supported by the target, then the attribute should not
be added in the first place.
The attribute could be called 'notrap', which would cover traps
caused by any instruction. Alternatively, it could be split, in case
the hardware is known not to support one of the features. Three
attributes could suffice: 'nofptrap' (no IEEE FP exceptions),
'nodivtrap' (no divide-by-zero exceptions, undef value output instead),
'nomemtrap' (no memory exceptions).
What do you think of the general idea? Or is there something similar
already that can accomplish this?
Thanks in advance,
--
Pekka