[LLVMdev] Proposal: add intrinsics for safe division

Filip Pizlo fpizlo at apple.com
Fri Apr 25 11:49:18 PDT 2014


On April 25, 2014 at 10:48:18 AM, Reid Kleckner (rnk at google.com) wrote:

On Fri, Apr 25, 2014 at 10:19 AM, Filip Pizlo <fpizlo at apple.com> wrote:
The sdiv operation in LLVM IR only makes sense for C and its very direct relatives.  The amount of control flow necessary to represent a safe division for any other language is ghastly.  (a/b) becomes something like (b != 0 ? ((a != INT_MIN || b != -1) ? a / b : INT_MIN) : 0).  It's more useful to the optimizer to see this as the single operation that it really is.

This argument makes sense to me. It would be hard for the arm64 backend to pattern match all this control flow back into a single instruction after optimizations.

Are there ISAs out there that produce different results without trapping for these boundary conditions? Should we leave the quotient result unspecified in the b == 0 || (a == INT_MIN && B == -1) cases? If we do that, we require frontends to check the "overflow" bit if they want deterministic cross-platform behavior.
That's a good question.  I believe that all ISAs go to one of the following paths:

- Trap on x/0 and INT_MIN/-1 like x86.

- Return 0 for x/0 and INT_MIN for INT_MIN/-1 like ARM.

Also, here's what I know about what different languages do:

C/C++/etc: undef on x/0 and INT_MIN/-1, so they would use LLVM's existing div instruction.

JavaScript: optimizing compilers will look for places where people say things like (a/b)|0 - in JavaScript saying x|0 means "truncate to int32", and they will do an integer division if the inputs were already represented as integers.  x/0 will produce NaN and NaN|0 produces 0; INT_MIN/-1 produces -INT_MIN (which is representable as double) and then (-INT_MIN)|0 produces INT_MIN.  So, the ARM64 semantics match optimized JavaScript semantics.  There is also the case that you say a/b in JavaScript, and we have inferred that a and b are integers but there is no |0.  In that case we want to detect when x/0 and INT_MIN/-1 so that we can promote the result to double.

Java: x/0 throws an exception but INT_MIN/-1 produces INT_MIN.

Ruby and probably a lot of others: provide BigInt semantics but have fast paths for small ints.  So, these will throw an exception on x/0 and they will inflate to a BigInt on INT_MIN/-1.

I don't know about other languages.

Looking at this, it makes me *slightly* prefer to generalize these intrinsics a bit.  Instead of returning an i1 flag for all overflow cases, you could return an i2 status enum that says 0 for neither overflow condition, 1 for x/0, and 2 for INT_MIN/-1.  Then, you get the following matrix for how Java and JavaScript would map to this and how it would be lowered:

- Java on x86: use safediv and check if the if the status is 1.  If it's 1 then throw.  This lowers to a bunch of branches fairly late in the backend.

- JavaScript on x86 with |0: use safediv and ignore the status.  Also loweres to a bunch of branches, which is as good as as it gets.

- JavaScript on x86 without |0: use safediv and deoptimize if status != 0.  Also a bunch of branches.

- Ruby on x86: use safediv and detect both status==1 and status==2.  This lowers to a bunch of branches.

- Java on ARM: use safediv as on x86, but not it lowers to just a branch to detect x/0 and not the other case.  Note that without the safediv intrinsic, you wouldn't be able to get this to one branch unless you used a target intrinsic.

- JavaScript on ARM with |0: use safediv and you ignore the status and you get no branches.  It maps directly to ARM.

- JavaScript on ARM without |0: use safediv and test the status, so you get a bunch of branches as before.

- Ruby on ARM: this lowers to a bunch of branches as on x86.

So, it seems that for JavaScript and Java on ARM - and probably any environment with emulated division, these intrinsics would result in tighter code without requiring target intrinsics.  For all other language/hardware combinations, the main benefit here is to materialize the gnarly control flow later and so hopefully reveal more optimization opportunities.  That seems like a pretty good sweet-spot for the systems that I know about.

-Filip



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140425/707883f3/attachment.html>


More information about the llvm-dev mailing list