<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><p style="color:#000;">On April 25, 2014 at 10:48:18 AM, Reid Kleckner (<a href="mailto:rnk@google.com">rnk@google.com</a>) wrote:</p> <div><blockquote type="cite" class="clean_bq" style="color: rgb(0, 0, 0); font-family: Helvetica, Arial; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255);"><span><div><div></div><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Apr 25, 2014 at 10:19 AM, Filip Pizlo<span class="Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:fpizlo@apple.com" target="_blank">fpizlo@apple.com</a>></span><span class="Apple-converted-space"> </span>wrote:<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div style="word-wrap: break-word;"><p>The sdiv operation in LLVM IR only makes sense for C and its very direct relatives. The amount of control flow necessary to represent a safe division for any other language is ghastly. (a/b) becomes something like (b != 0 ? ((a != INT_MIN || b != -1) ? a / b : INT_MIN) : 0). It's more useful to the optimizer to see this as the single operation that it really is.</p></div></blockquote><div>This argument makes sense to me. It would be hard for the arm64 backend to pattern match all this control flow back into a single instruction after optimizations.</div><div><br></div><div>Are there ISAs out there that produce different results without trapping for these boundary conditions? Should we leave the quotient result unspecified in the b == 0 || (a == INT_MIN && B == -1) cases? If we do that, we require frontends to check the "overflow" bit if they want deterministic cross-platform behavior.</div></div></div></div></div></div></span></blockquote></div><p>That's a good question. I believe that all ISAs go to one of the following paths:</p><p>- Trap on x/0 and INT_MIN/-1 like x86.</p><p>- Return 0 for x/0 and INT_MIN for INT_MIN/-1 like ARM.</p><p>Also, here's what I know about what different languages do:</p><p>C/C++/etc: undef on x/0 and INT_MIN/-1, so they would use LLVM's existing div instruction.</p><p>JavaScript: optimizing compilers will look for places where people say things like (a/b)|0 - in JavaScript saying x|0 means "truncate to int32", and they will do an integer division if the inputs were already represented as integers. x/0 will produce NaN and NaN|0 produces 0; INT_MIN/-1 produces -INT_MIN (which is representable as double) and then (-INT_MIN)|0 produces INT_MIN. So, the ARM64 semantics match optimized JavaScript semantics. There is also the case that you say a/b in JavaScript, and we have inferred that a and b are integers but there is no |0. In that case we want to detect when x/0 and INT_MIN/-1 so that we can promote the result to double.</p><p>Java: x/0 throws an exception but INT_MIN/-1 produces INT_MIN.</p><p>Ruby and probably a lot of others: provide BigInt semantics but have fast paths for small ints. So, these will throw an exception on x/0 and they will inflate to a BigInt on INT_MIN/-1.</p><p>I don't know about other languages.</p><p>Looking at this, it makes me *slightly* prefer to generalize these intrinsics a bit. Instead of returning an i1 flag for all overflow cases, you could return an i2 status enum that says 0 for neither overflow condition, 1 for x/0, and 2 for INT_MIN/-1. Then, you get the following matrix for how Java and JavaScript would map to this and how it would be lowered:</p><p>- Java on x86: use safediv and check if the if the status is 1. If it's 1 then throw. This lowers to a bunch of branches fairly late in the backend.</p><p>- JavaScript on x86 with |0: use safediv and ignore the status. Also loweres to a bunch of branches, which is as good as as it gets.</p><p>- JavaScript on x86 without |0: use safediv and deoptimize if status != 0. Also a bunch of branches.</p><p>- Ruby on x86: use safediv and detect both status==1 and status==2. This lowers to a bunch of branches.</p><p>- Java on ARM: use safediv as on x86, but not it lowers to just a branch to detect x/0 and not the other case. Note that without the safediv intrinsic, you wouldn't be able to get this to one branch unless you used a target intrinsic.</p><p>- JavaScript on ARM with |0: use safediv and you ignore the status and you get no branches. It maps directly to ARM.</p><p>- JavaScript on ARM without |0: use safediv and test the status, so you get a bunch of branches as before.</p><p>- Ruby on ARM: this lowers to a bunch of branches as on x86.</p><p>So, it seems that for JavaScript and Java on ARM - and probably any environment with emulated division, these intrinsics would result in tighter code without requiring target intrinsics. For all other language/hardware combinations, the main benefit here is to materialize the gnarly control flow later and so hopefully reveal more optimization opportunities. That seems like a pretty good sweet-spot for the systems that I know about.</p><p>-Filip</p><p><br></p><div><blockquote type="cite" class="clean_bq" style="color: rgb(0, 0, 0); font-family: Helvetica, Arial; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255);"><span><div><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"></div></div></div></div></div></span></blockquote></div></body></html>