[LLVMdev] Integer divide by zero
Cameron McInally
cameron.mcinally at nyu.edu
Sun Apr 7 21:28:48 PDT 2013
Well put, Chandler!
On Sun, Apr 7, 2013 at 6:23 PM, Chandler Carruth <chandlerc at google.com> wrote:
> I think this entire conversation is going a bit off the rails. Let's try
> to stay focused on the specific request, and why there (may) be problems
> with it.
>
> On Sun, Apr 7, 2013 at 11:50 AM, Cameron McInally <
> cameron.mcinally at nyu.edu> wrote:
>
>>> To be clear, you're asking to turn off a set of optimizations. That
>>> is, you're asking to make code in general run slower, so that you can
>>> get a particular behavior on some CPUs in unusual cases.
>>>
>>
>> I respectfully disagree. I am asking for an *option* to turn off a set of
>> optimizations; not turn off optimizations in general. I would like to make
>> it easy for a compiler implementor to choose the desired behaviour. I
>> whole-heartedly believe that both behaviours (undefined and trap) have
>> merit.
>>
>
> I think you're both misconstruing what this would involve.
>
> You're actually asking for the formal model of the LLVM IR to be
> *parameterized*. In one mode, an instruction would produce undefined
> behavior on division, and in another mode it would produce a trap. Then you
> are asking for the optimizer stack to support either semantic model, and
> produce efficient code regardless.
>
> This is completely intractable for LLVM to support. It would make both the
> optimizers and the developers of LLVM crazy to have deep parameterization
> of the fundamental semantic model for integer division.
>
> The correct way to support *exactly* reproducing the architectural
> peculiarities of the x86-64 integer divide instruction is to add a
> target-specific intrinsic which does this. It will have defined behavior
> (of trapping in some cases) as you want, and you can emit this in your FE
> easily. However, even this has the risk of incurring a high maintenance
> burden. If you want much in the way of optimizations of this intrinsic,
> you'll have to go through the optimizer and teach each pass about your
> intrinsic. Some of these will be easy, but some will be hard and there will
> be a *lot* of them. =/
>
>
> Cameron, you (and others interested) will certainly need to provide all of
> the patches and work to support this if you think this is an important use
> case, as the existing developers have found other trade-offs and solutions.
> And even then, if it requires really substantial changes to the optimizer,
> I'm not sure it's worth pursuing this in LLVM. My primary concerns are
> two-fold. First, I think that the amount of work required to recover the
> optimizations which could theoretically apply to both of these operations
> will be massive. Second, I fear that after having done this work, you will
> immediately find the need to remove some other undefined behavior from the
> IR which happens to have defined behavior on x86-64.
>
Alas, I must have been shortsighted. For my purposes, I had envisioned
using this target-specific intrinsic only when undefined behaviour was
imminent. That information is available before the IR and it would
work around the constant folder. I did not anticipate needing optimizations
around that intrinsic, since it would ultimately trap.
Supporting the intrinsic as a proper alternative to the integer division
operator(s) sounds like a lot of work. I do not believe that the reward is
worth the effort, at least for my purposes. Others may feel differently.
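To illustrate the narrow use described above (forcing the real divide instruction so that nothing folds it away), here is a rough C sketch. It is not the LLVM intrinsic discussed in this thread, and no such intrinsic is named here. It assumes GCC- or Clang-style extended inline assembly on x86-64; the function name hw_sdiv32 is hypothetical, and on other targets the sketch simply falls back to the ordinary division operator.

```c
/* Hypothetical helper: always issues the hardware idiv on x86-64, so the
   division cannot be constant-folded. On that target, dividing by zero or
   computing INT_MIN / -1 raises the hardware divide error (#DE). */
static int hw_sdiv32(int a, int b) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    int quot, rem;
    __asm__ volatile(
        "cltd\n\t"          /* sign-extend eax into edx:eax */
        "idivl %[d]"        /* eax = quotient, edx = remainder */
        : "=a"(quot), "=&d"(rem)   /* edx is early-clobbered by cltd */
        : "a"(a), [d] "r"(b)
        : "cc");
    (void)rem;
    return quot;
#else
    /* Assumption: no special behaviour requested on other targets. */
    return a / b;
#endif
}
```

Because the asm statement is opaque to the optimizer, nothing around it is simplified either, which is the maintenance and performance trade-off raised earlier in the thread.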
> Fundamentally, the idea of undefined behavior is at the core of the design
> of LLVM's optimizers. It is leveraged everywhere, and without it many
> algorithms that are fast would become slow, transformations that are cheap
> would become expensive, passes that operate locally would be forced to
> operate across ever growing scopes in order to be certain the optimizations
> applied in this specific case. Trying to remove undefined behavior from
> LLVM seems unlikely to be a productive pursuit.
>
Fair enough.
> More productive (IMO) is to emit explicit guards against the undefined
> behavior in your language, much as -fsanitize does for undefined behavior
> in C++. Then work to build a mode where a specific target can take
> advantage of target specific trapping behaviors to emit these guards more
> efficiently. This will allow LLVM's optimizers to continue to function in
> the world they were designed for, and with a set of rules that we know how
> to build efficient optimizers around, and your source programs can operate
> in a world with checked behavior rather than undefined behavior. As a
> useful side-effect, you can defer the target-specific optimizations until
> you have benchmarks (internally is fine!) and can demonstrate the
> performance problems (if any).
>
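As a minimal sketch of the explicit-guard approach described above, a front end could lower each signed division to something like the following C. The helper names sdiv_is_defined and checked_sdiv are hypothetical and not taken from LLVM or from -fsanitize; the two guarded cases (zero divisor, and INT_MIN divided by -1) are the ones that are undefined for signed integer division.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: nonzero when a / b is well defined for int,
   i.e. no divide by zero and no INT_MIN / -1 overflow. */
static int sdiv_is_defined(int a, int b) {
    if (b == 0)
        return 0;
    if (a == INT_MIN && b == -1)
        return 0;
    return 1;
}

/* Hypothetical checked division: writes the quotient to *out and returns 1
   when the division is defined; otherwise returns 0 and leaves *out
   untouched. A front end could branch to its own trap path on failure. */
static int checked_sdiv(int a, int b, int *out) {
    assert(out != NULL);
    if (!sdiv_is_defined(a, b))
        return 0;
    *out = a / b;   /* only reached when the result is defined */
    return 1;
}
```

With the guard in place the optimizer never sees a division with undefined behaviour, so it remains free to fold or simplify the defined path as usual.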
Regrettably, this implementation does not suit my needs. The constant
folding would still occur and I would like to produce the actual division,
since the instruction is non-maskable on x86. Others may have a better use
for this implementation though, so I don't want to shoot the idea down for
everyone.
> Cameron, you may disagree, but honestly if you were to convince folks here
> I think it would have happened already. I'm not likely to continue the
> theoretical debate about whether LLVM's stance on UB (as I've described
> above) is a "good" or "bad" stance. Not that I wouldn't enjoy the debate
> (especially at a bar some time), but because I fear it isn't a productive
> way to spend the time of folks on this list. So let's try to stick to the
> technical discussion of strategies, costs, and tradeoffs.
>
Oh, no. Your analysis was thorough and I can sympathize with it. The seams
of C/C++ and the x86 architecture are foggy. I understand that my
interpretation of their interactions is not gospel.
Thanks again for the thoughtful reply!
-Cameron