[PATCH] D135451: [TTI] New PPC target hook enableUncondDivisionSpeculation

Fri Oct 7 15:16:12 PDT 2022

efriedma added a comment.

> As I understand it integer divide by zero is considered undefined behaviour in C while the same may not be true in other languages. Furthermore presence of side effects may be dependent on other configurations including target hardware (eg default treatment on Power vs x86).

The LLVM IR rule is driven more by the behavior of the instructions on various targets. If a target only has a trapping divide, we'd need to wrap it in control flow to implement a non-trapping divide.  And particularly for signed divide, that check isn't cheap, since it has to cover both a zero divisor and the INT_MIN / -1 overflow. We tend to prefer poison where it makes sense (for example, out-of-bounds shifts).
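
To make the cost concrete, here is a rough C++ sketch of what "wrapping the divide in control flow" looks like at the source level. The helper names `guarded_sdiv` and `guarded_udiv` are hypothetical, and returning 0 on the excluded inputs is an arbitrary choice for illustration, not something LLVM or any target defines.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: a signed divide that never reaches the hardware
// divide with an input that could trap. The fallback value 0 is an
// arbitrary choice made for this sketch.
int32_t guarded_sdiv(int32_t n, int32_t d) {
    // Signed division has two inputs to exclude:
    //   1. d == 0
    //   2. n == INT32_MIN && d == -1 (the quotient is not representable)
    if (d == 0 || (n == std::numeric_limits<int32_t>::min() && d == -1))
        return 0;
    return n / d;
}

// Hypothetical helper: the unsigned case only needs the zero check.
uint32_t guarded_udiv(uint32_t n, uint32_t d) {
    return d == 0 ? 0 : n / d;
}
```

The signed version needs up to three comparisons before the divide, which is the kind of overhead every division would pay if the IR defined a non-trapping result.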

Frontends can always use control flow to get whatever user-visible behavior they want.  (For example, the Rust divide operator panics if you divide by zero.)

> To allow more flexibility we could either leave the IR neutral and let the optimizer decide based on config info (eg. TTI)

We try to avoid making core IR semantics depend on TTI.  Not that we can completely ignore target differences when writing IR optimizations, but we want to keep IR understandable without reference to target-specific semantics.

I mean, it would be self-consistent to write in LangRef something like "whether division by zero is undefined behavior, or defined to produce a poison value, depends on the current target/current subtarget/bitwidth of the operation/current moon cycle".  But I don't want to go there.  If the rules are the same across all targets, it's easier to understand, and easier to implement tools like Alive2 to validate transforms.

>> In the original example, instead of trying to make the divide hoistable, you could teach LLVM to peel the first iteration of the loop, then CSE the divide.
>
> I don't think that would work in general, for example if the loop had unknown bounds, because peeling in such cases would still require the peeled iteration to be conditionally executed.

Any loop can be peeled (as long as the body doesn't contain some exotic construct that inhibits cloning); it's basically just cloning the loop body.  And if the divide dominates the latch before peeling, it will dominate the peeled loop after peeling.
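
As a source-level illustration of the peel-then-CSE idea, the sketch below shows a loop with an invariant divide before and after peeling one iteration by hand. The functions `sum_before` and `sum_peeled` are hypothetical and are not the example from the patch; they only assume a loop whose trip count is unknown at compile time and whose divide executes on every iteration.

```cpp
// Hypothetical "before" loop: x / y is loop-invariant and dominates the
// latch, but it cannot simply be hoisted above the loop because the loop
// may run zero times, in which case the divide must not execute.
long sum_before(const int* a, int n, int x, int y) {
    long s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i] + x / y;
    return s;
}

// Hypothetical "after" form: the first iteration is peeled (cloned) and
// still guarded by the loop's entry condition. The divide in the peeled
// copy dominates the remaining loop, so the loop reuses its result.
long sum_peeled(const int* a, int n, int x, int y) {
    long s = 0;
    if (0 < n) {
        int q = x / y;          // runs only if the original loop ran at least once
        s += a[0] + q;
        for (int i = 1; i < n; ++i)
            s += a[i] + q;      // no divide left inside the loop
    }
    return s;
}
```

The peeled iteration is still conditionally executed, but that is fine: the divide sits under the same guard, so nothing is speculated, and the remaining loop no longer contains a divide at all.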

The "general" problem is really the case where peeling is too expensive.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135451/new/

https://reviews.llvm.org/D135451