[PATCH] D135451: [TTI] New PPC target hook enableUncondDivisionSpeculation

Fri Oct 7 14:32:43 PDT 2022

bmahjour added a comment.

In D135451#3843135 <https://reviews.llvm.org/D135451#3843135>, @efriedma wrote:

> In D135451#3843064 <https://reviews.llvm.org/D135451#3843064>, @alexgatea wrote:
>
>> In D135451#3842992 <https://reviews.llvm.org/D135451#3842992>, @fhahn wrote:
>>
>>> IIUC this proposal would effectively re-define `udiv` and `urem`'s semantics on the IR level to not have undefined behavior for PPC?
>>
>> I don't think that's quite correct. We still view them as undefined,
>
> division by zero is currently undefined behavior at the IR level; if your program would execute it, it has no meaning at all.  So hoisting a divide will interact badly with other optimizations; for example, instcombine will currently turn a divide by zero into "unreachable".  This is different from instructions that return poison.
>
> If you want a version of division that returns a poison value, you need to modify the semantics in LangRef.

As I understand it, integer divide by zero is considered undefined behaviour in C, while the same may not be true in other languages. Furthermore, the presence of side effects may depend on other configuration, including the target hardware (e.g. the default treatment on Power vs. x86). LLVM seems to distinguish the concept of "undefined behaviour" from "undefined value", treating the former as more consequential than the latter. It currently treats div-by-zero as undefined behaviour, but that may be an overly pessimistic treatment, as demonstrated in this review. Baking assumptions about the source language or target hardware into the LLVM IR gets us into a situation where we have to sacrifice performance for some combinations to ensure functional correctness for others. To allow more flexibility, we could either leave the IR neutral and let the optimizer decide based on config info (e.g. TTI), or separate the undefined behaviour/value semantics from the udiv/urem instructions (e.g. using an instruction flag). I think this revision takes the first approach and what you are suggesting is the second. I agree the second approach is cleaner and might be necessary given the historic assumptions made in this regard, although it would be a larger effort.
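
To make the difference between the two approaches concrete, here is a toy sketch of where the "is it OK to speculate this divide?" decision would live in each case. This is not LLVM code: `ToyTargetInfo`, `ToyDivInst`, `divByZeroIsUndefValue` and both `canHoistDiv...` helpers are hypothetical names invented for illustration. Only the name `enableUncondDivisionSpeculation` is borrowed from this revision, and here it is modelled as a plain boolean field.

```cpp
// Hypothetical stand-in for target configuration; NOT LLVM's real TTI class.
struct ToyTargetInfo {
  // Modelled after the hook proposed in this revision (assumption: a simple
  // yes/no answer per target).
  bool enableUncondDivisionSpeculation;
};

// Approach 1: the IR stays neutral and the optimizer asks the target.
// A divide may be hoisted if the divisor is provably non-zero, or if the
// target says unconditional speculation of division is acceptable.
bool canHoistDivApproach1(const ToyTargetInfo &TTI, bool DivisorKnownNonZero) {
  return DivisorKnownNonZero || TTI.enableUncondDivisionSpeculation;
}

// Hypothetical divide instruction carrying a flag (assumption: the flag means
// "divide by zero yields an undefined value, not undefined behaviour").
struct ToyDivInst {
  bool divByZeroIsUndefValue;
};

// Approach 2: the semantics travel with the instruction itself, so the
// decision no longer depends on target configuration.
bool canHoistDivApproach2(const ToyDivInst &Div, bool DivisorKnownNonZero) {
  return DivisorKnownNonZero || Div.divByZeroIsUndefValue;
}
```

In the first variant the answer changes with the target for the same instruction; in the second it is a property of the instruction, so other passes could see it without consulting the target.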

> In the original example, instead of trying to make the divide hoistable, you could teach LLVM to peel the first iteration of the loop, then CSE the divide.

I don't think that would work in general, for example if the loop had unknown bounds, because peeling in such cases would still require the peeled iteration to be conditionally executed.
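
A small source-level sketch of what I mean, assuming a loop-invariant divide and a trip count `n` that is not known at compile time. The function names and the loop body are made up for illustration and are not the example from this review.

```cpp
#include <cstdint>

// Original shape: the divide is loop-invariant, but it only executes
// when the loop runs at least once (n > 0).
uint32_t sumOriginal(uint32_t a, uint32_t b, uint32_t n) {
  uint32_t sum = 0;
  for (uint32_t i = 0; i < n; ++i)
    sum += a / b + i;
  return sum;
}

// After peeling the first iteration and reusing the quotient (CSE).
// Because n is unknown, the peeled iteration must stay under a guard,
// so the divide is still conditionally executed.
uint32_t sumPeeled(uint32_t a, uint32_t b, uint32_t n) {
  uint32_t sum = 0;
  if (n > 0) {           // guard needed: the loop may run zero times
    uint32_t q = a / b;  // peeled iteration 0 computes the divide once
    sum += q + 0;
    for (uint32_t i = 1; i < n; ++i)
      sum += q + i;      // remaining iterations reuse q
  }
  return sum;
}
```

With `n == 0` and `b == 0` neither version performs the division, which is exactly why the guard cannot be dropped without the kind of speculation this revision is about.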

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D135451