[PATCH] D135451: [TTI] New PPC target hook enableUncondDivisionSpeculation

Tue Oct 11 08:21:37 PDT 2022

fhahn added a comment.

In D135451#3849630 <https://reviews.llvm.org/D135451#3849630>, @bmahjour wrote:

> In D135451#3844084 <https://reviews.llvm.org/D135451#3844084>, @efriedma wrote:
>
>> I mean, it would be self-consistent to write in LangRef something like "whether division by zero is undefined behavior, or defined to produce a poison value, depends on the current target/current subtarget/bitwith of the operation/current moon cycle".  But I don't want to go there.  If the rules are the same across all targets, it's easier to understand, and easier to implement tools like Alive2 to validate transforms.
>
> I'm sorry but I don't quite follow the logic that tries to justify the current design. On the one hand the IR rules are driven by target requirements, and at the same time you avoid making IR semantics dependent on TTI (which is the proper way to query about target requirements). They seem like recipes for the exact problem we are dealing with here, ie there are target-specific assumptions baked into the IR that are not configurable.

A practical consequence of changing the semantics to be target-dependent is that you would also have to audit each piece of code the reasons about `udiv` to make sure it properly considers both semantics.

TTI is at the moment only used for queries that inform cost decisions AFAICT. Changing it to impact semantics of LLVM IR would be a big change IMO.

>>> I don't think that would work in general, for example if the loop had unknown bounds, because peeling in such cases would still require the peeled iteration to be conditionally executed.
>>
>> Any loop can be peeled (as long as the body doesn't contain some exotic construct that inhibits cloning); it's basically just cloning the loop body.  And if the divide dominates the latch before peeling, it will dominate the peeled loop after peeling.
>>
>> The "general" problem is really the case where peeling is too expensive.
>
> consider this loop:
>
>   for (i = 0; i < n; i++) {
>     v += x/y;
>   }
>
> after peeling:
>
>   if (n > 0) {
>     v += x/y;
>   }
>   for (i = 1; i < n; i++) {
>     v += x/y;
>   }
>
> The peeled divide that is guarded by the `if (n > 0)` conditional does not dominate the divide that's in the loop body. Even if we try to consider control flow equivalence between that guard and the loop guard, there could still be cases where the dominance cannot be safely determined (eg. non-affine loops).

I'm not sure I follow here. The loop peeling implementation in LLVM should make sure that the remainder loop is dominated by the peeled iterations. For the motivating example, peeling by 1 iteration seems to have the desired effect: https://clang.godbolt.org/z/rhvsdvEjG

Peeling is used in a similar fashion to turn loads in a loop into dereferenceable loads, see D108114 <https://reviews.llvm.org/D108114>.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135451/new/

https://reviews.llvm.org/D135451