[PATCH] D117450: [LoopInterchange] Support loop interchange with floating point reductions
Congzhe Cao via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 18 21:04:39 PST 2022
congzhe added inline comments.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:900
+ Instruction *I = dyn_cast<Instruction>(followLCSSA(&PHI));
+ if (!F->getFnAttribute("unsafe-fp-math").getValueAsBool() ||
+ I == nullptr || !I->isFast())
----------------
Meinersbur wrote:
> I couldn't find a reference for `unsafe-fp-math`, but I would have guessed that it forces all instructions to be fast-math, even if not marked as fast. If already marked as fast, then no additional tag `unsafe-fp-math` should be necessary.
>
> Instead of `isFast`, would `allowReassoc` be sufficient (or whatever flag controls commutativity)?
>
> The logic here will check only one fp instruction. What if two instructions are involved? Such as:
> ```
> for (...)
> for (...) {
> sum += A[i];
> sum += B[j];
> }
> print(sum);
> ```
Thanks a lot for the review! I updated the patch accordingly.
Instead of using `isFast()`, I now use `hasAllowReassoc()`.
Instead of only checking the variable obtained from `followLCSSA(&PHI)`, now I added a simple data flow analysis `areAllInstsReassoc()` that checks whether all insturctions involved in the FP reduction allow reassociation. Hopefully this is sufficient for the two-fp-instruction case you provided (and cases where multiple fp instructions are involved).
Regarding "unsafe-fp-math" VS "fast": thank you and I see your point. And yes, I did see cases where the function has the "unsafe-fp-math" attribute but its instructions do not (please see example 1 below). Nevertheless I did also see test cases where instructions are marked "fast" but the surrounding function does not have the "unsafe-fp-math" attribute (please see example 2 below). It seems like there is some minor inconsistency between the attribute and the flag, so I just wanted to be conservatively correct by checking both -- if we have the attribute then that's fine, otherwise we continue checking instruction flags. However, if you think the logic can be simpilfied I'll be glad to update the code.
Example 1 (clang/test/CodeGen/fp-options-to-fast-math-flags.c):
Note that the `fast` flag is not generated even if compiled with `-ffast-math`.
```
float test(float a) {
return a + fn(a);
}
// CHECK-FAST: [[CALL_RES:%.+]] = call reassoc nnan ninf nsz arcp afn float @fn(float noundef {{%.+}})
// CHECK-FAST: {{%.+}} = fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, [[CALL_RES]]
```
Example 2 (`llvm/test/Transforms/InstSimplify/ConstProp/math-1.ll`):
```
define double @f_acos() {
; CHECK-LABEL: @f_acos(
; CHECK-NEXT: ret double 0.000000e+00
;
%res = tail call fast double @acos(double 1.0)
ret double %res
}
```
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117450/new/
https://reviews.llvm.org/D117450
More information about the llvm-commits
mailing list