[llvm] [InstCombine] optimize powi(X,Y)/X with Ofast (PR #67236)

Mon Oct 9 14:13:49 PDT 2023

https://github.com/jcranmer-intel requested changes to this pull request.

`reassoc` is the general flag we've been using for `pow` combines, so that checks out.

Special case analysis:
- X is nonspecial, Y is INT_MIN -> result should be +/-0, Y wraparound produces +/-infinity instead (for |X| > 1; the two outcomes swap for |X| < 1)
- X is +/-0, Y is INT_MIN -> result should be +/-infinity, Y wraparound produces +/-0 instead
- X is +/-inf, Y is INT_MIN -> result should be +/-0, Y wraparound produces +/-infinity instead
- X is NaN, Y is INT_MIN -> result should be NaN, wraparound produces NaN

- X is NaN, Y is not 1 -> result should be NaN, transform is correct
- X is NaN, Y is 1 -> result should be NaN, transform makes it 1 instead

- X is +/- 0, Y > 1 -> result should be NaN, transform makes it +/- 0 instead
- X is +/- 0, Y is 1 -> result should be NaN, transform makes it 1 instead
- X is +/- 0, Y is 0 -> result should be +/-inf, transform is correct
- X is +/- 0, Y < 0 -> result should be +/-inf, transform is correct

- X is +/- inf, Y > 1 -> result should be NaN, transform makes it +/-inf instead
- X is +/- inf, Y is 1 -> result should be NaN, transform makes it 1 instead
- X is +/- inf, Y is 0 -> result should be +/-0, transform is correct
- X is +/- inf, Y < 0 -> result should be +/-0, transform is correct

(assuming `powi(X, 1)` is exactly `X` and `powi(X, -1)` is exactly `1.0/X`)
- X is nonspecial, Y is 1 -> result should be 1, transform is correct
- X is nonspecial, Y is 0 -> result should be 1/X, transform is correct
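
A few rows of the table above can be spot-checked with a small C++ model. This is only a sketch: `powiModel`, `original`, `transformed`, and `checkSpecialCases` are hypothetical names, and the model assumes `powi` follows the special-value behavior of `pow()` (for example `pow(NaN, 0) == 1`), which may not match every lowering of the intrinsic.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for llvm.powi: assumes the intrinsic follows
// pow() semantics for special values (e.g. pow(NaN, 0) == 1).
static double powiModel(double x, int y) {
  return std::pow(x, static_cast<double>(y));
}

// powi(X, Y) / X, the expression before the transform.
static double original(double x, int y) { return powiModel(x, y) / x; }

// powi(X, Y - 1), the expression after the transform.
// Callers must pass y > INT_MIN so that y - 1 does not overflow.
static double transformed(double x, int y) { return powiModel(x, y - 1); }

// Spot-checks a few rows of the case table.
static void checkSpecialCases() {
  const double inf = std::numeric_limits<double>::infinity();
  const double nan = std::numeric_limits<double>::quiet_NaN();

  // X = 0, Y = 2: 0/0 is NaN, but powi(0, 1) is 0.
  assert(std::isnan(original(0.0, 2)));
  assert(transformed(0.0, 2) == 0.0);

  // X = inf, Y = 1: inf/inf is NaN, but powi(inf, 0) is 1.
  assert(std::isnan(original(inf, 1)));
  assert(transformed(inf, 1) == 1.0);

  // X = NaN, Y = 1: NaN/NaN is NaN, but powi(NaN, 0) is 1.
  assert(std::isnan(original(nan, 1)));
  assert(transformed(nan, 1) == 1.0);

  // X = 0, Y = 0: both sides agree on +inf.
  assert(original(0.0, 0) == inf);
  assert(transformed(0.0, 0) == inf);
}
```

Every mismatch in this sketch has a NaN on the original side, which is the pattern the table shows once the INT_MIN rows are set aside.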

Ignoring the issue of INT_MIN - 1 wraparound, the cases where the transformation is incorrect are those where the result would have been NaN (largely via intermediate 0.0/0.0 or infinity/infinity), so `nnan` is sufficient for that case. Taking into account the potential for wraparound, however, there is no set of fast-math flags that makes the transformation legal: `powi(2.0, INT_MIN)` is 0.0, and `0.0/2.0` is legally 0.0, whereas the transformed `powi(2.0, INT_MAX)` is +infinity.
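
The `powi(2.0, INT_MIN)` counterexample can be reproduced with the same kind of model. Again this is a sketch under assumptions: `powiModel`, `wrappingDecrement`, `originalAtIntMin`, and `transformedAtIntMin` are hypothetical helpers, and `powi` is approximated by `pow()` on a `double` exponent.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical model of llvm.powi via pow(); an assumption, not the real lowering.
static double powiModel(double x, int y) {
  return std::pow(x, static_cast<double>(y));
}

// Y - 1 with two's-complement wraparound, written without signed overflow
// (which would be undefined behaviour in C++).
static int wrappingDecrement(int y) {
  return y == INT_MIN ? INT_MAX : y - 1;
}

// powi(X, INT_MIN) / X: the original expression at the problematic exponent.
static double originalAtIntMin(double x) {
  return powiModel(x, INT_MIN) / x;
}

// powi(X, INT_MIN - 1) after wraparound, i.e. powi(X, INT_MAX).
static double transformedAtIntMin(double x) {
  return powiModel(x, wrappingDecrement(INT_MIN));
}
```

With `x = 2.0`, `originalAtIntMin` underflows to 0.0 while `transformedAtIntMin` overflows to +infinity. Neither input is NaN or infinite, so the mismatch does not come from a special-value input.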

Thus, the transformation is illegal unless we can prove that `Y - 1` can't wrap around, in which case `reassoc` and `nnan` are the necessary flags.
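
The resulting legality condition could be sketched as a predicate like the one below. `canFoldPowiDiv` and its parameters are hypothetical and are not part of InstCombine. The sketch assumes the optimizer can supply a known lower bound on `Y` and leaves open which instructions must carry the flags.

```cpp
#include <cassert>
#include <climits>

// Hypothetical guard for rewriting powi(X, Y) / X as powi(X, Y - 1).
//   hasReassoc: the `reassoc` fast-math flag is present
//   hasNoNaNs:  the `nnan` fast-math flag is present
//   minY:       smallest value Y is known to take (INT_MIN if unknown)
static bool canFoldPowiDiv(bool hasReassoc, bool hasNoNaNs, int minY) {
  // Y - 1 cannot wrap exactly when Y is known to exclude INT_MIN.
  const bool noWrap = minY > INT_MIN;
  return hasReassoc && hasNoNaNs && noWrap;
}
```

For instance, `canFoldPowiDiv(true, true, INT_MIN)` is false, because an unknown `Y` might be INT_MIN.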

https://github.com/llvm/llvm-project/pull/67236