[llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)
Slava Zakharin via llvm-commits
llvm-commits at lists.llvm.org
Thu May 2 08:55:55 PDT 2024
vzakhari wrote:
Regarding the `rnflow` benchmark: yes, this change causes a loop to be vectorized with a runtime memcheck. The loop contains an FP division, which ISel rewrites as a Newton-Raphson division approximation seeded with RCP. The approximation uses muls/adds/subs, and ISel encodes them as FMAs. Together, these changes alter the accuracy of the computation, which affects the benchmark results.
Without this change, the loop is not vectorized and the scalar FP division is not selected into the Newton-Raphson approximation. Gfortran does not vectorize the loop either, and it uses `divss` for the division.
It looks like the benchmark is sensitive to the accuracy of FP computations (which eventually affects even the integer results of the benchmark). Disabling FMA generation resolves the accuracy issue.
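
To illustrate why the rewritten division can differ from a plain divide, here is a minimal sketch of one Newton-Raphson refinement step written with FMAs. It is not the sequence ISel actually emits: `approx_rcp` is a hypothetical stand-in for the RCP instruction that simply truncates the mantissa of `1/b`, and the function names and the single refinement step are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal estimate (such as RCP):
// compute 1/b, then clear the low 12 mantissa bits so only a coarse
// estimate remains. Real hardware estimates behave differently.
float approx_rcp(float b) {
    assert(b > 0.0f && std::isfinite(b));
    float r = 1.0f / b;
    std::uint32_t bits;
    std::memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFF000u;  // keep sign, exponent, and the top 11 mantissa bits
    std::memcpy(&r, &bits, sizeof bits);
    return r;
}

// a / b via one Newton-Raphson refinement of the reciprocal estimate,
// expressed with fused multiply-adds:
//   e  = 1 - b * x0      (residual of the estimate)
//   x1 = x0 + x0 * e     (refined reciprocal)
float div_newton_fma(float a, float b) {
    float x0 = approx_rcp(b);
    float e = std::fma(-b, x0, 1.0f);
    float x1 = std::fma(x0, e, x0);
    return a * x1;
}

// Reference: a single correctly rounded division (what divss computes).
float div_exact(float a, float b) { return a / b; }

// Relative difference between the approximated and the exact quotient.
float rel_error(float a, float b) {
    float exact = div_exact(a, b);
    return std::fabs(div_newton_fma(a, b) - exact) / std::fabs(exact);
}
```

For inputs such as `rel_error(1.0f, 3.0f)` the refined quotient is close to the exact one but need not match it in the last bits, which is the kind of small discrepancy an accuracy-sensitive benchmark can amplify.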
https://github.com/llvm/llvm-project/pull/88039