[flang-commits] [flang] [flang] Expand SUM(DIM=CONSTANT) into an hlfir.elemental. (PR #118556)

Slava Zakharin via flang-commits flang-commits at lists.llvm.org
Wed Dec 11 14:59:00 PST 2024


vzakhari wrote:

I was able to reproduce a ~3% slowdown with the latest `flang-new -Ofast -mcpu=neoverse-v1 -mtune=neoverse-v1 -flto -fuse-ld=lld -mmlir -flang-simplify-hlfir-sum=true` on a Grace machine.

The function specialization works the same way with and without partial SUM inlining. The difference appears later, after LLVM inlining. I suppose the size threshold is affected by the SUM inlining. I do not know how to fix that, so I decided to experiment with further improving exchange2 performance.

I have two prototype patches that improve the InlineElementals and OptimizedBufferization passes. Together with the total SUM reduction patch, these patches make exchange2 a little bit faster than with the current flang-new. I will prepare the two patches for review, and will flip the engineering switch.

https://github.com/llvm/llvm-project/pull/118556

