[flang-commits] [flang] [Flang] - Add optional inlining of allocatable assignments with hlfir.expr RHS (PR #186880)
Slava Zakharin via flang-commits
flang-commits at lists.llvm.org
Tue Mar 17 08:33:55 PDT 2026
vzakhari wrote:
> The above is a reduced testcase from one of our internal benchmarks.
Thank you for the example. I am trying to understand where this enormous speedup is coming from. Can you please confirm that a temporary array is created for the `cos(a)` elemental operation in both cases? If so, does that mean the library implementation of `Assign` is much slower than the inlined code, and could the library be compiled "better" to reduce the gap?
Overall, I am okay with doing the inlining, though I think the library implementation may be faster in some cases (e.g. it may use just a single `memcpy` for a contiguous array of any rank, whereas LLVM may generate multiple `memcpy` calls for the N-level loop nest).
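To illustrate the `memcpy` point, here is a minimal C++ sketch of the two copy strategies for a contiguous rank-2 array in column-major order. It is not the Flang runtime's `Assign` and not the code the PR generates; the helper names `copyWhole` and `copyPerColumn` are hypothetical and exist only for this comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: a runtime routine that knows the whole array is
// contiguous can copy all n1 * n2 elements with one memcpy, whatever the rank.
void copyWhole(const double *src, double *dst, std::size_t n1, std::size_t n2) {
  assert(src && dst);
  std::memcpy(dst, src, n1 * n2 * sizeof(double));
}

// Hypothetical helper: a 2-level loop nest whose innermost loop has been
// turned into a memcpy still issues one memcpy per column (n2 calls in total).
void copyPerColumn(const double *src, double *dst, std::size_t n1, std::size_t n2) {
  assert(src && dst);
  for (std::size_t j = 0; j < n2; ++j)
    std::memcpy(dst + j * n1, src + j * n1, n1 * sizeof(double));
}
```

Both helpers produce the same result; the difference is only in how many calls are made and how large each copy is, which may matter for arrays with short leading dimensions.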
https://github.com/llvm/llvm-project/pull/186880