[flang-commits] [flang] [Flang] - Add optional inlining of allocatable assignments with hlfir.expr RHS (PR #186880)
Pranav Bhandarkar via flang-commits
flang-commits at lists.llvm.org
Wed Mar 18 07:53:23 PDT 2026
bhandarkar-pranav wrote:
> > The above is a reduced testcase from one of our internal benchmarks.
>
> Thank you for the example. I am trying to understand where this enormous speed up is coming from. Can you please confirm that in both cases you have a temporary array created for `cos(a)` elemental operation? If it is the case, then does it mean that the library implementation of `Assign` is much slower than the inlined code, and may it be the case that the library may be compiled "better" to reduce the gap?
I am sorry, I do not fully understand this. The `hlfir.elemental` should not have a temporary array associated with it (because it produces an `hlfir.expr`) until after lowering to FIR, right? I profiled the passes, and `OpenMPOpt` is the pass whose time blows up during LTO whenever `__FortranAAssign` is called (and its body is pulled in from the runtime library). This is the output I get when I add `-time-passes` to `clang-linker-wrapper`:
```
===-------------------------------------------------------------------------===
                          Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 21.7081 seconds (21.7052 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   6.5107 ( 30.8%)   0.0688 ( 11.5%)   6.5795 ( 30.3%)   6.5799 ( 30.3%)  OpenMPOptPass
   2.1780 ( 10.3%)   0.0010 (  0.2%)   2.1790 ( 10.0%)   2.1791 ( 10.0%)  OpenMPOptCGSCCPass
   1.7045 (  8.1%)   0.2698 ( 45.2%)   1.9743 (  9.1%)   1.9744 (  9.1%)  AMDGPU DAG->DAG Pattern Instruction Selection
```
> Overall, I am okay with doing the inlining, though I would think the library implementation may be faster in some cases (e.g. it may use just a single `memcpy` for the contiguous array of any rank vs potentially multiple `memcpy` that LLVM would generate for the N-level loop nest).
I agree with this concern, which is part of the reason why I have guarded this with a flag.
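To make the trade-off concrete, here is a minimal C sketch of the two copy strategies being compared. It is only an illustration, not Flang's runtime code or the code the inlining produces: the names `copy_loop_nest`, `copy_flat`, and `copies_agree` and the extents `N0`, `N1`, `N2` are hypothetical, and the arrays are assumed to be contiguous with matching shapes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical extents of a small rank-3 array (illustration only). */
enum { N0 = 4, N1 = 3, N2 = 2 };

/* Inlined-style copy: one loop per dimension, one element per iteration.
 * A compiler may or may not collapse this into fewer, larger copies. */
static void copy_loop_nest(double dst[N2][N1][N0], double src[N2][N1][N0]) {
  for (size_t k = 0; k < N2; ++k)
    for (size_t j = 0; j < N1; ++j)
      for (size_t i = 0; i < N0; ++i)
        dst[k][j][i] = src[k][j][i];
}

/* Library-style copy: when the storage is known to be contiguous,
 * the whole array can be moved with a single memcpy, whatever the rank. */
static void copy_flat(double *dst, const double *src, size_t count) {
  memcpy(dst, src, count * sizeof(double));
}

/* Returns 1 when both strategies produce the same result as the source. */
static int copies_agree(void) {
  double src[N2][N1][N0], a[N2][N1][N0], b[N2][N1][N0];
  double v = 0.0;
  for (size_t k = 0; k < N2; ++k)
    for (size_t j = 0; j < N1; ++j)
      for (size_t i = 0; i < N0; ++i) {
        src[k][j][i] = v;
        v += 1.0;
      }
  copy_loop_nest(a, src);
  copy_flat(&b[0][0][0], &src[0][0][0], (size_t)N0 * N1 * N2);
  return memcmp(a, b, sizeof a) == 0 && memcmp(a, src, sizeof src) == 0;
}
```

Both functions give the same result for contiguous data; the difference is only in how much work is left for the optimizer to recognize, which is why keeping the inlining optional seems prudent.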
https://github.com/llvm/llvm-project/pull/186880