[flang-commits] [flang] [RFC][flang] Add support for assumed-shape dummy arrays repacking. (PR #127147)
Kiran Chandramohan via flang-commits
flang-commits at lists.llvm.org
Sun Feb 16 02:38:44 PST 2025
kiranchandramohan wrote:
Thanks @vzakhari for the detailed document about repacking arrays.
We implemented a loop versioning pass (https://discourse.llvm.org/t/rfc-loop-versioning-for-unit-stride/68605, https://flang.llvm.org/docs/FlangCommandLineReference.html#cmdoption-flang-fversion-loops-for-stride) that versions loops for better cache locality, to avoid scatter/gathers, etc. We get around a 36% improvement in the SPEC CPU 2017 roms benchmark due to this pass. Since this pass serves a similar use case to array repacking, it can be switched OFF when 'repack-arrays' is ON.
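For readers unfamiliar with the idea, here is a minimal sketch of what versioning a loop on stride looks like, written by hand in C. It is an illustration only and not the code the flang pass generates: the function name `sum_versioned`, the byte-stride parameter, and the reduction loop are all assumptions made up for this example. The stride is only known at run time, so the loop is duplicated and a run-time check selects the unit-stride copy when the data turns out to be contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of flang): sums n doubles starting at `base`,
 * where consecutive elements are `stride_bytes` apart. This mimics an
 * assumed-shape dummy array whose stride is only known at run time. */
double sum_versioned(const char *base, size_t n, ptrdiff_t stride_bytes)
{
    double total = 0.0;

    /* Precondition: a non-empty array needs a valid base address. */
    assert(base != NULL || n == 0);

    if (stride_bytes == (ptrdiff_t)sizeof(double)) {
        /* Version 1: the data is contiguous, so plain indexing can be used.
         * A compiler can vectorize this copy with ordinary loads. */
        const double *a = (const double *)base;
        for (size_t i = 0; i < n; ++i)
            total += a[i];
    } else {
        /* Version 2: generic fallback that computes each address from the
         * run-time stride. */
        for (size_t i = 0; i < n; ++i)
            total += *(const double *)(base + (ptrdiff_t)i * stride_bytes);
    }
    return total;
}
```

Repacking attacks the same problem from the other side: instead of duplicating the loop, the non-contiguous data would be copied into a contiguous temporary so that only the unit-stride form is needed.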
BTW, was the performance data that you collected measured with the loop versioning pass ON? Or is it purely artificial?
There were some comments about issues with versioning/repacking with the asynchronous and contiguous attributes in this thread: https://discourse.llvm.org/t/transformations-to-aid-optimizer-for-subroutines-functions-with-assumed-shape-arguments/66447. It might be good to add these points to the document.
I have added AMD engineers who work on offloading to take a look at the offloading questions.
https://github.com/llvm/llvm-project/pull/127147