[flang-commits] [flang] [flang][MLIR][OpenMP] Extend delayed privatization for allocatables (PR #84033)

Wed Mar 6 06:18:54 PST 2024

jeanPerier wrote:

I am not very familiar with the OpenMP outlining. Here are my quick thoughts:

- I do not get why it is a semantic issue to capture two values instead of one (I understand this may be a perf issue).
- It seems the new hlfir::FortranVariableShadow could simply be a fir::FortranVariableOpInterface (you would get everything you need by gathering the operation instead of the result: the attributes and all the results). The OpenMP privatization pass could always try to look at the producer of a captured mlir::Value. If it is a fir::FortranVariableOpInterface, it would put the op in a capture list; otherwise, it would capture the mlir::Value "as usual". Adding a new type in FIR adds a lot of abstraction complexity.
- I agree with Kiran that if it is just a perf issue, the direct solution would be to update the cloning to use `#0`, especially for allocatables and pointers, where `#0` and `#1` are exactly the same anyway.
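
To make the capture-list idea in the second bullet concrete, here is a minimal, self-contained sketch of the dispatch. It does not use MLIR: `ToyOp`, `ToyValue`, `CaptureLists` and `classifyCapture` are hypothetical stand-ins, and the `isFortranVariable` flag only models "the producer implements fir::FortranVariableOpInterface". How the real pass would look up the producer is not shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are not MLIR or FIR classes.
struct ToyOp {
  std::string name;        // e.g. "hlfir.declare"
  bool isFortranVariable;  // models "implements fir::FortranVariableOpInterface"
};

struct ToyValue {
  const ToyOp *definingOp;  // nullptr models a value with no producer op
  unsigned resultIndex;     // 0 for `#0`, 1 for `#1`
};

struct CaptureLists {
  std::vector<const ToyOp *> variableOps;  // captured as whole declarations
  std::vector<ToyValue> plainValues;       // captured "as usual"
};

// Look at the producer of a captured value. If it is a Fortran variable
// declaration, record the op once (so capturing both `#0` and `#1` yields a
// single entry); otherwise record the value itself.
void classifyCapture(const ToyValue &value, CaptureLists &out) {
  assert(value.resultIndex <= 1 && "the toy model only has results #0 and #1");
  if (value.definingOp && value.definingOp->isFortranVariable) {
    for (const ToyOp *op : out.variableOps)
      if (op == value.definingOp)
        return;  // declaration already captured through its other result
    out.variableOps.push_back(value.definingOp);
    return;
  }
  out.plainValues.push_back(value);
}
```

In this model, capturing both results of one declaration leaves a single op in `variableOps`, which is the property that would make a dedicated "shadow" type unnecessary.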

The story of `#0` vs `#1` is that `#0` could be used anywhere from a correctness point of view. `#1` is only there to avoid creating temporary descriptors when propagating assumed-shape entities through calls and when doing "FIR level" descriptor manipulation. The issue with assumed-shape entities is that their lower bounds are local, so a new descriptor `#0` is created to hold the accurate lower bounds. But we do not care about propagating these lower bounds through calls, since the entity will get new lower bounds inside the call anyway. FIR addressing operations were designed not to use the lower bounds inside the descriptor (local lower bounds are provided with fir.shift, if any).

In the context of OpenMP outlining, if you need to capture a descriptor, I think it should be `#0`, since it has all the info. Capturing `#1` would also require capturing the lower bounds for any addressing operation inside the outlined code.
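
The difference can be illustrated with a toy one-dimensional model. This is only a sketch: `ToyDescriptor` and the three helper functions are hypothetical names, and the layout is not the real Fortran descriptor. It shows that a `#0`-style descriptor is self-sufficient, while a `#1`-style descriptor needs the local lower bound passed alongside it (the role fir.shift plays at the FIR level).

```cpp
#include <cassert>
#include <cstddef>

// Toy one-dimensional descriptor; not the real Fortran runtime layout.
struct ToyDescriptor {
  const int *base;      // address of the first element
  long lowerBound;      // lower bound stored in the descriptor
  std::size_t extent;   // number of elements
};

// Models creating `#0` for an assumed-shape dummy: a new descriptor that
// shares the data but holds the accurate local lower bound.
ToyDescriptor rebaseLowerBound(const ToyDescriptor &incoming,
                               long localLowerBound) {
  return {incoming.base, localLowerBound, incoming.extent};
}

// `#0`-style addressing: the lower bound is read from the descriptor itself,
// so capturing the descriptor alone is enough.
int loadViaDescriptorBounds(const ToyDescriptor &d, long index) {
  assert(index >= d.lowerBound &&
         index < d.lowerBound + static_cast<long>(d.extent));
  return d.base[index - d.lowerBound];
}

// `#1`-style addressing: the descriptor's own lower bound is ignored and the
// local lower bound must be supplied separately, so it has to be captured too.
int loadViaExplicitShift(const ToyDescriptor &d, long shift, long index) {
  assert(index >= shift && index < shift + static_cast<long>(d.extent));
  return d.base[index - shift];
}
```

With an incoming descriptor over `{10, 20, 30}` and a local lower bound of 5, both `loadViaDescriptorBounds(rebaseLowerBound(incoming, 5), 6)` and `loadViaExplicitShift(incoming, 5, 6)` read the same element, but only the first needs a single captured value.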

I think I may be able to modify converter.getSymbolAddress to use `#0` for pointers/allocatables, since it is the same as `#1` anyway. But for assumed-shape, it will stay `#1`, since fir::ExtendedValue are "FIR level" and should use `#1` to avoid descriptor temporaries when possible. The cloning code should be updated to use an HLFIR level cloning mechanism in that case (and maybe in all cases to make things simpler).

https://github.com/llvm/llvm-project/pull/84033