[Mlir-commits] [flang] [llvm] [mlir] [Flang][MLIR][OpenMP] Allow setting OMP_MAP_PTR_AND_OBJ by frontends (PR #84328)

Fri Mar 8 10:52:36 PST 2024

vzakhari wrote:

> > Thank you for the changes!
> > You may ignore this, but I think this low-level naming coming from libomptarget implementation looks off for a higher level dialect. As I understand, you need to distinguish the pointer and non-pointer members of the aggregate so that you can later properly set `PTR_AND_OBJ`. So can we use something like `is_ptr_member` to make this difference explicit? It reads better to me from the language point of view, at least for C/C++. It also looks okay for Fortran, if we assume the underlying memory representation of the descriptor (which, I believe, will be the only parent whose members can have `is_ptr_member`).
> > Feel free to go with `PTR_AND_OBJ` though :)
> 
> That is indeed correct for at least one of the uses, I think (but perhaps I am incorrect) it's more generally used to indicate that something is a pointer that's pointing to some memory, so both should be mapped accordingly, feels a little out of place in Fortran but the descriptor base address and declare target link need it set, the latter because it undergoes some transformations to be a global pointer from my recollection of working on it as opposed to the original type itself having anything to do with being a pointer <-> pointee coupling.
> 
> As for the naming I agree it's not the best and I would be happy to change it if you wish to suggest another name as I think is_ptr_member may not quite cover all of it's uses. I am not so sure if is_ptr would be too generic, is_ptr_pointee?

`is_ptr_pointee` is better, but what if we go with just `pointer`?  The `varPtr` is always a pointer, so the `pointer` attribute on top of it would just indicate that the actual value pointed to by `varPtr` contains a pointer.  Within the context of the chained `map_info`, having the `pointer` attribute on a member `map_info` will always indicate that the pointee mapping and the pointer attachment have to be made, but it will also have the exact meaning, again, that `*varPtr` is a pointer within the parent object (whether it is a structure, descriptor or scalar pointer).  I wonder if there are any cases where we would not need to do the pointee mapping and the pointer attachment when we have a member `map_info` for a pointer entity of a parent.  If not, then `pointer` just sounds right to me.

Taking a step back, why the semantics that you try to represent with this new attribute is not represented just by the fact that `varPtrPtr` is present on the member `map_info` - can you please explain?

My first paragraph above is sort of off when I am looking at the current `map_info` representation for the members.  Sorry that I missed the point in time when we started using the box reference as `varPtr` operand to represent the actual data mapping.  I found this comment in #71766:
```
Why utilise the original symbol (descriptor and pointer to data) as opposed to the BaseAddr (pointer to data)?

Currently (and probably always, as I think the future specification iteration requests descriptor mapping to some extent, but I might be misunderstanding the change set) the default lowering of the kernel expects the descriptor as well as the underlying data to be mapped, the generated kernel directly accesses the descriptor for calculations such as indexing.

The main symbol/variable being accessed inside of the kernel is also the original descriptor, and there's a portion of kernel lowering for device where we need the original descriptor SSA/global used inside of the kernel so that we can remap all uses of it to the newly generated kernel argument (or derived accessor instructions to this kernel argument) that corresponds
to it. Which works perfectly fine if we map the original symbol, but if we map the BaseAddr, which is a BoxAddrOf operation to the BoxAddr it becomes a little trickier as we can't traceback to the original symbol (global or local allocation) we need, as every BoxAddrOf operation and a lot of other box related operations (e.g. box/embox) currently generates a new allocation
and the map_info that previously pointed to the BaseAddr linked to the original descriptor instruction (global, allocatable or function argument) no longer points to it, but points to one of the many allocations created during lowering, and then we can no longer rewrite it's kernel uses.

When testing with a mix of target + enter/exit, the alloca SSA each map_info was tracking was also completely different due to each BoxAddrOf generating it's own alloca, this could also prove to be a bit of an issue when mapping data across.

There might be a way around this I can't think of, but it seems we need to map the descriptor rather than the BaseAddr (internal pointer of the descriptor) currently.
```

I am not sure what complications you had with tracing back of the original symbol, why the tracing back is needed, and why the BoxAddrOf operations would product new allocations.  Can you please explain?

Sorry if I missed the turn, but I was under impression that we agreed on `varPtr` being exactly the address of the memory to map with the bounds provided by the `bounds`, the parent-children relationship provided by the chained `map_info`, and the pointer attachment semantics represented by `varPtrPtr` (of course, on top of the pointer-pointee semantics provided, again, by the changed `map_info`).

https://github.com/llvm/llvm-project/pull/84328