[Openmp-commits] [flang] [mlir] [openmp] [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (PR #71766)

Wed Nov 15 07:11:23 PST 2023

agozillon wrote:

> > However, for 2) it's quite possible to use the BoundsOp for a lot of the calculations as they directly access the descriptor fields, the only issue I've found is that there is no element type size field for the BoundsOp (as far as I'm aware at least). You can use getStride() (when getStrideInBytes is true, which is the case currently for allocatables), however, I opted to not do so as I imagine the value returned will change when a user specifies a stride, although, @razvanlupusoru can likely clarify if it would!
> 
> BoundsOp provides a consistent representation which allows both user specified dimensions and implicit dimensions (aka dimensions which would have been encoded in the type are now explicit). So yes, the BoundsOp will indeed change if a user specifies a stride.
> 
> The BoundsOp does not encode element size - because this is a property of the data being mapped. However, there is one leaky abstraction for the BoundsOp - it allows stride in bytes in some cases. This is because it is possible for the stride to not be divisible by element size in Fortran. Consider the example below which has a descriptor with element size of 8 but a stride of 12:
> 
> ```
> program slice
>   type tt
>     complex(4) :: aa
>     real(4) :: bb
>   end type
>   type(tt), dimension(2) :: var
>   call map_var(var%aa)
> contains
>   subroutine map_var(var)
>     complex(4) :: var(:)
>     !$omp target map(tofrom:var)
>     var(1) = 100
>     !$omp end target
>   end subroutine
> end program
> ```
> 
> The current BoundsOp allows representing this case without forcing the data to be contiguous.
> 
> When a non-1 stride is specified by user, this means that the mapped data is no longer contiguous - but OpenMP allows this (OpenMP 5.2, section 3.2.5). I don't know if the offload runtime accepts not-contiguous data. However, to me, it seems that using BoundsOp stride directly still allows computing total size correctly. What I mean by that, is consider the C example of an array with 100 elements being mapped like `map(tofrom:array[0:50:2])`. The total size is computable. By taking the formula from OpenMP 3.2.5, it notes: `length = (size − lower-bound)/stride`. So we have 50 = (size - 0) / 2. Thus size = 100. Size in bytes is 100 * sizeof(elemtype).
> 
> I agree with Kiran's observation that it seems inconsistent to use the BoundsOp in some cases but not in others. IMO, the BoundsOp + MapInfo should have enough information to complete the mapping. Namely, the MapInfo holds the element type - which at the LLVM level should be mapable to byte size.
> 
> The reason the BoundsOp does not encode size is that not all types have a size representation at the FIR level until they get materialized to LLVM Dialect (fir.box type for example). Thus it was not added because one of the goals of acc dialect was to be agnostic regardless of dialect it is used with and my initial thoughts are that often high level abstractions capture this information via type not bytes. However, I wouldn't be against allowing BoundsOp to also capture element size abstraction - by taking an additional parameter which would use a dialect specific operation, maybe `fir.typesize` - which could produce a size based on type. This would need to be reviewed with the Fortran team.
> 
> On the point about the acc dialect being agnostic with the dialect it is used with - gets me to the second point about adding is_fortran_allocatable flag to MapInfo. :) I don't know what omp dialect goals are in this regard but I personally have some opinions on this aspect. Anyway, I will need to respond to this point when I make some time next - likely by end of week. You have a very large description on things considered which I should understand before I add my 2 cents.

Thank you @razvanlupusoru for the input. I don't believe the offload runtime currently accepts non-contiguous data, but it's good to see that this case has been thought about thoroughly. I've not looked too deeply into the stride calculations yet, but if it is possible then I'm happy to do it that way. I think the main issue with the current method would still be that we don't have the element type size without going through the descriptor, which still seems to be needed in the calculation (but I might be misunderstanding it). But mapping both (descriptor and data), as mentioned in your prior comment and previously in the thread, would solve this.

https://github.com/llvm/llvm-project/pull/71766