[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr
Deepak Eachempati via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 12 19:50:32 PDT 2022
dreachem added inline comments.
================
Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+ {
+ #pragma omp target data use_device_addr(x)
+ {
----------------
ye-luo wrote:
> doru1004 wrote:
> > doru1004 wrote:
> > > ye-luo wrote:
> > > > doru1004 wrote:
> > > > > ye-luo wrote:
> > > > > > doru1004 wrote:
> > > > > > > ye-luo wrote:
> > > > > > > > doru1004 wrote:
> > > > > > > > > ye-luo wrote:
> > > > > > > > > > In my understanding of the spec.
> > > > > > > > > > `map(tofrom:x[0:256])` only maps the memory segment that x points to. x itself as a pointer scalar is not mapped.
> > > > > > > > > > use_device_addr(x) should fail to find the map of x scalar.
> > > > > > > > > > 5.2 spec.
> > > > > > > > > > If the list item is not a mapped list item, it is assumed to be accessible on the target device.
> > > > > > > > > > To me, it seems just keep &x as it was, in this case &x remains a host address.
> > > > > > > > > >
> > > > > > > > > > But in your patch description, it seems treating x differently from a scalar.
> > > > > > > > > >
> > > > > > > > > > I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails. This should be the behavior of use_device_ptr instead of use_device_addr.
> > > > > > > > > > To me, it seems just keep &x as it was, in this case &x remains a host address.
> > > > > > > > >
> > > > > > > > > So does this mean that if I do something like this in the target data I should get different addresses for x:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ```
> > > > > > > > > #pragma omp target data use_device_ptr(x)
> > > > > > > > > {
> > > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > #pragma omp target data use_device_addr(x)
> > > > > > > > > {
> > > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > > }
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails.
> > > > > > > > >
> > > > > > > > > That's my fault x[0] was the wrong thing to use actually.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > When you have an outer `target data map(x)`, then two printf differ. If there is no outer `map(x)`, two printf should be identical.
> > > > > > > > When you have an outer `target data map(x)`, then two printf differ. If there is no outer `map(x)`, two printf should be identical.
> > > > > > >
> > > > > > > This is super helpful thank you! I'll make sure that happens.
> > > > > > >
> > > > > > > In the case when an outer target data exists, the print of the x which is under use_device_addr should print the same address as printing x on the host?
> > > > > > I need a correction. When outer map(x) exists, actually the address(not value) of x should be a device address, and the code cannot even print x. Printing &x should be fine.
> > > > > In the context of the above comment, should &x on the device be an address I can verify, somehow, to make sure that it's correct or is it a completely new device address?
> > > > >
> > > > > So for example, should it be the same as when I do a use_device_ptr but print the &x in that case? (With the current master those two addresses are not the same.)
> > > > >
> > > > > I guess what I need is an example of using use_device_addr that actually does something meaningful because with the current main branch printing the &x of a use_device_addr(x) is nil.
> > > > When an outer map(x) is placed, &x does print something meaningful.
> > > > I tried to access `omp_get_mapped_ptr(&x, omp_get_default_device())` but got link time error about missing omp_get_mapped_ptr definition. It seems missing an implementation of this OpenMP API.
> > > >
> > > > When there is no map(x), I also got nil, I think this is a bug, &x should keep the host value.
> > > >
> > > > I cannot think of a useful example with use_device_addr(x) where x is a pointer. But x can be a scalar float.
> > > > and then call cublas gemm, the alpha/beta parameter can be &x.
> > > > When an outer map(x) is placed, &x does print something meaningful.
> > >
> > > For me, in the same scenario, it prints nil.
> > >
> > > Here's the full example to avoid any confusion:
> > >
> > > ```
> > > float *x = (float *) malloc(10*sizeof(float));
> > >
> > > #pragma omp target data map(to:x[0:10])
> > > {
> > > #pragma omp target data use_device_ptr(x)
> > > {
> > > fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
> > > }
> > >
> > > #pragma omp target data use_device_addr(x)
> > > {
> > > fprintf(stderr, "line %d x: %p\n", __LINE__, &x); // prints nil for me
> > > }
> > > }
> > > }
> > > ```
> > >
> > > Note that x has been mapped to the device in the following way:
> > > ```
> > > Libomptarget device 0 info: Host Ptr Target Ptr
> > > Libomptarget device 0 info: 0x00005578f14de020 0x00007f0bda400000
> > > ```
> > >
> > > What should the printed address be though?
> > >
> > > Note that the above results have been obtained with current Clang/LLVM main not with this patch applied.
> > >
> > > If you apply this patch and run the above code but change the &x into a x you get:
> > >
> > > ```
> > > float *x = (float *) malloc(10*sizeof(float));
> > >
> > > #pragma omp target data map(to:x[0:10])
> > > {
> > > #pragma omp target data use_device_ptr(x)
> > > {
> > > fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
> > > }
> > >
> > > #pragma omp target data use_device_addr(x)
> > > {
> > > fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
> > > }
> > > }
> > > }
> > > ```
> > > When an outer map(x) is placed, &x does print something meaningful.
> > > I tried to access `omp_get_mapped_ptr(&x, omp_get_default_device())` but got link time error about missing omp_get_mapped_ptr definition. It seems missing an implementation of this OpenMP API.
> > >
> > > When there is no map(x), I also got nil, I think this is a bug, &x should keep the host value.
> > >
> > > I cannot think of a useful example with use_device_addr(x) where x is a pointer. But x can be a scalar float.
> > > and then call cublas gemm, the alpha/beta parameter can be &x.
> >
> >
> When I said `map(x)` I meant exactly `#pragma omp target data map(to:x)` which is different from `#pragma omp target data map(to:x[0:10])`.
> The former maps the pointer scalar, the latter maps to the memory segment that x points to.
This is what should be used for that case:
```
#pragma omp target data map(x[0:256])
#pragma omp target data use_device_addr(x[0:256])
```
Array sections are permitted for `use_device_addr` to handle cases like this.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D133694/new/
https://reviews.llvm.org/D133694
More information about the cfe-commits
mailing list