[cfe-dev] [RFC] Re-use OpenCL address space attributes for SYCL

Bader, Alexey via cfe-dev cfe-dev at lists.llvm.org
Fri Jun 26 05:04:59 PDT 2020


Hi,

We would like to re-use OpenCL address space attributes for SYCL to target
the SPIR-V format and enable efficient memory access on GPUs.

```c++
  __attribute__((opencl_global))
  __attribute__((opencl_local))
  __attribute__((opencl_private))
```

The first patch, which enables conversion between pointers annotated with an
OpenCL address space attribute and "default" pointers, is under review at
https://reviews.llvm.org/D80932.

Before moving further with the implementation we would like to discuss two
questions raised in review comments (https://reviews.llvm.org/D80932#2085848).

## Using attributes to annotate memory allocations

The introduction section of the SYCL 1.2.1 specification describes the multiple
compilation flows the design is intended to support:

> SYCL is designed to allow a compilation flow where the source file is passed
> through multiple different compilers, including a standard C++ host compiler
> of the developer's choice, and where the resulting application combines the
> results of these compilation passes. This is distinct from a single-source
> flow that might use language extensions that preclude the use of a standard
> host compiler. The SYCL standard does not preclude the use of a single
> compiler flow, but is designed to not require it.
>
> The advantages of this design are two-fold. First, it offers better
> integration with existing tool chains. An application that already builds
> using a chosen compiler can continue to do so when SYCL code is added. Using
> the SYCL tools on a source file within a project will both compile for an
> OpenCL device and let the same source file be compiled using the same host
> compiler that the rest of the project is compiled with. Linking and library
> relationships are unaffected. This design simplifies porting of pre-existing
> applications to SYCL. Second, the design allows the optimal compiler to be
> chosen for each device where different vendors may provide optimized
> tool-chains.
>
> SYCL is designed to be as close to standard C++ as possible. In practice,
> this means that as long as no dependence is created on SYCL's integration
> with OpenCL, a standard C++ compiler can compile the SYCL programs and they
> will run correctly on host CPU. Any use of specialized low-level features
> can be masked using the C preprocessor in the same way that
> compiler-specific intrinsics may be hidden to ensure portability between
> different host compilers.

Following this approach, SYCL uses C++ templates to represent pointers to
disjoint memory regions on an accelerator, so that the same code compiles with
both a standard C++ toolchain and a SYCL compiler toolchain.

For instance:

```c++
// check that SYCL mode is ON and we can use non-standard annotations
#if defined(__SYCL_DEVICE_ONLY__)
// GPU/accelerator implementation
template <typename T, address_space AS> class multi_ptr {
  // GetAnnotatedPointer<T, global>::type == "__attribute__((opencl_global)) T"
  using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;

  pointer_t data;

public:
  pointer_t get_pointer() { return data; }
};
#else
// CPU/host implementation
template <typename T, address_space AS> class multi_ptr {
  T *data; // ignore address space parameter on CPU

public:
  T *get_pointer() { return data; }
};
#endif
```

Users can use the `multi_ptr` class as a regular user-defined type in regular
C++ code:

```c++
int *UserFunc(multi_ptr<int, global> ptr) {
  /// ...
  return ptr.get_pointer();
}
```

Depending on the compiler mode, `multi_ptr` either annotates its internal data
with an address space attribute or leaves it unannotated.
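
The `GetAnnotatedPointer` helper is not defined in the snippet above. A minimal
sketch of how such a trait could drive a single `multi_ptr` definition is shown
here. The `address_space` enumerators, the trait specializations, and the
`first_element` and `host_demo` functions are hypothetical names chosen for
illustration, not the actual SYCL headers. The attribute branch is only
compiled when `__SYCL_DEVICE_ONLY__` is defined, so a standard host compiler
sees plain pointers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical enum; the real SYCL headers define their own.
enum class address_space { global, local, private_ };

// Maps (T, AS) to the element type that multi_ptr points to.
template <typename T, address_space AS> struct GetAnnotatedPointer {
  using type = T; // host: ignore the address space parameter
};

#if defined(__SYCL_DEVICE_ONLY__)
// Device compiler only: attach the OpenCL address space attribute.
template <typename T> struct GetAnnotatedPointer<T, address_space::global> {
  using type = __attribute__((opencl_global)) T;
};
template <typename T> struct GetAnnotatedPointer<T, address_space::local> {
  using type = __attribute__((opencl_local)) T;
};
template <typename T> struct GetAnnotatedPointer<T, address_space::private_> {
  using type = __attribute__((opencl_private)) T;
};
#endif

template <typename T, address_space AS> class multi_ptr {
public:
  using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;

  explicit multi_ptr(pointer_t p) : data(p) {}
  pointer_t get_pointer() const { return data; }

private:
  pointer_t data;
};

// User code is written once and is identical for both compilers.
inline int first_element(multi_ptr<int, address_space::global> ptr) {
  return *ptr.get_pointer();
}

#if !defined(__SYCL_DEVICE_ONLY__)
// On the host the "annotated" pointer type is just a plain pointer.
static_assert(
    std::is_same<multi_ptr<int, address_space::global>::pointer_t, int *>::value,
    "host pointer is unannotated");

// Host-only check: the wrapper hands back the pointer it was given.
inline void host_demo() {
  int value = 42;
  multi_ptr<int, address_space::global> p(&value);
  assert(p.get_pointer() == &value);
  assert(first_element(p) == 42);
}
#endif
```

Keeping one class template and moving the mode check into the trait avoids
defining `multi_ptr` twice; whether the real headers are organized this way is
an assumption.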

## Implementation details

OpenCL attributes are handled by the Parser in all modes. OpenCL mode has
specific logic in the Sema and CodeGen components for these attributes.

The SYCL compiler re-uses the generic support for these attributes as is and
modifies the Sema and CodeGen libraries. The main difference from OpenCL mode is
that SYCL mode (like other single-source GPU programming modes such as
OpenMP/CUDA/HIP) keeps the "default" address space for declarations without
address space attribute annotations. This keeps the code shared between the
host and device semantically correct for both compilers: the regular C++ host
compiler and the SYCL compiler.

To make all pointers without an explicit address space qualifier point to the
generic address space, we updated the SPIR target address space map, which
currently maps default pointers to the "private" address space. We made this
change specific to SYCL by adding a SYCL environment component to the Triple,
to avoid affecting other modes that target SPIR (e.g. OpenCL). We would be glad
to get feedback from the community on whether this mapping can be changed for
all modes, so that the additional specialization can be avoided (e.g.
[AMDGPU](https://github.com/llvm/llvm-project/blob/master/clang/lib/Basic/Targets/AMDGPU.cpp#L329)
maps default to the "generic" address space with a couple of exceptions).
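
The effect of the map change can be illustrated with a simplified,
self-contained model. This is only a sketch: `LangAS`, `kSPIRMap`, `kSYCLMap`,
`targetAddressSpace`, and `checkMaps` are hypothetical names, not Clang's
actual data structures, and the numeric values assume the usual SPIR numbering
(0 private, 1 global, 2 constant, 3 local, 4 generic).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified set of language-level address spaces.
enum class LangAS : std::size_t {
  Default, Global, Local, Constant, Private, Generic, Count
};

// Assumed SPIR target address space numbers.
constexpr unsigned kPrivate = 0, kGlobal = 1, kConstant = 2, kLocal = 3,
                   kGeneric = 4;

using AddrSpaceMap =
    std::array<unsigned, static_cast<std::size_t>(LangAS::Count)>;

// Existing behavior described above: Default maps to private.
const AddrSpaceMap kSPIRMap = {kPrivate,  kGlobal,  kLocal,
                               kConstant, kPrivate, kGeneric};

// Proposed SYCL-specific variant: only the Default entry changes, to generic.
const AddrSpaceMap kSYCLMap = {kGeneric,  kGlobal,  kLocal,
                               kConstant, kPrivate, kGeneric};

// Picks the map based on whether the triple carries the SYCL environment
// component (modeled here as a plain bool).
inline unsigned targetAddressSpace(LangAS as, bool isSYCLEnvironment) {
  const AddrSpaceMap &map = isSYCLEnvironment ? kSYCLMap : kSPIRMap;
  return map[static_cast<std::size_t>(as)];
}

// Sanity check: the two maps differ only for unannotated pointers.
inline void checkMaps() {
  assert(targetAddressSpace(LangAS::Default, false) == kPrivate);
  assert(targetAddressSpace(LangAS::Default, true) == kGeneric);
  assert(targetAddressSpace(LangAS::Local, true) ==
         targetAddressSpace(LangAS::Local, false));
}
```

In this model, dropping the SYCL-specific map would amount to changing the
`Default` entry of `kSPIRMap` itself, which is the question raised above.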

There are a few cases where CodeGen assigns a non-default address space:

1. Declarations explicitly annotated with an address space attribute.
2. Variables with static storage duration and string literals are allocated in
   the global address space unless a specific address space is specified.
3. Variables with automatic storage duration are allocated in the private
   address space. This is the current compiler behavior and does not require
   additional changes.

For cases (2) and (3), once a "default" pointer to such a variable is obtained,
it is immediately addrspacecast'ed to generic, because the user does not (and
should not) specify an address space for pointers in source code.
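
The rules above can be restated as a small decision function. This is a toy
model for discussion, not the CodeGen implementation: `AS`, `Storage`, `Decl`,
`allocationAS`, and `pointerAS` are hypothetical names, and the handling of
case (1) in `pointerAS` (the pointer keeps the annotated address space) is an
assumption.

```cpp
#include <cassert>

// Hypothetical, simplified model of a declaration.
enum class AS { Default, Global, Local, Private, Generic };
enum class Storage { Static, Automatic, StringLiteral };

struct Decl {
  Storage storage;
  AS explicitAS; // AS::Default means "no address space attribute"
};

// Address space in which the object itself is allocated (cases 1-3).
inline AS allocationAS(const Decl &d) {
  if (d.explicitAS != AS::Default)
    return d.explicitAS; // (1) explicit annotation wins
  switch (d.storage) {
  case Storage::Static:
  case Storage::StringLiteral:
    return AS::Global; // (2) static storage and string literals
  case Storage::Automatic:
    return AS::Private; // (3) automatic storage
  }
  return AS::Default; // not reached for valid enumerators
}

// Address space of the pointer seen by user code when the address is taken.
inline AS pointerAS(const Decl &d) {
  // Assumption: annotated declarations keep their address space; for cases
  // (2) and (3) the "default" pointer is addrspacecast'ed to generic.
  return d.explicitAS != AS::Default ? d.explicitAS : AS::Generic;
}

// Example: a local variable lives in private memory, but `&var` is generic.
inline void checkAutomatic() {
  Decl local = {Storage::Automatic, AS::Default};
  assert(allocationAS(local) == AS::Private);
  assert(pointerAS(local) == AS::Generic);
}
```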

A draft patch containing the complete change set is available
[here](https://github.com/bader/llvm/pull/18/).

Does this approach seem reasonable?

Thanks,
Alexey

