<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:2.0cm 42.5pt 2.0cm 3.0cm;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:446777191;
mso-list-template-ids:-129708036;}
@list l0:level1
{mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level2
{mso-level-tab-stop:72.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level3
{mso-level-tab-stop:108.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level4
{mso-level-tab-stop:144.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level5
{mso-level-tab-stop:180.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level6
{mso-level-tab-stop:216.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level7
{mso-level-tab-stop:252.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level8
{mso-level-tab-stop:288.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level9
{mso-level-tab-stop:324.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="RU" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">We would like to re-use the OpenCL address space attributes for SYCL to target the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">SPIR-V format and enable efficient memory access on GPUs.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```c++<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> __attribute__((opencl_global))<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> __attribute__((opencl_local))<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> __attribute__((opencl_private))<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">The first patch, which enables conversion between pointers annotated with an OpenCL<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">address space attribute and "default" pointers, is under review here:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">https://reviews.llvm.org/D80932.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Before moving further with the implementation, we would like to discuss two<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">questions raised in the review comments (https://reviews.llvm.org/D80932#2085848).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US">## Using attributes to annotate memory allocations</span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">The introduction of the SYCL 1.2.1 specification describes multiple compilation<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">flows intended by the design:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> SYCL is designed to allow a compilation flow where the source file is passed<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> through multiple different compilers, including a standard C++ host compiler<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> of the developer’s choice, and where the resulting application combines the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> results of these compilation passes. This is distinct from a single-source<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> flow that might use language extensions that preclude the use of a standard<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> host compiler. The SYCL standard does not preclude the use of a single<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> compiler flow, but is designed to not require it.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> The advantages of this design are two-fold. First, it offers better<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> integration with existing tool chains. An application that already builds<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> using a chosen compiler can continue to do so when SYCL code is added. Using<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> the SYCL tools on a source file within a project will both compile for an<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> OpenCL device and let the same source file be compiled using the same host<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> compiler that the rest of the project is compiled with. Linking and library<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> relationships are unaffected. This design simplifies porting of pre-existing<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> applications to SYCL. Second, the design allows the optimal compiler to be<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> chosen for each device where different vendors may provide optimized<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> tool-chains.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> SYCL is designed to be as close to standard C++ as possible. In practice,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> this means that as long as no dependence is created on SYCL’s integration<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> with OpenCL, a standard C++ compiler can compile the SYCL programs and they<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> will run correctly on host CPU. Any use of specialized low-level features<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> can be masked using the C preprocessor in the same way that<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> compiler-specific intrinsics may be hidden to ensure portability between<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">> different host compilers.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Following this approach, SYCL uses C++ templates to represent pointers to<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">disjoint memory regions on an accelerator, so that the same code compiles with both a<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">standard C++ toolchain and a SYCL compiler toolchain.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">For instance:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```c++<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">// check whether SYCL device mode is ON and non-standard annotations can be used<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#if !defined(__SYCL_DEVICE_ONLY__)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">// CPU/host implementation<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">template <typename T, address_space AS> class multi_ptr {<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> T *data; // ignore address space parameter on CPU<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> public:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> T *get_pointer() { return data; }<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">};<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#else<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">// GPU/accelerator implementation<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">template <typename T, address_space AS> class multi_ptr {<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> // GetAnnotatedPointer<T, global>::type == "__attribute__((opencl_global)) T"<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> pointer_t data;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> public:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> pointer_t get_pointer() { return data; }<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">};<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">#endif<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Users can use the `multi_ptr` class as a regular user-defined type in ordinary C++ code:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```c++<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">int *UserFunc(multi_ptr<int, global> ptr) {<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> /// ...<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> return ptr.get_pointer();<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">}<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">```<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Depending on the compiler mode, `multi_ptr` either annotates its internal data<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">with an address space attribute or leaves it unannotated.<o:p></o:p></span></p>
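<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">To make the `GetAnnotatedPointer` comment above more concrete, here is a minimal sketch of how such a trait could select the annotated type. The `address_space` enumerator names, the layout of the trait and the `multi_ptr` constructor are assumptions made for illustration and are not taken from the patch. A standard C++ compiler only sees the host branch.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<pre>
```cpp
#include <type_traits>

// Hypothetical address space tags; the real SYCL enumerators may differ.
enum class address_space { global_space, local_space, private_space };

// Primary template: outside device-only mode the type stays unannotated,
// so a standard C++ host compiler sees a plain T.
template <typename T, address_space AS> struct GetAnnotatedPointer {
  using type = T;
};

#if defined(__SYCL_DEVICE_ONLY__)
// Device-only specializations attach the OpenCL address space attributes.
template <typename T>
struct GetAnnotatedPointer<T, address_space::global_space> {
  using type = __attribute__((opencl_global)) T;
};
template <typename T>
struct GetAnnotatedPointer<T, address_space::local_space> {
  using type = __attribute__((opencl_local)) T;
};
template <typename T>
struct GetAnnotatedPointer<T, address_space::private_space> {
  using type = __attribute__((opencl_private)) T;
};
#endif

template <typename T, address_space AS> class multi_ptr {
public:
  // Plain "T *" on the host, attribute-annotated pointer on the device.
  using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;

  explicit multi_ptr(pointer_t p) : data(p) {}
  pointer_t get_pointer() const { return data; }

private:
  pointer_t data;
};
```
</pre>
<p class="MsoNormal"><span lang="EN-US">With a host compiler, `multi_ptr<int, address_space::global_space>::pointer_t` in this sketch is simply `int *`, which is what lets the same user code build without any SYCL-specific support.<o:p></o:p></span></p>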
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US">## Implementation details</span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">OpenCL attributes are handled by the Parser in all modes. OpenCL mode has specific<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">logic in the Sema and CodeGen components for these attributes.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">The SYCL compiler re-uses the generic support for these attributes as is and modifies<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">the Sema and CodeGen libraries. The main difference from OpenCL mode is that SYCL<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">mode (like other single-source GPU programming modes such as OpenMP/CUDA/HIP)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">keeps the "default" address space for declarations without address space<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">attribute annotations. This keeps the code shared between the host and device<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">semantically correct for both compilers: the regular C++ host compiler and the SYCL<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">compiler.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">To make all pointers without an explicit address space qualifier point to the generic<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">address space, we updated the SPIR target address space map, which currently maps<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">default pointers to the "private" address space. We made this change specific to SYCL<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">by adding a SYCL environment component to the Triple, to avoid any impact on other<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">modes targeting the SPIR target (e.g. OpenCL). We would be glad to get feedback from<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">the community on whether this mapping change is applicable to all modes, so that the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">additional specialization can be avoided (e.g.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">[AMDGPU](<u>https://github.com/llvm/llvm-project/blob/master/clang/lib/Basic/Targets/AMDGPU.cpp#L329</u>)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">maps default to the "generic" address space with a couple of exceptions).<o:p></o:p></span></p>
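<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">The effect of the mapping change can be pictured with a simplified model. The sketch below is hypothetical: the `LangAS` enumerators, the table names and the `IsSYCLEnvironment` flag are illustrative stand-ins rather than Clang's actual data structures, and the target numbers assume the usual SPIR convention (0 = private, 1 = global, 2 = constant, 3 = local, 4 = generic).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<pre>
```cpp
#include <cstddef>

// Hypothetical, simplified set of language-level address spaces.
enum class LangAS : std::size_t {
  Default, Global, Local, Constant, Private, Generic
};

// Target address space numbers, assuming the SPIR convention:
// 0 = private, 1 = global, 2 = constant, 3 = local, 4 = generic.
// Both tables are indexed by LangAS in declaration order.
constexpr unsigned DefaultIsPrivateMap[] = {0, 1, 3, 2, 0, 4}; // current map
constexpr unsigned DefaultIsGenericMap[] = {4, 1, 3, 2, 0, 4}; // SYCL environment

// Selects the table by environment; only the Default entry differs.
constexpr unsigned getTargetAddressSpace(LangAS AS, bool IsSYCLEnvironment) {
  return IsSYCLEnvironment
             ? DefaultIsGenericMap[static_cast<std::size_t>(AS)]
             : DefaultIsPrivateMap[static_cast<std::size_t>(AS)];
}
```
</pre>
<p class="MsoNormal"><span lang="EN-US">In this model explicitly annotated declarations map identically in both environments; only unannotated ("default") declarations move from private to generic when the SYCL environment is selected.<o:p></o:p></span></p>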
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">There are a few cases where CodeGen assigns a non-default address space:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">1. Declarations explicitly annotated with an address space attribute.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">2. Variables with static storage duration and string literals are allocated in the<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> global address space unless a specific address space is specified.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">3. Variables with automatic storage duration are allocated in the private address<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> space. This is the current compiler behavior and requires no additional<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"> changes.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">In cases (2) and (3), once a "default" pointer to such a variable is obtained, it<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">is immediately addrspacecast'ed to generic, because a user does not (and should<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">not) specify an address space for pointers in source code.<o:p></o:p></span></p>
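<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">As an illustration of cases (2) and (3), the following ordinary C++ snippet is annotated with where each object would be allocated in SYCL device mode under the rules described above. The function and variable names are hypothetical, and the comments describe the intended device behavior rather than output from the patch; on the host this is plain C++.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<pre>
```cpp
// Case (2): static storage duration, so allocated in the global address space.
static const int Offset = 40;

// An unannotated pointer parameter is a "default" pointer, which the proposed
// SYCL mapping treats as a pointer to the generic address space.
constexpr int load(const int *Ptr) { return *Ptr; }

constexpr int sumThroughDefaultPointers() {
  int Local = 2;                 // case (3): automatic storage, private address space
  const int *ToStatic = &Offset; // device: addrspacecast global -> generic
  const int *ToLocal = &Local;   // device: addrspacecast private -> generic
  return load(ToStatic) + load(ToLocal);
}
```
</pre>
<p class="MsoNormal"><span lang="EN-US">The source contains no address space annotations, so it stays valid for a regular host compiler, while `load` can accept pointers that originate from either address space.<o:p></o:p></span></p>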
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">A draft patch containing the complete change set is available<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">[here](<u>https://github.com/bader/llvm/pull/18/</u>).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Does this approach seem reasonable?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal">Thanks,<o:p></o:p></p>
<p class="MsoNormal">Alexey<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
</body>
</html>