[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 20 11:27:26 PDT 2021


tra created this revision.
tra added reviewers: jlebar, yaxunl, hliao.
Herald added subscribers: bixia, mgorny.
Herald added a reviewer: a.sidorin.
tra requested review of this revision.
Herald added a project: clang.

The patch implements support for texture lookups (mostly) in a header file.

The patch has been tested on a source file covering all combinations of argument types supported by the CUDA headers.
The file compiles, and the generated instructions and their parameters match the code generated by NVCC.
Unfortunately, compiling texture code requires the CUDA headers, so it can't be tested in clang itself.
The test will need to be added to the test-suite later.

While the generated code compiles and seems to match NVCC, I do not have any code that uses textures, so I could not test the correctness of the implementation.

The gory details of the implementation follow.

------------------------------------

The user-facing texture lookup API relies on NVCC's `__nv_tex_surf_handler` builtin, which is actually a set of overloads.
The catch is that it's overloaded not only on the argument types, but also on the value of the first argument.

Implementing it in the compiler itself would be rather messy as there are a lot of texture lookup variants.

Implementing texture lookups in C++ is somewhat more maintainable. 
If we could use string literals as template parameters, the implementation could be done completely in the headers.
Unfortunately, literal classes as template parameters are only available in C++20.

One alternative would be to use run-time dispatch, but, given that a texture lookup is a single instruction, the overhead would be substantial-to-prohibitive.
Instead, this patch introduces the `__nvvm_texture_op` builtin, which maps known texture operations to an integer, which is then used to parametrize texture operations.
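
A rough host-only sketch of that idea, not the code from the patch: a `constexpr` function stands in for the builtin and turns an operation name into an integer, which then selects a template specialization at compile time. The names `str_eq`, `op_id`, `Fetch` and `HANDLER_DIMS`, the operation strings, and the integer values are all hypothetical.

```cpp
// Hypothetical stand-in for the builtin: maps an operation name to an
// integer at compile time. The names and IDs are made up for illustration.
constexpr bool str_eq(const char *a, const char *b) {
  return *a == *b && (*a == '\0' || str_eq(a + 1, b + 1));
}

constexpr int op_id(const char *name) {
  return str_eq(name, "__tex1D") ? 1
       : str_eq(name, "__tex2D") ? 2
       : 0;  // 0 means "unknown operation"
}

// One specialization per operation, selected by the integer. The primary
// template is left undefined, so an unknown operation fails to compile.
template <int Op> struct Fetch;
template <> struct Fetch<1> { static constexpr int dims() { return 1; } };
template <> struct Fetch<2> { static constexpr int dims() { return 2; } };

// The macro keeps the string literal visible at the call site, so the
// mapping is a constant expression and can serve as a template argument.
#define HANDLER_DIMS(op) (Fetch<op_id(op)>::dims())

// Dispatch is resolved entirely at compile time; no run-time branch remains.
static_assert(HANDLER_DIMS("__tex1D") == 1, "1D variant selected");
static_assert(HANDLER_DIMS("__tex2D") == 2, "2D variant selected");
```

The sketch needs only C++11, which is the point of the approach: the string never has to become a template parameter itself.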

A lot of texture operations are fairly uniform, with the differences only in the instruction suffix. 
Unfortunately, inline assembly requires its input to be a string literal, so we cannot rely on templates to generate it and have to resort to the preprocessor to do the job.
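
To illustrate why the preprocessor fits here, the following host-only sketch builds instruction strings by adjacent string-literal concatenation, so each result is still a single string literal of the kind an `asm` statement accepts. The macros `TEX_INSTR` and `DEFINE_VARIANT`, the generated function names, and the exact instruction text are hypothetical and not copied from the patch.

```cpp
// Compile-time string comparison, used only to check the generated text.
constexpr bool str_eq(const char *a, const char *b) {
  return *a == *b && (*a == '\0' || str_eq(a + 1, b + 1));
}

// Hypothetical macro: adjacent literals are merged by the compiler, so the
// expansion is one string literal that differs only in its suffixes.
#define TEX_INSTR(geom, type) "tex." geom ".v4." type ".s32"

// Hypothetical boilerplate generator: one function per instruction variant.
#define DEFINE_VARIANT(name, geom, type) \
  constexpr const char *name() { return TEX_INSTR(geom, type); }

DEFINE_VARIANT(tex1d_f32, "1d", "f32")
DEFINE_VARIANT(tex2d_s32, "2d", "s32")

static_assert(str_eq(tex1d_f32(), "tex.1d.v4.f32.s32"), "1D float variant");
static_assert(str_eq(tex2d_s32(), "tex.2d.v4.s32.s32"), "2D int variant");
```

In a real header the expansion of something like `TEX_INSTR` would sit directly inside the `asm` statement rather than being returned from a function.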

Another quirk is that, historically, there were two ways to refer to a texture.
The newer API uses `cudaTextureObject_t`, which is an opaque scalar value.
Older APIs used an object of `texture<>` type, which was magically converted to an opaque texture handle (essentially the `cudaTextureObject_t`).
There's no good way to do this conversion explicitly, so each texture lookup would have to be implemented twice, once for each way to refer to a texture.
However, we can cheat a bit by introducing a dummy inline assembly statement.
Nominally it accepts `texture<>` as input, but the compiler will convert it to `cudaTextureObject_t`, so the generated assembly will just return the correct handle.
This allows both reference styles to use the same implementation.

Overall code structure:

- `struct __FT;` // maps texture data type to the 4-element texture fetch result type.
- `class __tex_fetch_v4<__op>;` // implements `run` methods for specific texture data types.
- `class __convert<DstT,SrcT>;` // converts the result of `__tex_fetch_v4` into the expected return type (usually a smaller slice of the 4-element fetch result).
- `__tex_fetch<__op,...>();` // calls the appropriate `__convert(__tex_fetch_v4())` variants.
- `#define __nv_tex_surf_handler(__op, __ptr, ...)` // calls the appropriate `__tex_fetch<>`.
- `__IMPL*` macros do the boilerplate generation of `__tex_fetch_v4` variants.
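
The layering listed above can be pictured with a small host-only analogue. This is a sketch under assumptions, not the header from the patch: `Vec4`, `Vec2`, `FetchType`, `FetchV4`, `Convert` and `tex_fetch` are hypothetical stand-ins, and the "fetch" just fabricates values instead of issuing a texture instruction.

```cpp
// Hypothetical vector types standing in for the CUDA float4/float2.
struct Vec4 { float x, y, z, w; };
struct Vec2 { float x, y; };

// Analogue of __FT: maps a texture data type to the 4-element fetch type.
template <class T> struct FetchType;
template <> struct FetchType<float> { using type = Vec4; };
template <> struct FetchType<Vec2> { using type = Vec4; };

// Analogue of __tex_fetch_v4<op>: every variant yields four elements.
template <int Op> struct FetchV4;
template <> struct FetchV4<1> {
  // Fake fetch: returns values derived from the coordinate.
  static constexpr Vec4 run(float x) { return Vec4{x, x + 1, x + 2, x + 3}; }
};

// Analogue of __convert<DstT, SrcT>: slices the 4-element result.
template <class Dst, class Src> struct Convert;
template <> struct Convert<float, Vec4> {
  static constexpr float run(Vec4 v) { return v.x; }
};
template <> struct Convert<Vec2, Vec4> {
  static constexpr Vec2 run(Vec4 v) { return Vec2{v.x, v.y}; }
};

// Analogue of __tex_fetch<op, ...>: fetch four elements, then convert.
template <int Op, class T>
constexpr T tex_fetch(float x) {
  return Convert<T, typename FetchType<T>::type>::run(FetchV4<Op>::run(x));
}

static_assert(tex_fetch<1, float>(10.0f) == 10.0f, "scalar keeps element 0");
static_assert(tex_fetch<1, Vec2>(10.0f).y == 11.0f, "Vec2 keeps elements 0-1");
```

Keeping the fetch always 4 elements wide means only the conversion layer varies with the return type, which keeps the number of fetch variants down.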

  


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110089

Files:
  clang/include/clang/Basic/Builtins.def
  clang/include/clang/Sema/Sema.h
  clang/lib/AST/ExprConstant.cpp
  clang/lib/Headers/CMakeLists.txt
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/__clang_cuda_texture_intrinsics.h
  clang/lib/Sema/SemaChecking.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D110089.373650.patch
Type: text/x-patch
Size: 40469 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20210920/02f02e76/attachment-0001.bin>

