[clang] [CUDA][HIP] Fix host/device context in concept (PR #67721)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 17 19:55:13 PDT 2023


================
@@ -176,3 +176,34 @@ Predefined Macros
    * - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
      - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
 
+C++20 Concepts with HIP and CUDA
+--------------------------------
+
+In Clang, when working with HIP or CUDA, it's important to note that all constraints in C++20 concepts are assumed to be for the host side only. This behavior is consistent across both programming models, and developers should be aware of this assumption when writing code that utilizes C++20 concepts.
+
+Example:
+.. code-block:: c++
+
+   template <class T>
+   concept MyConcept = requires(T& obj) {
+     my_function(obj);  // Assumed to be a host-side requirement
----------------
yxsamliu wrote:

For a typical use case of concepts in CUDA programs, please see https://godbolt.org/z/o7Wa68n9c

This is taken from issue https://github.com/llvm/llvm-project/issues/67507.

In this example, users want to express two constraints on geometric_shape:

1. it can be passed to a function draw

2. it can be passed to a function area and the result is convertible to double

For the first constraint, users need it only on the host side. For the second constraint, users need it on both the host side and the device side. This shows that users' constraint requirements usually differ between the host and device sides, since they may want to do different things on each side. Therefore, requiring every constraint in a concept to be satisfied on both the device and host sides would impose unnecessary extra constraints on one side or the other.

Is it OK to evaluate the constraints in the context where the template is instantiated? For example, when we instantiate the kernel `template <geometric_shape T> __global__ void compute_areas`, can we evaluate the constraints in the device context to get what we need? No, because then the constraint on function `draw` would also have to be satisfied on the device side, which is not what we need. The point is that each constraint defined in a concept needs its own required context: we want to be able to express that one constraint must be satisfied in the device context and another in the host context. That is why I propose allowing `__device__` and `__host__` attributes on call expressions in concepts, to indicate the required context of an individual constraint.

Now that we have discussed users' needs regarding the device/host contexts of constraints, let's look at how nvcc currently evaluates constraint satisfaction.

Based on https://godbolt.org/z/o7Wa68n9c , the instantiation of `work<triangle>` succeeds, so we can infer that `triangle` satisfies both constraints. They can only have been evaluated in the host context, since `draw` and `area` are both host functions. So even though `work<triangle>` is instantiated in a device context, the constraints are still evaluated in the host context.

The current patch matches nvcc's behaviour.

https://github.com/llvm/llvm-project/pull/67721

