[clang] [Clang][HIP][CUDA] Add `__cluster_dims__` and `__no_cluster__` attribute (PR #156686)

Erich Keane via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 30 12:20:07 PDT 2025


================
@@ -707,6 +707,38 @@ static void instantiateDependentAMDGPUMaxNumWorkGroupsAttr(
     S.AMDGPU().addAMDGPUMaxNumWorkGroupsAttr(New, Attr, XExpr, YExpr, ZExpr);
 }
 
+static void instantiateDependentCUDAClusterDimsAttr(
+    Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
+    const CUDAClusterDimsAttr &Attr, Decl *New) {
+  EnterExpressionEvaluationContext Unevaluated(
+      S, Sema::ExpressionEvaluationContext::ConstantEvaluated);
+
+  Expr *XExpr = nullptr;
+  Expr *YExpr = nullptr;
+  Expr *ZExpr = nullptr;
+
+  if (Attr.getX()) {
+    ExprResult ResultX = S.SubstExpr(Attr.getX(), TemplateArgs);
+    if (ResultX.isUsable())
+      XExpr = ResultX.getAs<Expr>();
+  }
+
+  if (Attr.getY()) {
+    ExprResult ResultY = S.SubstExpr(Attr.getY(), TemplateArgs);
+    if (ResultY.isUsable())
+      YExpr = ResultY.getAs<Expr>();
+  }
+
+  if (Attr.getZ()) {
+    ExprResult ResultZ = S.SubstExpr(Attr.getZ(), TemplateArgs);
+    if (ResultZ.isUsable())
+      ZExpr = ResultZ.getAs<Expr>();
+  }
+
+  if (XExpr)
----------------
erichkeane wrote:

Is this quite right? The logic/behavior here is awkward: we end up creating the attribute ONLY if the `X` value instantiates correctly, yet we carry on past errors in the `Y` and `Z` instantiations, whereas our initial creation of the attribute with dependent types only adds the attribute if all 3 are handled successfully.

You'd see this with a partial instantiation: the attribute could fail 3x across instantiations as they go right-to-left, but only 1x if the `X` fails immediately.

IMO, we should either always create this, or only create it if ALL of the above succeed.

ALSO-ALSO: Is there value in instantiating ALL 3 even if the 1st fails?
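
For illustration only, here is a minimal sketch of the "all or nothing" option with an early exit on the first failure. It is a toy model, not Clang code: `ToyExpr`, `ToyClusterDims`, `SubstFn`, and `instantiateAllOrNothing` are hypothetical names, `SubstFn` merely stands in for a substitution step that can fail, and the case where `Y` or `Z` is absent is left out for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>

// Hypothetical stand-in for an expression produced by substitution.
struct ToyExpr {
  int Value = 0;
};

// Hypothetical stand-in for the instantiated attribute (all three dims set).
struct ToyClusterDims {
  ToyExpr X, Y, Z;
};

// Hypothetical stand-in for a substitution step: std::nullopt means failure.
using SubstFn = std::function<std::optional<ToyExpr>(int)>;

// Substitute X, Y, Z in order. Stop at the first failure, and build the
// attribute only if all three substitutions succeeded. NumSubstCalls counts
// how many substitutions were attempted, so the early exit is observable.
std::optional<ToyClusterDims>
instantiateAllOrNothing(const std::array<int, 3> &Dependent,
                        const SubstFn &Subst, int &NumSubstCalls) {
  std::array<ToyExpr, 3> Out{};
  for (std::size_t I = 0; I < Dependent.size(); ++I) {
    ++NumSubstCalls;
    std::optional<ToyExpr> Result = Subst(Dependent[I]);
    if (!Result)
      return std::nullopt; // One failure: no attribute, no further work.
    Out[I] = *Result;
  }
  return ToyClusterDims{Out[0], Out[1], Out[2]};
}
```

With this shape, a failure in any one dimension yields a single failed substitution per instantiation and no attribute, regardless of which of the three failed.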

https://github.com/llvm/llvm-project/pull/156686

