[clang] [Clang][HIP][CUDA] Add `__cluster_dims__` and `__no_cluster__` attribute (PR #156686)
Erich Keane via cfe-commits
cfe-commits at lists.llvm.org
Tue Sep 30 12:20:07 PDT 2025
================
@@ -707,6 +707,38 @@ static void instantiateDependentAMDGPUMaxNumWorkGroupsAttr(
S.AMDGPU().addAMDGPUMaxNumWorkGroupsAttr(New, Attr, XExpr, YExpr, ZExpr);
}
+static void instantiateDependentCUDAClusterDimsAttr(
+ Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
+ const CUDAClusterDimsAttr &Attr, Decl *New) {
+ EnterExpressionEvaluationContext Unevaluated(
+ S, Sema::ExpressionEvaluationContext::ConstantEvaluated);
+
+ Expr *XExpr = nullptr;
+ Expr *YExpr = nullptr;
+ Expr *ZExpr = nullptr;
+
+ if (Attr.getX()) {
+ ExprResult ResultX = S.SubstExpr(Attr.getX(), TemplateArgs);
+ if (ResultX.isUsable())
+ XExpr = ResultX.getAs<Expr>();
+ }
+
+ if (Attr.getY()) {
+ ExprResult ResultY = S.SubstExpr(Attr.getY(), TemplateArgs);
+ if (ResultY.isUsable())
+ YExpr = ResultY.getAs<Expr>();
+ }
+
+ if (Attr.getZ()) {
+ ExprResult ResultZ = S.SubstExpr(Attr.getZ(), TemplateArgs);
+ if (ResultZ.isUsable())
+ ZExpr = ResultZ.getAs<Expr>();
+ }
+
+ if (XExpr)
----------------
erichkeane wrote:
Is this quite right? The logic/behavior here is awkward: we end up creating the attribute ONLY if the `X` value instantiated correctly, while errors in `Y` and `Z` instantiation are ignored and we carry on. Our initial creation of the attribute with dependent types, by contrast, only succeeds in adding the attribute if all 3 are successfully handled.
You'd see this with a partial instantiation: this attribute could fail 3x across instantiations as they go right-to-left, but only 1x if the `X` fails immediately.
IMO, we should either always create this, or only create it if NONE of the above fail.
Also: is there value in instantiating ALL 3 even if the 1st fails?
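To make the "only create it if all 3 are handled" option concrete, here is a toy sketch of an all-or-nothing policy that also stops at the first failure. It is not Clang code: `FakeExpr`, `fakeSubst`, `substAllOrNothing`, and `checkAllOrNothing` are hypothetical stand-ins for `Expr *`, `S.SubstExpr`, and the instantiation helper, and the rule that negative inputs fail is an assumption made purely for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for a substituted expression; not a Clang type.
struct FakeExpr {
  int Value;
};

// Hypothetical stand-in for substitution: assumed to fail on negative inputs.
inline std::optional<FakeExpr> fakeSubst(int In) {
  if (In < 0)
    return std::nullopt;
  return FakeExpr{In};
}

// All-or-nothing policy: stop at the first failed substitution and report
// failure, so a caller would attach the attribute only when X, Y and Z all
// succeed. Attempts counts how many substitutions were tried.
inline std::optional<std::array<FakeExpr, 3>>
substAllOrNothing(const std::array<int, 3> &Dims, int &Attempts) {
  std::array<FakeExpr, 3> Out{};
  for (std::size_t I = 0; I < Dims.size(); ++I) {
    ++Attempts;
    std::optional<FakeExpr> R = fakeSubst(Dims[I]);
    if (!R)
      return std::nullopt; // Early exit: later dimensions are not substituted.
    Out[I] = *R;
  }
  return Out;
}

// Small self-check: success needs all three; an X failure costs one attempt.
inline void checkAllOrNothing() {
  int Attempts = 0;
  assert(substAllOrNothing({2, 2, 1}, Attempts).has_value());
  assert(Attempts == 3);
  Attempts = 0;
  assert(!substAllOrNothing({-1, 2, 1}, Attempts).has_value());
  assert(Attempts == 1);
}
```

With this shape, a failure in any dimension behaves the same way regardless of position, and nothing past the first failure is instantiated. Whether skipping the remaining substitutions is desirable (for example, for diagnostics) is exactly the open question above.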
https://github.com/llvm/llvm-project/pull/156686