[llvm] [NVPTX] Add family-specific architectures support (PR #141899)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 12 08:12:24 PDT 2025


================
@@ -33,20 +33,66 @@ class FeaturePTX<int version>:
    SubtargetFeature<"ptx"# version, "PTXVersion",
                     "" # version,
                     "Use PTX version " # version>;
-
+// NVPTX Architecture Hierarchy and Ordering:
+//
+// GPU architectures: sm_2Y/sm_3Y/sm_5Y/sm_6Y/sm_7Y/sm_8Y/sm_9Y/sm_10Y/sm_12Y
+// ('Y' represents version within the architecture)
+// The architectures have name of form sm_XYz where 'X' represent the generation
+// number, 'Y' represents the version within the architecture, and 'z' represents
+// the optional feature suffix.
+// If X1Y1 <= X2Y2, then GPU capabilities of sm_X1Y1 are included in sm_X2Y2.
+// For example, take sm_90 (9 represents 'X', 0 represents 'Y', and no feature
+// suffix) and sm_103 architectures (10 represents 'X', 3 represents 'Y', and no
+// feature suffix). Since 90 <= 103, sm_90 is compatible with sm_103.
+//
+// The family-specific architectures have 'f' feature suffix and they follow
+// following order:
+// sm_X{Y2}f > sm_X{Y1}f iff Y2 > Y1
+// sm_XY{f} > sm_{XY}{}
+//
+// For example, take sm_100f (10 represents 'X', 0 represents 'Y', and 'f'
+// represents 'z') and sm_103f (10 represents 'X', 3 represents 'Y', and 'f'
+// represents 'z') architectures. Since Y1 < Y2, sm_100f is compatible with
+// sm_103f. Similarly based on the second rule, sm_90 is compatible with sm_103f.
----------------
AlexMaclean wrote:

I think it might be good to include some counter-examples here as well. Some of the cases where an architecture isn't compatible with another.

https://github.com/llvm/llvm-project/pull/141899


More information about the llvm-commits mailing list