[llvm] [NVPTX] Add family-specific architectures support (PR #141899)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 12 08:12:23 PDT 2025


================
@@ -33,20 +33,66 @@ class FeaturePTX<int version>:
    SubtargetFeature<"ptx"# version, "PTXVersion",
                     "" # version,
                     "Use PTX version " # version>;
-
+// NVPTX Architecture Hierarchy and Ordering:
+//
+// GPU architectures: sm_2Y/sm_3Y/sm_5Y/sm_6Y/sm_7Y/sm_8Y/sm_9Y/sm_10Y/sm_12Y
+// ('Y' represents version within the architecture)
+// Architecture names have the form sm_XYz, where 'X' represents the generation
+// number, 'Y' represents the version within the architecture, and 'z' represents
+// the optional feature suffix.
+// If X1Y1 <= X2Y2, then GPU capabilities of sm_X1Y1 are included in sm_X2Y2.
+// For example, take sm_90 (9 represents 'X', 0 represents 'Y', and no feature
+// suffix) and sm_103 architectures (10 represents 'X', 3 represents 'Y', and no
+// feature suffix). Since 90 <= 103, sm_90 is compatible with sm_103.
+//
+// The family-specific architectures have the 'f' feature suffix and follow
+// this order:
+// sm_X{Y2}f > sm_X{Y1}f iff Y2 > Y1
+// sm_XY{f} > sm_{XY}{}
+//
+// For example, take sm_100f (10 represents 'X', 0 represents 'Y', and 'f'
+// represents 'z') and sm_103f (10 represents 'X', 3 represents 'Y', and 'f'
+// represents 'z') architectures. Since Y1 < Y2, sm_100f is compatible with
+// sm_103f. Similarly based on the second rule, sm_90 is compatible with sm_103f.
+//
+// The architecture-specific architectures have the 'a' feature suffix and
+// follow this order:
+// sm_XY{a} > sm_XY{f} > sm_{XY}{}
+//
+// For example, take sm_103a (10 represents 'X', 3 represents 'Y', and 'a'
+// represents 'z'), sm_103f, and sm_103 architectures. The sm_103 is compatible
+// with sm_103a and sm_103f, and sm_103f is compatible with sm_103a.
+//
+// Encoding := Arch * 100 + 10 (for 'f') + 1 (for 'a')
+// Arch := X * 10 + Y
+//
+// For example, sm_103a is encoded as 10311 (103 * 100 + 10 + 1) and sm_103f is
+// encoded as 10310 (103 * 100 + 10).
+//
+// This encoding allows simple partial ordering of the architectures.
+//  + Compare Family and Arch by dividing FullSMVersion by 1000 and 100
+//    respectively before the comparison.
+//  + Compare within the family by comparing FullSMVersion, given both belong to
+//    the same family.
+//  + Detect 'a' variants by checking FullSMVersion % 10.
 foreach sm = [20, 21, 30, 32, 35, 37, 50, 52, 53,
               60, 61, 62, 70, 72, 75, 80, 86, 87,
-              89, 90, 100, 101, 103, 120, 121] in
-  def SM#sm: FeatureSM<""#sm, !mul(sm, 10)>;
+              89, 90, 100, 101, 103, 120, 121] in {
+  // Base SM version (e.g. FullSMVersion for sm_100 is 10000)
+  def SM#sm : FeatureSM<""#sm, !mul(sm, 100)>;
 
-// Arch-specific targets. PTX for these is not compatible with any other
-// architectures.
-def SM90a : FeatureSM<"90a", 901>;
-def SM100a: FeatureSM<"100a", 1001>;
-def SM101a: FeatureSM<"101a", 1011>;
-def SM103a: FeatureSM<"103a", 1031>;
-def SM120a: FeatureSM<"120a", 1201>;
-def SM121a: FeatureSM<"121a", 1211>;
+  // Family-specific targets which are compatible within the same family
+  // (e.g. FullSMVersion for sm_100f is 10010)
+  if !ge(sm, 100) then {
----------------
AlexMaclean wrote:

Nit: remove `{}`
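
For context, the encoding described in the quoted comment can be sketched in C++ roughly as follows. This is only an illustration, not code from the patch: `encodeSM`, `smFamily`, `smArch`, and `isArchSpecific` are hypothetical names, and the sketch assumes (as the sm_103a = 10311 example implies) that the 'a' suffix also sets the 'f' digit.

```cpp
// Hypothetical helpers illustrating the FullSMVersion encoding from the
// quoted comment; none of these names appear in the patch.

// Encoding := Arch * 100 + 10 (for 'f') + 1 (for 'a'), Arch := X * 10 + Y.
// Assumption: 'a' implies the 'f' digit too, matching sm_103a == 10311.
constexpr unsigned encodeSM(unsigned Arch, char Suffix = '\0') {
  unsigned FamilyDigit = (Suffix == 'f' || Suffix == 'a') ? 10 : 0;
  unsigned ArchDigit = (Suffix == 'a') ? 1 : 0;
  return Arch * 100 + FamilyDigit + ArchDigit;
}

// Family is the generation number 'X' (e.g. 10 for sm_103a).
constexpr unsigned smFamily(unsigned FullSMVersion) {
  return FullSMVersion / 1000;
}

// Arch is 'XY' without any suffix (e.g. 103 for sm_103a).
constexpr unsigned smArch(unsigned FullSMVersion) {
  return FullSMVersion / 100;
}

// 'a' variants are the only ones with a non-zero last digit.
constexpr bool isArchSpecific(unsigned FullSMVersion) {
  return FullSMVersion % 10 != 0;
}
```

With these helpers, `encodeSM(103, 'a')` and `encodeSM(103, 'f')` reproduce the 10311 and 10310 values given in the comment, and both report family 10 and arch 103.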

https://github.com/llvm/llvm-project/pull/141899

