[llvm] [NVPTX] Add family-specific architectures support (PR #141899)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu May 29 11:15:37 PDT 2025


================
@@ -36,17 +36,20 @@ class FeaturePTX<int version>:
 
 foreach sm = [20, 21, 30, 32, 35, 37, 50, 52, 53,
               60, 61, 62, 70, 72, 75, 80, 86, 87,
-              89, 90, 100, 101, 103, 120, 121] in
+              89, 90] in
   def SM#sm: FeatureSM<""#sm, !mul(sm, 10)>;
 
-// Arch-specific targets. PTX for these is not compatible with any other
-// architectures.
-def SM90a : FeatureSM<"90a", 901>;
-def SM100a: FeatureSM<"100a", 1001>;
-def SM101a: FeatureSM<"101a", 1011>;
-def SM103a: FeatureSM<"103a", 1031>;
-def SM120a: FeatureSM<"120a", 1201>;
-def SM121a: FeatureSM<"121a", 1211>;
+// Full SM version for sm_90a is 901
+def SM90a: FeatureSM<"90a", 901>;
+
+foreach sm = [100, 101, 103, 120, 121] in {
+  def SM#sm: FeatureSM<""#sm, !mul(sm, 10)>;
+  // Arch-specific targets. PTX for these is not compatible with any other
+  // architectures.
+  def SM#sm#a: FeatureSM<""#sm#"a", !add(!mul(sm, 10), 1)>;
+  // Family-specific targets. PTX for these is compatible within the same family.
+  def SM#sm#f: FeatureSM<""#sm#"f", !add(!mul(sm, 10), 2)>;
----------------
Artem-B wrote:

PTX docs say: `PTX for family-specific targets is compatible with all subsequent targets in same family.`

I'm not sure that I understand what it means.
E.g. how is sm_100 different from sm_100f? My understanding is that, as a baseline architecture, sm_100 is supposed to be compatible with all subsequent targets in the family. At least that's how it has worked so far.

Does `f` mean "the subset of the architecture-specific features, normally available in `a`, that will be compatible with subsequent targets in the same family"? I.e. they are only ordered within the major architecture, but are not comparable with other major architectures.

- `a` is the superset, not comparable with anything else
- `f` is a subset of `a`, a superset of the plain variant, and is partially ordered within the family, but not across families.
- the un-suffixed variant is ordered across all un-suffixed variants.
- because `a` and `f` are supersets of their plain variants, they are by extension also supersets of all the older plain variants.
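
The ordering described in the list above can be sketched in C++ roughly as follows. This is only an illustration of my reading of the bullets, not existing NVPTX code: `SmTarget`, `Suffix`, `family`, and `isSupersetOf` are hypothetical names, the family is assumed to be the SM version divided by 10, and `f`/`a` variants of a later SM in the same family are assumed to include earlier `f` features by transitivity.

```cpp
// Hypothetical model of the plain / f / a feature ordering.
enum class Suffix { None, F, A };

struct SmTarget {
  unsigned Version; // e.g. 100 for sm_100, sm_100f and sm_100a
  Suffix Kind;
};

// Assumption: the family is the major architecture, i.e. Version / 10.
constexpr unsigned family(SmTarget T) { return T.Version / 10; }

// Returns true if the feature set of X includes the feature set of Y.
constexpr bool isSupersetOf(SmTarget X, SmTarget Y) {
  switch (Y.Kind) {
  case Suffix::None:
    // Plain variants are linearly ordered, and f/a include their plain variant.
    return X.Version >= Y.Version;
  case Suffix::F:
    // f features carry forward only within the same family, and only to
    // targets that are themselves f or a variants.
    return X.Kind != Suffix::None && family(X) == family(Y) &&
           X.Version >= Y.Version;
  case Suffix::A:
    // a features are not comparable with anything but the same a target.
    return X.Kind == Suffix::A && X.Version == Y.Version;
  }
  return false;
}

// sm_103f includes sm_100f (same family), but sm_120f does not.
static_assert(isSupersetOf({103, Suffix::F}, {100, Suffix::F}), "");
static_assert(!isSupersetOf({120, Suffix::F}, {100, Suffix::F}), "");
```

Note that a single integer comparison cannot express this relation, which is the problem for the existing predicates.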

We need to think about how the numbering scheme will impact various predicates that generally assume that larger versions are supersets of smaller ones. `a` threw a wrench into this, but we kind of special-cased it by treating it as the plain variant and explicitly checking if it's an `a` variant when needed.

Now that `f` introduces a new kind of partial ordering, that's not going to be sufficient.

I do not have a solution at the moment, but we will probably want to make `f` numbering linear (so we can compare them), and convert `a` into a flag. E.g. 
- sm_100  = 10000 
- sm_100f = 10010 
- sm_100a = 10011

Predicates that care about the plain variant will operate on `N/100`. Those that care about `f` variants will use `N/10`, with the additional constraint that the family (`N/1000`) is the same. Whether it's an `a` variant could be checked by `N%10`.
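
A minimal C++ sketch of how such predicates might look under this numbering. All function names here are hypothetical, not existing NVPTX subtarget API. One assumption goes beyond the text above: the `f` predicate also checks that the tens digit is set on the target itself, since otherwise plain sm_103 (10300) would compare above sm_100f (10010) on `N/10` alone.

```cpp
// Hypothetical encoding: sm_100 = 10000, sm_100f = 10010, sm_100a = 10011.
// An a variant also sets the f digit, since a is a superset of f.
constexpr unsigned encodeSm(unsigned Sm, bool IsFamily, bool IsArch) {
  return Sm * 100 + ((IsFamily || IsArch) ? 10 : 0) + (IsArch ? 1 : 0);
}

constexpr unsigned plainVersion(unsigned N) { return N / 100; }  // 100, 103, ...
constexpr unsigned familyOf(unsigned N) { return N / 1000; }     // 10, 12, ...
constexpr bool isArchSpecific(unsigned N) { return N % 10 == 1; }
constexpr bool isFamilyOrArch(unsigned N) { return (N / 10) % 10 == 1; }

// Plain features: linear comparison on the un-suffixed SM version.
constexpr bool hasPlainFeature(unsigned N, unsigned MinSm) {
  return plainVersion(N) >= MinSm;
}

// Family features: linear comparison on N / 10, restricted to one family.
// Assumption: the target must itself be an f or a variant.
constexpr bool hasFamilyFeature(unsigned N, unsigned MinF) {
  return isFamilyOrArch(N) && familyOf(N) == familyOf(MinF) &&
         N / 10 >= MinF / 10;
}

static_assert(encodeSm(100, false, false) == 10000, "");
static_assert(encodeSm(100, true, false) == 10010, "");
static_assert(encodeSm(100, false, true) == 10011, "");
```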


https://github.com/llvm/llvm-project/pull/141899

