[llvm] [NVPTX] Add family-specific architectures support (PR #141899)

Rajat Bajpai via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 9 10:56:38 PDT 2025


================
@@ -127,15 +127,25 @@ class NVPTXSubtarget : public NVPTXGenSubtargetInfo {
   bool hasPTXASUnreachableBug() const { return PTXVersion < 83; }
   bool hasCvtaParam() const { return SmVersion >= 70 && PTXVersion >= 77; }
   unsigned int getFullSmVersion() const { return FullSmVersion; }
-  unsigned int getSmVersion() const { return getFullSmVersion() / 10; }
+  unsigned int getSmVersion() const { return getFullSmVersion() / 100; }
   // GPUs with "a" suffix have include architecture-accelerated features that
   // are supported on the specified architecture only, hence such targets do not
   // follow the onion layer model. hasArchAccelFeatures() allows
   // distinguishing such GPU variants from the base GPU architecture.
-  // - 0 represents base GPU model,
-  // - non-zero value identifies particular architecture-accelerated variant.
-  bool hasArchAccelFeatures() const { return getFullSmVersion() % 10; }
-
+  // - false represents non-accelerated architecture.
+  // - true represents architecture-accelerated variant.
+  bool hasArchAccelFeatures() const {
+    return getFullSmVersion() % 10 && PTXVersion >= 80;
+  }
+  // GPUs with 'f' suffix have architecture-accelerated features which are
+  // portable across all future architectures under same SM major. For example,
+  // sm_100f features will work for sm_10X*f*/sm_10X*a* future architectures.
+  // - false represents non-family-specific architecture.
+  // - true represents family-specific variant.
+  bool hasFamilySpecificFeatures() const {
+    return getFullSmVersion() % 100 == 10 ? PTXVersion >= 88
+                                          : hasArchAccelFeatures();
----------------
rajatbajpai wrote:

I believe the predicates check the PTX requirement for a particular intrinsic. The checks I added here, however, express that family-specific architectures are available from PTX 8.8 onwards and architecture-accelerated architectures from PTX 8.0 onwards. Let me know what you think.
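
To make the two version gates concrete, here is a minimal standalone sketch of the predicates from the hunk above. `SubtargetSketch` is a hypothetical stand-in for `NVPTXSubtarget`, and the encoding of `FullSmVersion` (SM * 100 plus a suffix of 0 for the base target, 10 for 'f', 11 for 'a') is an assumption inferred from the modulo checks, not something the hunk confirms.

```cpp
// Hypothetical stand-in for NVPTXSubtarget; only the fields the predicates use.
struct SubtargetSketch {
  // Assumed encoding: SM * 100 + suffix (0 = base, 10 = 'f', 11 = 'a').
  unsigned FullSmVersion;
  // PTX version times ten, e.g. 88 for PTX 8.8.
  unsigned PTXVersion;

  constexpr unsigned getSmVersion() const { return FullSmVersion / 100; }

  // Arch-accelerated variants need a non-zero last digit and PTX 8.0+.
  constexpr bool hasArchAccelFeatures() const {
    return FullSmVersion % 10 != 0 && PTXVersion >= 80;
  }

  // An 'f' target needs PTX 8.8+; any other target has family-specific
  // features only if it is an arch-accelerated variant.
  constexpr bool hasFamilySpecificFeatures() const {
    return FullSmVersion % 100 == 10 ? PTXVersion >= 88
                                     : hasArchAccelFeatures();
  }
};

// sm_100f with PTX 8.8 passes the family-specific gate; with PTX 8.6 it does not.
static_assert(SubtargetSketch{10010, 88}.hasFamilySpecificFeatures(), "");
static_assert(!SubtargetSketch{10010, 86}.hasFamilySpecificFeatures(), "");
// sm_100a with PTX 8.6 passes through the arch-accelerated gate (PTX 8.0+).
static_assert(SubtargetSketch{10011, 86}.hasFamilySpecificFeatures(), "");
```

Under this assumed encoding, the subtarget-level checks only say whether the target variant itself is usable with the selected PTX version; per-intrinsic PTX requirements would still be enforced separately by the predicates.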

https://github.com/llvm/llvm-project/pull/141899

