[clang] [llvm] [AArch64] Attempt to further split the arch default and implied exts. (PR #106304)

Ahmed Bougacha via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 27 16:10:51 PDT 2024


https://github.com/ahmedbougacha created https://github.com/llvm/llvm-project/pull/106304

In the tablegen definitions, we have now split the lists of extensions enabled by default for an arch version from the features that are required.

However, the frontend (in `AArch64TargetInfo::setFeatureEnabled`) still ends up bringing in all the default-enabled extensions anyway, so concretely we still end up with SVE2 for M4, now that it's declared as being v9.2a.

This takes the implied vs. default split to its conclusion, by emitting the two lists separately in tablegen, and really only enabling the really-required extensions.  The default-enabled extensions are still enabled by default when explicitly specifying an arch version (via -march/target(arch=)).
Of course, what's "really required" is messy, and in some cases even contradictory, so I'm not sure this is all that practical overall.

This isn't a PR I intend to merge (or at the very least definitely not as-is), but I'm curious to see what others think.  My conclusion is that we don't really want this after all, and since we don't expose the arch version directly (anymore) anyway, I'm thinking of downgrading the apple- definitions to v8a, explicitly listing the features, and discouraging or disallowing -march for darwin targets.

See individual commits:

    [AArch64] Remove not-always-required features from arch implied.

    These features aren't always required for the given base architecture
    version, and usually are only conditionally required when some other
    extension is present (e.g., FP/AdvSIMD).

    This breaks a *ton* of tests (and maybe user code too), because most
    places that just ask for an arch feature (without going through the
    now-default-ext-aware -march/arch= machinery) now miss out on a lot.

    We'd also have to add all the missing features to existing CPUs
    that now lose them.  I haven't done that here yet.

    Some of the features seem wrong in the Arm ARM, e.g., BF16 or I8MM
    being declared as mandatory as of e.g., v8.6a, even though they really
    need NEON, and NEON isn't mandatory.

    [AArch64] Further split the arch default and implied exts.

    We started doing this at the tablegen level, but the generated
    arch info only contained the list of default extensions.
    So in the frontend, we end up enabling all the default extensions
    for arch vN anyway, when we only asked for a CPU that happens to
    be declared as supporting vN.

    This tries to split that into distinct lists, using the implied
    subtarget features (and isolating only the AEK subset of those)
    most of the time, but also using the default extensions when
    explicitly asking for an arch (via -march or target(arch=)).

    [clang] Don't enable SVE2 for v9.0a in the frontend.

    We did this change in the backend, and we did it in the driver.
    But the frontend continued to do it, and still does it even after this
    change, because it picks up the DefaultExts (rather than the implied
    required features).

    This is a first step, at least.

    Ideally we wouldn't need the frontend to do this too, but for functions
    that have a target attribute, we can't rely on the driver, and we do
    want the target to be visible at the frontend level, so we can't rely on
    the backend either.  So we have to do this here as well.

>From 3d5059cadce915fbaaeccaeadb2755199ccb9ac6 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Tue, 27 Aug 2024 15:30:23 -0700
Subject: [PATCH 1/3] [clang] Don't enable SVE2 for v9.0a in the frontend.

We did this change in the backend, and we did it in the driver.
But the frontend continued to do it, and still does it even after this
change, because it picks up the DefaultExts (rather than the implied
required features).

This is a first step, at least.

Ideally we wouldn't need the frontend to do this too, but for functions
that have a target attribute, we can't rely on the driver, and we do
want the target to be visible at the frontend level, so we can't rely on
the backend either.  So we have to do this here as well.
---
 clang/lib/Basic/Targets/AArch64.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 63fc15f916c558..19ad28d512005e 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -108,8 +108,6 @@ void AArch64TargetInfo::setArchFeatures() {
       HasBFloat16 = true;
       HasMatMul = true;
     }
-    FPU |= SveMode;
-    HasSVE2 = true;
     HasFullFP16 = true;
     HasAlternativeNZCV = true;
     HasFRInt3264 = true;

>From 653b318c08d0e061e4a72250dba1658699ec4058 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Tue, 27 Aug 2024 15:33:43 -0700
Subject: [PATCH 2/3] [AArch64] Further split the arch default and implied
 exts.

We started doing this at the tablegen level, but the generated
arch info only contained the list of default extensions.
So in the frontend, we end up enabling all the default extensions
for arch vN anyway, when we only asked for a CPU that happens to
be declared as supporting vN.

This tries to split that into distinct lists, using the implied
subtarget features (and isolating only the AEK subset of those)
most of the time, but also using the default extensions when
explicitly asking for an arch (via -march or target(arch=)).
---
 clang/lib/Basic/Targets/AArch64.cpp            |  6 +++---
 clang/lib/Driver/ToolChains/Arch/AArch64.cpp   |  4 ++--
 .../llvm/TargetParser/AArch64TargetParser.h    | 18 +++++++++++++-----
 .../AArch64/AsmParser/AArch64AsmParser.cpp     |  2 +-
 llvm/lib/TargetParser/AArch64TargetParser.cpp  | 16 ++++++++++++----
 .../TargetParser/TargetParserTest.cpp          |  4 ++--
 llvm/utils/TableGen/ARMTargetDefEmitter.cpp    | 14 +++++++++++++-
 7 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 19ad28d512005e..e32e4a11aff75f 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -798,7 +798,7 @@ void AArch64TargetInfo::setFeatureEnabled(llvm::StringMap<bool> &Features,
 
   // Set any features implied by the architecture
   std::vector<StringRef> CPUFeats;
-  if (llvm::AArch64::getExtensionFeatures(ArchInfo->DefaultExts, CPUFeats)) {
+  if (llvm::AArch64::getExtensionFeatures(ArchInfo->ImpliedExts, CPUFeats)) {
     for (auto F : CPUFeats) {
       assert(F[0] == '+' && "Expected + in target feature!");
       Features[F.drop_front(1)] = true;
@@ -1180,7 +1180,7 @@ ParsedTargetAttr AArch64TargetInfo::parseTargetAttr(StringRef Features) const {
       // Ret.Features.
       if (!AI)
         continue;
-      FeatureBits.addArchDefaults(*AI);
+      FeatureBits.addArchFeatures(*AI);
       // Add any extra features, after the +
       SplitAndAddFeatures(Split.second, Ret.Features, FeatureBits);
     } else if (Feature.starts_with("cpu=")) {
@@ -1193,7 +1193,7 @@ ParsedTargetAttr AArch64TargetInfo::parseTargetAttr(StringRef Features) const {
             Feature.split("=").second.trim().split("+");
         Ret.CPU = Split.first;
         if (auto CpuInfo = llvm::AArch64::parseCpu(Ret.CPU)) {
-          FeatureBits.addCPUDefaults(*CpuInfo);
+          FeatureBits.addCPUFeatures(*CpuInfo);
           SplitAndAddFeatures(Split.second, Ret.Features, FeatureBits);
         }
       }
diff --git a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
index f083e40df13144..5e80f434baba27 100644
--- a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -103,7 +103,7 @@ static bool DecodeAArch64Mcpu(const Driver &D, StringRef Mcpu, StringRef &CPU,
   if (!CpuInfo)
     return false;
 
-  Extensions.addCPUDefaults(*CpuInfo);
+  Extensions.addCPUFeatures(*CpuInfo);
 
   if (Split.second.size() &&
       !DecodeAArch64Features(D, Split.second, Extensions))
@@ -126,7 +126,7 @@ getAArch64ArchFeaturesFromMarch(const Driver &D, StringRef March,
   if (!ArchInfo)
     return false;
 
-  Extensions.addArchDefaults(*ArchInfo);
+  Extensions.addArchDefaultFeatures(*ArchInfo);
 
   if ((Split.second.size() &&
        !DecodeAArch64Features(D, Split.second, Extensions)))
diff --git a/llvm/include/llvm/TargetParser/AArch64TargetParser.h b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
index 2169e9c94b61f3..adbc2ceb791858 100644
--- a/llvm/include/llvm/TargetParser/AArch64TargetParser.h
+++ b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
@@ -106,7 +106,9 @@ struct ArchInfo {
   StringRef Name;        // Name as supplied to -march e.g. "armv8.1-a"
   StringRef ArchFeature; // Name as supplied to -target-feature, e.g. "+v8a"
   AArch64::ExtensionBitset
-      DefaultExts; // bitfield of default extensions ArchExtKind
+      ImpliedExts; // bitfield of implied extensions ArchExtKind
+  AArch64::ExtensionBitset
+      DefaultExts; // bitfield of default-enabled extensions ArchExtKind
 
   bool operator==(const ArchInfo &Other) const {
     return this->Name == Other.Name;
@@ -194,13 +196,19 @@ struct ExtensionSet {
   // arcitecture versions.
   void disable(ArchExtKind E);
 
-  // Add default extensions for the given CPU. Records the base architecture,
+  // Add implied extensions for the given CPU. Records the base architecture,
   // to later resolve dependencies which depend on it.
-  void addCPUDefaults(const CpuInfo &CPU);
+  void addCPUFeatures(const CpuInfo &CPU);
 
-  // Add default extensions for the given architecture version. Records the
+  // Add implied extensions for the given architecture version. Records the
   // base architecture, to later resolve dependencies which depend on it.
-  void addArchDefaults(const ArchInfo &Arch);
+  void addArchFeatures(const ArchInfo &Arch);
+
+  // Add default-enabled extensions for the given architecture version.
+  // Similar to addArchFeatures, additionally enabling features that aren't
+  // implied (because they're required by the base architecture), but are
+  // nonetheless enabled by default when explicitly asking for the given arch.
+  void addArchDefaultFeatures(const ArchInfo &Arch);
 
   // Add or remove a feature based on a modifier string. The string must be of
   // the form "<name>" to enable a feature or "no<name>" to disable it. This
diff --git a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
index 37add682b150e7..1173d2406d003a 100644
--- a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+++ b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
@@ -6966,7 +6966,7 @@ bool AArch64AsmParser::parseDirectiveArch(SMLoc L) {
   // Get the architecture and extension features.
   std::vector<StringRef> AArch64Features;
   AArch64Features.push_back(ArchInfo->ArchFeature);
-  AArch64::getExtensionFeatures(ArchInfo->DefaultExts, AArch64Features);
+  AArch64::getExtensionFeatures(ArchInfo->ImpliedExts, AArch64Features);
 
   MCSubtargetInfo &STI = copySTI();
   std::vector<std::string> ArchFeatures(AArch64Features.begin(), AArch64Features.end());
diff --git a/llvm/lib/TargetParser/AArch64TargetParser.cpp b/llvm/lib/TargetParser/AArch64TargetParser.cpp
index 9fc7201efac6e2..261a59153bf41e 100644
--- a/llvm/lib/TargetParser/AArch64TargetParser.cpp
+++ b/llvm/lib/TargetParser/AArch64TargetParser.cpp
@@ -267,8 +267,8 @@ void AArch64::ExtensionSet::disable(ArchExtKind E) {
       disable(Dep.Later);
 }
 
-void AArch64::ExtensionSet::addCPUDefaults(const CpuInfo &CPU) {
-  LLVM_DEBUG(llvm::dbgs() << "addCPUDefaults(" << CPU.Name << ")\n");
+void AArch64::ExtensionSet::addCPUFeatures(const CpuInfo &CPU) {
+  LLVM_DEBUG(llvm::dbgs() << "addCPUFeatures(" << CPU.Name << ")\n");
   BaseArch = &CPU.Arch;
 
   AArch64::ExtensionBitset CPUExtensions = CPU.getImpliedExtensions();
@@ -277,10 +277,18 @@ void AArch64::ExtensionSet::addCPUDefaults(const CpuInfo &CPU) {
       enable(E.ID);
 }
 
-void AArch64::ExtensionSet::addArchDefaults(const ArchInfo &Arch) {
-  LLVM_DEBUG(llvm::dbgs() << "addArchDefaults(" << Arch.Name << ")\n");
+void AArch64::ExtensionSet::addArchFeatures(const ArchInfo &Arch) {
+  LLVM_DEBUG(llvm::dbgs() << "addArchFeatures(" << Arch.Name << ")\n");
   BaseArch = &Arch;
 
+  for (const auto &E : Extensions)
+    if (Arch.ImpliedExts.test(E.ID))
+      enable(E.ID);
+}
+
+void AArch64::ExtensionSet::addArchDefaultFeatures(const ArchInfo &Arch) {
+  LLVM_DEBUG(llvm::dbgs() << "addArchDefaultFeatures(" << Arch.Name << ")\n");
+
   for (const auto &E : Extensions)
     if (Arch.DefaultExts.test(E.ID))
       enable(E.ID);
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 7d999b826252a2..5ffe90f68e3c3d 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -1608,7 +1608,7 @@ TEST_P(AArch64ExtensionDependenciesBaseArchTestFixture,
   auto Params = GetParam();
 
   llvm::AArch64::ExtensionSet Extensions;
-  Extensions.addArchDefaults(Params.Arch);
+  Extensions.addArchFeatures(Params.Arch);
   for (auto M : Params.Modifiers) {
     bool success = Extensions.parseModifier(M);
     EXPECT_TRUE(success);
@@ -1642,7 +1642,7 @@ TEST_P(AArch64ExtensionDependenciesBaseCPUTestFixture,
   const std::optional<llvm::AArch64::CpuInfo> CPU =
       llvm::AArch64::parseCpu(Params.CPUName);
   EXPECT_TRUE(CPU);
-  Extensions.addCPUDefaults(*CPU);
+  Extensions.addCPUFeatures(*CPU);
   for (auto M : Params.Modifiers) {
     bool success = Extensions.parseModifier(M);
     EXPECT_TRUE(success);
diff --git a/llvm/utils/TableGen/ARMTargetDefEmitter.cpp b/llvm/utils/TableGen/ARMTargetDefEmitter.cpp
index 71ca331461c00c..d69d3f60ea0ec6 100644
--- a/llvm/utils/TableGen/ARMTargetDefEmitter.cpp
+++ b/llvm/utils/TableGen/ARMTargetDefEmitter.cpp
@@ -222,7 +222,19 @@ static void EmitARMTargetDef(RecordKeeper &RK, raw_ostream &OS) {
     const auto TargetFeatureName = Rec->getValueAsString("Name");
     OS << "  \"+" << TargetFeatureName << "\",\n";
 
-    // Construct the list of default extensions
+    // Construct the list of implied extensions, out of the list of implied
+    // Subtarget Features.
+    std::set<Record *> ImpliedFeats;
+    for (auto *F : Rec->getValueAsListOfDefs("Implies"))
+      CollectImpliedFeatures(ImpliedFeats, F);
+
+    OS << "  (AArch64::ExtensionBitset({";
+    for (auto *F : ImpliedFeats)
+      if (auto AEK = F->getValueAsOptionalString("ArchExtKindSpelling"))
+        OS << "AArch64::" << AEK->upper() << ", ";
+    OS << "})),\n";
+
+    // Construct the list of default-enabled extensions
     OS << "  (AArch64::ExtensionBitset({";
     for (auto *E : Rec->getValueAsListOfDefs("DefaultExts")) {
       OS << "AArch64::" << E->getValueAsString("ArchExtKindSpelling").upper()

>From 21f2eb30135980ce68f86fd94c0d415fe29af6f5 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Tue, 27 Aug 2024 15:36:51 -0700
Subject: [PATCH 3/3] [AArch64] Remove not-always-required features from arch
 implied.

These features aren't always required for the given base architecture
version, and usually are only conditionally required when some other
extension is present (e.g., FP/AdvSIMD).

This breaks a *ton* of tests (and maybe user code too), because most
places that just ask for an arch feature (without going through the
now-default-ext-aware -march/arch= machinery) now miss out on a lot.

We'd also have to add all the missing features to existing CPUs
that now lose them.  I haven't done that here yet.

Some of the features seem wrong in the Arm ARM, e.g., BF16 or I8MM
being declared as mandatory as of e.g., v8.6a, even though they really
need NEON, and NEON isn't mandatory.
---
 llvm/lib/Target/AArch64/AArch64Features.td | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index 69d6b02fefffe9..92d553f09bb7a8 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -772,29 +772,28 @@ def HasV8_0aOps : Architecture64<8, 0, "a", "v8a",
   [FeatureEL2VMSA, FeatureEL3],
   [FeatureFPARMv8, FeatureNEON]>;
 def HasV8_1aOps : Architecture64<8, 1, "a", "v8.1a",
-  [HasV8_0aOps, FeatureCRC, FeatureLSE, FeatureRDM, FeaturePAN, FeatureLOR,
-   FeatureVH],
+  [HasV8_0aOps, FeatureCRC, FeatureLSE, FeaturePAN, FeatureLOR, FeatureVH],
   !listconcat(HasV8_0aOps.DefaultExts, [FeatureCRC, FeatureLSE, FeatureRDM])>;
 def HasV8_2aOps : Architecture64<8, 2, "a", "v8.2a",
   [HasV8_1aOps, FeaturePsUAO, FeaturePAN_RWV, FeatureRAS, FeatureCCPP],
   !listconcat(HasV8_1aOps.DefaultExts, [FeatureRAS])>;
 def HasV8_3aOps : Architecture64<8, 3, "a", "v8.3a",
-  [HasV8_2aOps, FeatureRCPC, FeaturePAuth, FeatureJS, FeatureComplxNum],
+  [HasV8_2aOps, FeatureRCPC, FeaturePAuth],
   !listconcat(HasV8_2aOps.DefaultExts, [FeatureComplxNum, FeatureJS,
     FeaturePAuth, FeatureRCPC,  FeatureCCIDX])>;
 def HasV8_4aOps : Architecture64<8, 4, "a", "v8.4a",
-  [HasV8_3aOps, FeatureDotProd, FeatureNV, FeatureMPAM, FeatureDIT,
-    FeatureTRACEV8_4, FeatureAM, FeatureSEL2, FeatureTLB_RMI, FeatureFlagM,
-    FeatureRCPC_IMMO, FeatureLSE2],
+  [HasV8_3aOps, FeatureNV, FeatureMPAM, FeatureDIT, FeatureTRACEV8_4,
+    FeatureAM, FeatureSEL2, FeatureTLB_RMI, FeatureFlagM, FeatureRCPC_IMMO,
+    FeatureLSE2],
   !listconcat(HasV8_3aOps.DefaultExts, [FeatureDotProd, FeatureDIT, FeatureFlagM])>;
 def HasV8_5aOps : Architecture64<8, 5, "a", "v8.5a",
-  [HasV8_4aOps, FeatureAltFPCmp, FeatureFRInt3264, FeatureSpecRestrict,
+  [HasV8_4aOps, FeatureAltFPCmp, FeatureSpecRestrict,
     FeatureSB, FeaturePredRes, FeatureCacheDeepPersist,
     FeatureBranchTargetId],
-  !listconcat(HasV8_4aOps.DefaultExts, [FeaturePredRes, FeatureSSBS, FeatureBranchTargetId, FeatureSB])>;
+  !listconcat(HasV8_4aOps.DefaultExts, [FeatureFRInt3264, FeaturePredRes, FeatureSSBS, FeatureBranchTargetId, FeatureSB])>;
 def HasV8_6aOps : Architecture64<8, 6, "a", "v8.6a",
-  [HasV8_5aOps, FeatureAMVS, FeatureBF16, FeatureFineGrainedTraps,
-    FeatureEnhancedCounterVirtualization, FeatureMatMulInt8],
+  [HasV8_5aOps, FeatureAMVS, FeatureFineGrainedTraps,
+    FeatureEnhancedCounterVirtualization],
   !listconcat(HasV8_5aOps.DefaultExts, [FeatureBF16, FeatureMatMulInt8])>;
 def HasV8_7aOps : Architecture64<8, 7, "a", "v8.7a",
   [HasV8_6aOps, FeatureXS, FeatureWFxT, FeatureHCX],



More information about the cfe-commits mailing list