[clang] [RISCV][NO-MERGE] Discussions on passing tuning features from the Clang driver (PR #162716)
via cfe-commits
cfe-commits at lists.llvm.org
Thu Oct 9 11:50:13 PDT 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-risc-v
Author: Min-Yih Hsu (mshockwave)
<details>
<summary>Changes</summary>
Following up the discussions in the [RISC-V LLVM Syncup Meeting](https://docs.google.com/document/d/1G3ocHm2zE6AYTS2N3_3w2UxFnSEyKkcF57siLWe-NVs/edit?tab=t.0#heading=h.1jq17mylab94) on Oct 9, I'm summarizing in this PR the related backgrounds as well as concerns folks have raised during the meeting regarding the future direction of passing tuning features from the Clang driver. Feel free to use this PR as a discussion thread on this topic.
### More flexible RISC-V scheduling models
The original problem that motivated this whole body of work was that for a long time, if we wanted to share a single scheduling model among multiple CPUs that only differ in a small number of tuning features (e.g. whether it has fast vrgather), the only way was to create a base TableGen class that is subsequently parameterized and instantiated into a scheduling model for _each_ combination of the said tuning features. This TableGen-only solution doesn't scale in a sense that the number of scheduling models will explode upon adding more tuning features in the future.
I tried to tackle this problem by allowing scheduling model to be configured by subtargetfearue, via a new SchedPredicate called `FeatureSchedPredicate` (#<!-- -->161888 ). Such that we can create multiple CPU / tune CPU with only a _single_ scheduling model:
```
def Foo : RISCVTuneProcessorModel<"foo", FooModel>;
def FooWithFastVRGather : RISCVTuneProcessorModel<"foo-fast-vrgather", FooModel, [TuneFastVRGather]>;
```
The problem is, now CPU / tune CPU -- which maps to `-mcpu` and `-mtune` in the Clang driver, respectively -- become the one that doesn't scale, as we have to create a new `-mcpu` / `-mtune` for each combination of features & scheduling model.
### Ways of passing tuning features at this moment
The problem can boil down into finding a proper way to pass tuning features from the Clang driver (we're only discussing driver here as you can always pass feature to the frontend via `-target-feature`). As a starter, it's important to know what we're doing at this moment, which can be categorized into three ways:
- Retrieve tuning features from a CPU name. We currently encoded two unaligned-memory-related subtarget features directly into the `RISCVCPUInfo` data structure generated by RISC-V's TargetParser. This data structure is queried by driver which generates the corresponding `-target-feature` upon seeing a matching `-mcpu`.
- Explicitly translate some of the driver flags. Flags that fall into category include `-ffixed-<register name>` and `-mrelax`.
- Similar to the previous one, driver flag declarations that are included into the `m_riscv_Features_Group` will be automatically translated into the corresponding feature names based on their driver flag names. Currently only `-msave-restore` / `-mnosave-restore` use this.
It is also worth mentioning that RVI is considering adding _[Software Optimization Guidance Options](https://riscv.atlassian.net/wiki/spaces/TOXX/pages/585400366/Software+Optimization+Guidance+Options+Fast+Track+Approval+Request)_ -- a concept similar to a RISC-V extension (whose name starts with "O") but only provides uArch/implementation hints to the compiler rather than representing any architecture level contracts. Despite being closely related to what we're discussing here, these options/extensions will go into the _march_ string.
### Candidate solutions
We have roughly two candidate solutions on the table now:
#### New syntax for `-mtune`
Introducing an alternative syntax to `-mtune` in addition to the current one: `-mtune=<tune cpu name>:+x,-y,+z...` where `+x,-y,+z` is the feature string that will be converted into individual `-target-feature` flags. Code in this PR contains a prototype of this approach.
Pros:
- Scale better when there is a large number of tuning features in the future (no matter how many of them it'll always be a single flag)
Cons:
- Without any special handling, subtracting features with LLVM's current feature string design can be tricky (e.g. `-foo,+foo` will still give you `foo` but not when you write `+foo,-foo`)
- Alternative approach: we can forbid feature subtraction in the new `-mtune` string.
- It's more difficult to handle or reason about implied features or just interactions between different features. ARM/AArch64 ran into a similar problems before with a similar mechanism of using "+feature" in their march/mcpu string.
#### Toggle tuning features with the existing `-mfoo` mechanism
Feature `foo` will be controlled by `-mfoo` & `-mno-foo`.
Pros:
- Adding & subtracting features would be less of a problem compared to the previous `-mtune` approach, because the mechanism of `-mfoo` & `-mno-foo` is well defined.
- Hide the implementation at the driver level, including populating the implied features or even changing the actual subtarget feature names without breaking the user interface (i.e. driver).
Cons:
- Basically the opposite of the advantage from the previous `-mtune` approach: the number of `-m` flags will increase really fast (we need at least two flags per subtarget features) upon adding more tuning features in the future.
There is another challenge we have to address in both options: GCC compatibility. We have to talk with the GCC folks on this topic as we almost certain want a consistent interface and behavior across these two compilers.
I hope I captured everything we discussed in the meeting. Feel free to point out things that I missed.
---
Full diff: https://github.com/llvm/llvm-project/pull/162716.diff
4 Files Affected:
- (modified) clang/lib/Driver/ToolChains/Arch/RISCV.cpp (+28)
- (modified) clang/lib/Driver/ToolChains/Arch/RISCV.h (+3)
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+1-4)
- (modified) clang/test/Driver/riscv-cpus.c (+5)
``````````diff
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
index f2e79e71f93d4..f51e3ea6bfe73 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
@@ -367,3 +367,31 @@ std::string riscv::getRISCVTargetCPU(const llvm::opt::ArgList &Args,
return Triple.isRISCV64() ? "generic-rv64" : "generic-rv32";
}
+
+void riscv::addMtuneWithFeatures(const Arg *A, const ArgList &Args,
+ ArgStringList &CmdArgs) {
+ // format: -mtune=<tune cpu>[:+a,-b,+c...]
+ StringRef MTune(A->getValue());
+ size_t ColonPos = MTune.find(':');
+ bool HasColon = ColonPos != StringRef::npos;
+
+ StringRef TuneCPU = MTune.take_front(ColonPos);
+ if (TuneCPU == "native")
+ CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
+ else
+ CmdArgs.push_back(HasColon ? Args.MakeArgString(TuneCPU) : A->getValue());
+
+ StringRef FeatureStr = HasColon ? MTune.drop_front(ColonPos + 1) : "";
+ // TODO: Emit error
+ assert(!HasColon || !FeatureStr.empty());
+ if (FeatureStr.empty())
+ return;
+ StringRef Feature;
+ do {
+ std::tie(Feature, FeatureStr) = FeatureStr.split(',');
+ // TODO: Emit error
+ assert(Feature.starts_with('-') || Feature.starts_with('+'));
+ CmdArgs.push_back("-target-feature");
+ CmdArgs.push_back(Args.MakeArgString(Feature));
+ } while (!FeatureStr.empty());
+}
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.h b/clang/lib/Driver/ToolChains/Arch/RISCV.h
index 388786b9c4c1f..3e43430051cca 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.h
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.h
@@ -28,6 +28,9 @@ std::string getRISCVArch(const llvm::opt::ArgList &Args,
const llvm::Triple &Triple);
std::string getRISCVTargetCPU(const llvm::opt::ArgList &Args,
const llvm::Triple &Triple);
+void addMtuneWithFeatures(const llvm::opt::Arg *A,
+ const llvm::opt::ArgList &Args,
+ llvm::opt::ArgStringList &CmdArgs);
} // end namespace riscv
} // namespace tools
} // end namespace driver
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index d326a81feb762..e39a8d4996089 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2046,10 +2046,7 @@ void Clang::AddRISCVTargetArgs(const ArgList &Args,
if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
CmdArgs.push_back("-tune-cpu");
- if (strcmp(A->getValue(), "native") == 0)
- CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
- else
- CmdArgs.push_back(A->getValue());
+ riscv::addMtuneWithFeatures(A, Args, CmdArgs);
}
// Handle -mrvv-vector-bits=<bits>
diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index 5d5fdd72baedb..806c42bdad374 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -305,6 +305,11 @@
// RUN: %clang --target=riscv64 -### -c %s 2>&1 -mtune=native | FileCheck -check-prefix=MTUNE-NATIVE %s
// MTUNE-NATIVE-NOT: "-tune-cpu" "native"
+// Checking `-mtune=<tune cpu>:+x,-y,+z...`
+// RUN: %clang --target=riscv64 -### -c %s 2>&1 -mtune=sifive-e20:+xyz,-abc | FileCheck -check-prefix=MTUNE-WITH-FEATURES %s
+// MTUNE-WITH-FEATURES: "-tune-cpu" "sifive-e20"
+// MTUNE-WITH-FEATURES: "-target-feature" "+xyz" "-target-feature" "-abc"
+
// -mcpu with default -march
// RUN: %clang --target=riscv64 -### -c %s 2>&1 -mcpu=sifive-e20 | FileCheck -check-prefix=MCPU-SIFIVE-E20 %s
// MCPU-SIFIVE-E20: "-nostdsysteminc" "-target-cpu" "sifive-e20"
``````````
</details>
https://github.com/llvm/llvm-project/pull/162716
More information about the cfe-commits
mailing list