[llvm] [RISCV] 'Zalrsc' may permit non-base instructions (PR #165042)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 27 17:31:02 PDT 2025
================
@@ -1906,6 +1906,25 @@ def FeatureForcedAtomics : SubtargetFeature<
def HasAtomicLdSt
: Predicate<"Subtarget->hasStdExtZalrsc() || Subtarget->hasForcedAtomics()">;
+// The RISC-V Unprivileged Architecture - ISA Volume 1 (Version: 20250508)
+// [https://docs.riscv.org/reference/isa/_attachments/riscv-unprivileged.pdf]
+// in section 13.3. Eventual Success of Store-Conditional Instructions, defines
+// _constrained_ LR/SC loops:
+// The dynamic code executed between the LR and SC instructions can only
+// contain instructions from the base ''I'' instruction set, excluding loads,
+// stores, backward jumps, taken backward branches, JALR, FENCE, and SYSTEM
+// instructions. Compressed forms of the aforementioned ''I'' instructions in
+// the Zca and Zcb extensions are also permitted.
+// LR/SC loops that do not adhere to the above are _unconstrained_ LR/SC loops,
+// and success is implementation specific. For implementations which know that
+// non-base instructions (such as the ''B'' extension) will not violate any
+// forward progress guarantees, using these instructions to reduce the LR/SC
+// sequence length is desirable.
+def FeaturePermissiveZalrsc
+ : SubtargetFeature<
+ "permissive-zalrsc", "HasPermissiveZalrsc", "true",
+ "Implementation permits non-base instructions between LR/SC pairs">;
----------------
slachowsky wrote:
Certainly a reasonable ask.
This feature is written from the point of view of a minimal RISC-V core with LR/SC and a global monitor that is external to the core. In such a configuration, the global monitor sees only the load/store transactions to the memory system and is completely unaware of what instructions or control flow occurred on the CPU(s) (or non-CPU devices) to generate those transactions. Any instruction mix is permissible in this style of system (ignoring higher-order concerns of guaranteed forward progress / eventual success), as long as the same memory transactions are presented to the monitor.
It is necessary to have some `FeaturePermissiveZalrsc` control to enable 'unconstrained' LR/SC loops, and the proposal here is that there are _no constraints_ on what is permissible. The idea is to admit shorter sequences by checking which secondary extension features are already available:
```
if (STI->hasPermissiveZalrsc() && STI->hasVendorExtABC())
  // build short vendor ABC instruction sequence
else if (STI->hasPermissiveZalrsc() && STI->hasStdExtXYZ())
  // build short standard XYZ instruction sequence
else
  // build original constrained sequence with only 'I' instructions
```
This avoids an explosion in features to cover the cross-product of permitted Zalrsc x {XYZ, ABC, etc.}. If a core has no constraints on what is permitted, and it also has an instruction extension that gives a shorter sequence, go ahead and use it.
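As a rough, self-contained sketch of that selection logic (not code from this patch): `HypotheticalSubtarget`, `LoopBody`, and `pickLoopBody` are names invented for illustration, and `vendorExtABC` / `stdExtXYZ` stand in for the placeholder extensions above. The point is only that a single permissive flag gates the choice, and the existing extension flags pick the sequence.

```cpp
// Hypothetical stand-in for the subtarget queries; not an LLVM type.
struct HypotheticalSubtarget {
  bool permissiveZalrsc; // core allows any instructions between LR and SC
  bool vendorExtABC;     // placeholder vendor extension "ABC"
  bool stdExtXYZ;        // placeholder standard extension "XYZ"
};

// Which instruction vocabulary the LR/SC loop body may be built from.
enum class LoopBody { VendorABC, StdXYZ, BaseI };

constexpr LoopBody pickLoopBody(const HypotheticalSubtarget &S) {
  // Without the permissive flag, only the constrained base-'I' form is used.
  if (!S.permissiveZalrsc)
    return LoopBody::BaseI;
  // With it, any already-available extension may shorten the sequence.
  if (S.vendorExtABC)
    return LoopBody::VendorABC;
  if (S.stdExtXYZ)
    return LoopBody::StdXYZ;
  return LoopBody::BaseI;
}
```

No per-extension "permissive" feature is needed: adding another extension would mean one more check here, not one more subtarget feature.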
Realistically though, there is a tiny vocabulary of `atomicrmw <ops>`, and the existing pseudo expansions for these are very tightly coded, so there is very limited opportunity for improvement here. Other than the Zbb MIN/MAX instructions in this patch, the only other kind of instruction extension I can think of that has utility is some sort of bit-field insertion / bit-select instruction that could shorten the `xor` + `and` + `xor` sequence used in the masked atomics.
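For reference, the `xor` + `and` + `xor` masked merge can be written out in plain C++ as below. This is an illustrative sketch, not the backend's expansion code; the function names are made up here, and the operand order is assumed to be the usual "xor old with new, mask, xor back into old" form.

```cpp
#include <cstdint>

// Replace the bits of OldVal selected by Mask with the corresponding bits of
// NewVal, using three ALU operations: xor, and, xor.
constexpr uint32_t maskedMerge(uint32_t OldVal, uint32_t NewVal, uint32_t Mask) {
  return ((OldVal ^ NewVal) & Mask) ^ OldVal;
}

// The same result in the more obvious and-not / and / or form, which is what
// a single bit-select style instruction would compute.
constexpr uint32_t maskedMergeRef(uint32_t OldVal, uint32_t NewVal, uint32_t Mask) {
  return (OldVal & ~Mask) | (NewVal & Mask);
}

// Example: update one byte lane of a 32-bit word.
static_assert(maskedMerge(0xAABBCCDDu, 0x00001100u, 0x0000FF00u) == 0xAABB11DDu,
              "only the masked byte is replaced");
```

A bit-select or bit-field insert instruction would collapse these three operations into one, which is the only saving available in the masked sequences.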
https://github.com/llvm/llvm-project/pull/165042