[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

Wed Oct 16 16:50:47 PDT 2024

================
@@ -1205,10 +1298,19 @@ class LOAD32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, lis
 class LOADi32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
     : LOAD32<SizeOp, ModOp, OpcodeStr, [(set i32:$dst, (OpNode ADDRri:$addr))]>;
 
+class LOAD_ACQUIREi32<BPFWidthModifer SizeOp, string OpcodeStr>
+    : LOAD_ACQUIRE<SizeOp, OpcodeStr, GPR32>;
+
 let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
   def LDW32 : LOADi32<BPF_W, BPF_MEM, "u32", load>;
   def LDH32 : LOADi32<BPF_H, BPF_MEM, "u16", zextloadi16>;
   def LDB32 : LOADi32<BPF_B, BPF_MEM, "u8", zextloadi8>;
+
+  let Predicates = [BPFHasLoadAcquire] in {
----------------
peilin-ye wrote:

> Is there a reason to define these (and stores) behind `BPFHasALU32` flag?

Not really.  Looks like there is no way to turn off `HasALU32` for v4.  I just wanted to make it clear in the `.td` file that we are using half-registers for 8-, 16- and 32-bit insns.

Also I wanted to define them behind `DecoderNamespace = "BPFALU32"` because currently `BPFDisassembler::getInstruction()` uses `DecoderTableBPFALU3264` for all non-64-bit `BPF_STX | BPF_ATOMIC` insns:

```cpp
  if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&
      getInstSize(Insn) != BPF_DW &&
      (InstMode == BPF_MEM || InstMode == BPF_ATOMIC) &&
      STI.hasFeature(BPF::ALU32))
    Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,
                               this, STI);
  else
    Result = decodeInstruction(DecoderTableBPF64, Instr, Insn, Address, this,
                               STI);
```
So if I move them out of `HasALU32` and keep `BPFDisassembler.cpp` as-is, `llvm-objdump` fails to decode the store-release and gives me:
```
0000000000000000 <bar>:
;     __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
       0:	cb 21 00 00 b0 00 00 00	<unknown>
; }
       1:	95 00 00 00 00 00 00 00	exit
```
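To make the table selection concrete, here is a minimal standalone sketch of the condition quoted above, applied to the opcode byte only. It is not the LLVM code: `pickTable`, `DecoderTable` and the `inst*` helpers are hypothetical names for illustration, and the real disassembler extracts these fields from the full 64-bit instruction word. The field layout (3-bit mode, 2-bit size, 3-bit class) is the standard BPF load/store opcode encoding.

```cpp
#include <cassert>
#include <cstdint>

// Field values of the BPF load/store opcode byte (already shifted down).
// Layout: mode (bits 7-5) | size (bits 4-3) | class (bits 2-0).
enum : uint8_t { BPF_LDX = 0x1, BPF_STX = 0x3 };                        // class
enum : uint8_t { BPF_W = 0x0, BPF_H = 0x1, BPF_B = 0x2, BPF_DW = 0x3 }; // size
enum : uint8_t { BPF_MEM = 0x3, BPF_ATOMIC = 0x6 };                     // mode

// Hypothetical stand-in for the two generated decoder tables.
enum class DecoderTable { BPF64, BPFALU3264 };

constexpr uint8_t instClass(uint8_t Opcode) {
  return static_cast<uint8_t>(Opcode & 0x7);
}
constexpr uint8_t instSize(uint8_t Opcode) {
  return static_cast<uint8_t>((Opcode >> 3) & 0x3);
}
constexpr uint8_t instMode(uint8_t Opcode) {
  return static_cast<uint8_t>((Opcode >> 5) & 0x7);
}

// Mirrors the quoted condition: non-64-bit LDX/STX with MEM or ATOMIC mode
// goes to the ALU32 table, but only when the ALU32 feature is enabled.
constexpr DecoderTable pickTable(uint8_t Opcode, bool HasALU32) {
  return ((instClass(Opcode) == BPF_LDX || instClass(Opcode) == BPF_STX) &&
          instSize(Opcode) != BPF_DW &&
          (instMode(Opcode) == BPF_MEM || instMode(Opcode) == BPF_ATOMIC) &&
          HasALU32)
             ? DecoderTable::BPFALU3264
             : DecoderTable::BPF64;
}

// 0xcb = STX | H | ATOMIC: the opcode byte from the objdump output above.
static_assert(pickTable(0xcb, true) == DecoderTable::BPFALU3264,
              "non-64-bit atomic STX is looked up in the ALU32 table");
// 0xdb = STX | DW | ATOMIC: the 64-bit form stays in the default table.
static_assert(pickTable(0xdb, true) == DecoderTable::BPF64,
              "64-bit atomic STX is looked up in the BPF64 table");
```

Under this sketch, `0xcb` is only ever looked up in the ALU32 table, so an instruction definition placed in the default namespace would not be found there, which is consistent with the `<unknown>` shown above.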
- - -
> From instruction encoding pov we use STX class, which has no 32/64 sub-division.

What I'm doing is similar to `XXOR*`: right now we only have `XXORD` and `XXORW32` (both `BPF_STX`), and `XXORW32` is behind `Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32"`.  There is no "ALU64 version" of `XXORW32`.

https://github.com/llvm/llvm-project/pull/108636