[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v5 (PR #108636)

Tue Oct 1 15:04:33 PDT 2024

yonghong-song wrote:

@peilin-ye Sorry for late review. A few general observations below:

1. I think -mcpu=v5 can be avoided. We really want to accumulate quite some insns before increasing cpu version. Otherwise, we may quickly reach v7/v8/... which people will lose check which version supports what. We might need to use feature flags later (e.g. for x86, -msse, -mavx, etc.). So far, I suggest to stick to cpu v4.

2. I tried below example:
```
$ cat t4.c
short foo(short *ptr) {
  return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}
$ clang --target=bpf -g -c -O2 t4.c -mcpu=v5
$ llvm-objdump -d t4.o

t4.o:   file format elf64-bpf

Disassembly of section .text:

0000000000000000 <foo>:
       0:       e9 10 00 00 00 00 00 00 w0 = load_acquire((u16 *)(r1 + 0x0))
       1:       95 00 00 00 00 00 00 00 exit
```
>From the source code, the operation should return a 'short' type value. But since the result will be 32bit value, should we do sign-extension here? w0 = load_acquire((s16 *)(r1 + 0x0))?

3. Currently you have
    def BPF_ATOMIC : BPFModeModifer<0x6>;
  +def BPF_MEMACQ : BPFModeModifer<0x7>;
  +def BPF_MEMREL : BPFModeModifer<0x7>;

  Since load_acquire and store_release are essentially atomic operations (e.g. __atomic_load_n() and __atomic_store_n()), is it possible to add acquire/release support in BPF_ATOMIC? We might want to reserve BPFModelModifier 7 for future use.

https://github.com/llvm/llvm-project/pull/108636