[llvm] BPF: Generate locked insn for __sync_fetch_and_add() with cpu v1/v2 (PR #106494)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 29 17:01:56 PDT 2024
================
@@ -171,81 +168,14 @@ bool BPFMIPreEmitChecking::processAtomicInsts() {
}
}
}
-
----------------
4ast wrote:
> if no return value, the atomic_fetch_and() will become a locked-and insn.
There is no "locked and" insn prior to v3.
atomic_fetch_and() should always generate the atomic_fetch_and insn.
atomic_fetch_ADD() is the only exception, because we have had the xadd insn from the beginning.
> Should we just ensure all atomic_fetch_*() insns (except atomic_fetch_add()) only support cpu=v3?
I don't think we need to add such a restriction.
If code has atomic_fetch_XX(), it should stay an atomic_fetch_XX insn regardless of v1, v2, or v3.
The only exception is XX==ADD, because we had this special xadd insn early on and had
this buggy atomic_fetch_add -> xadd code generation that we have to preserve for backward-compat reasons.
Even with -mcpu=v1 it's fine to translate sync_fetch_and_xor() to the atomic_fetch_xor insn.
We do that for the may_goto insn too. -mcpu=vX makes a difference when the compiler has a choice:
for v1 it can generate one sequence of insns, while for v3 another.
For may_goto and for sync_fetch_and_xor there is no choice: either generate that insn or error out.
It's better to generate working code than to error. -mcpu is not a binding contract.
https://github.com/llvm/llvm-project/pull/106494