[PATCH] D149486: [RISCV] Strengthen atomic ordering for sequentially consistent stores

Mon May 1 16:09:23 PDT 2023

paulkirth added a comment.

In D149486#4310956 <https://reviews.llvm.org/D149486#4310956>, @jrtc27 wrote:

> With my psABI hat on I'm still very concerned about proposing breaking the ABI in future. Getting memory ordering wrong leads to a world of pain with subtle bugs that can only be reproduced on some systems some of the time. The current mapping is the de-facto ABI that should have been written down more formally, and there needs to be an extremely strong motivator to break it, not just "a small amount of code will get slightly faster". Note that I'm not claiming that's what's going on, but reading the mailing list threads doesn't immediately give a clear picture, and so those that care about this need to come forward with a clearly laid out explanation for why the current state is insufficient and must be changed to the proposed state, bearing in mind the pain that comes from breaking ABI and that there will be a many year interim period where performance is *worse* than both of the before and after states.

I think there are two key things:

1. from https://github.com/riscv/riscv-isa-manual/pull/992/files

  Assuming suitable mappings for other atomic operations, setting the `aq` bit on the LR instruction, 
  and setting the `rl` bit on the SC instruction makes the LR/SC sequence sequentially consistent in the 
  C++ `memory_order_seq_cst` sense. Such a sequence does not act as a fence for ordering ordinary 
  load and store instructions before and after the sequence.

2. from https://github.com/riscv/riscv-isa-manual/pull/992/files#diff-b199c03e79944ce7dfb81f721f27d40252590e4daaacc8d9a414a6c4d881f737R1228

  Even more importantly, a Table A.6 sequentially consistent store, followed by a Table A.7 sequentially consistent load 
  can be reordered unless the Table A.6 mapping of stores is strengthened by either adding a second fence or mapping 
  the store to `amoswap.rl` instead.

What these tell me is that the existing A.6 mappings are actually insufficient to implement the semantics they say they do, and users will experience some surprising results when when switching to A.7(or even if they don't).  I'm not an expert in memory models, so I'll hapily defer to people w/ more expertise than I have on the subtleties here, but from the linked discussions it sounds to me that what we currently have isn't quite right.

Another point I've heard brought up a lot in these discussions, is that right now there isn't a huge amount of atomics code for RISC-V, but that will likely rapidly change. So, we need to decide if we would rather go through the pain now and impact a much smaller group of people, or defer the change and impacting significantly more code/users. Given that view, I think waiting longer to fix this will only make the problem worse. We have an opportunity to allow //most// code moving forward to avoid ABI breaks in the future. This is unfortunate, but we should fix these things as we are able, IMO.

Lastly, I think it's worth considering that there is a discrepancy between what GCC has emitted (and would emit after the patches linked above) and what we're doing now, so there are already some issues with the RISC-V atomics ABI. That's based on the discussions in the GCC patch and the tech-unprivileged thread.

However, I think a lot of this discussion is better suited to the psABI and related communities than on this patch. Perhaps we should recap the discusison so far on https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/378?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149486/new/

https://reviews.llvm.org/D149486