[PATCH] D57820: [AArch64] Use CAS loops for all atomic operations when available.

Mon Feb 18 04:33:48 PST 2019

john.brawn requested changes to this revision.
john.brawn added a comment.
This revision now requires changes to proceed.

In D57820#1399586 <https://reviews.llvm.org/D57820#1399586>, @john.brawn wrote:

> It's a bit hard to know whether to accept this or not, given that you says it's "probably better to avoid LDXR". Do you have any data to say if it is any better or not?

To answer my own question: I quickly threw together this test program:

  #include <iostream>
  #include <atomic>
  #include <thread>
  #include <chrono>

  std::atomic<__int128> var;

  #define COUNT 100000

  int thread_fn(int n) {
    for (int i = 0; i < COUNT; ++i) {
      var.store(n*COUNT + i,  std::memory_order_relaxed);
    }
    return 0;
  }

  int main() {
    std::thread *threads[32];
    std::chrono::high_resolution_clock::time_point start, end;
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 32; ++i)
      threads[i] = new std::thread(thread_fn, i);
    for (int i = 0; i < 32; ++i)
      threads[i]->join();
    end =  std::chrono::high_resolution_clock::now();
    std::cout << (end-start).count() << "\n";
    return 0;
  }

and running it on a thunderx2 (compiled -O2 -mcpu=native, taking average of 20 runs) I get:

| without path | 1760231879 |
| with patch   | 3411562006 |
|

so it looks like this patch actually makes things worse.

In D57820#1399588 <https://reviews.llvm.org/D57820#1399588>, @jfb wrote:

> > Incidentally the handling of volatile atomic operations here is probably wrong as I don't think we can have any kind of repeated loading, but that's nothing to do with this change (both LDXR/STXR loop and CAS loop have this problem) and doesn't need to be fixed by it.
>
> `volatile atomic` is generally a bit of a minefield. If we don't have an instruction that does a straight load then there's not much we can do...

According to http://llvm.org/docs/Atomics.html#unordered "Note that code generation will fail for unsupported atomic operations; if you need such an operation, use explicit locking.", so I guess we should give an error.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57820/new/

https://reviews.llvm.org/D57820