[llvm-bugs] [Bug 50139] New: atomic load turned into non-atomic

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Apr 27 03:18:39 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50139

            Bug ID: 50139
           Summary: atomic load turned into non-atomic
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: release blocker
          Priority: P
         Component: LLVM Codegen
          Assignee: unassignedclangbugs at nondot.org
          Reporter: dvyukov at google.com
                CC: llvm-bugs at lists.llvm.org, neeilans at live.com,
                    richard-llvm at metafoo.co.uk

Originally reported by Manuel Pöter (@mpoeter) here:
https://github.com/google/sanitizers/issues/1398

If I am reading this correctly, this issues leads at least a bad miscompilation
in release build on arm64, to performance pessimization on amd64 and to false
positives from ThreadSanitizer, which initially brought it to light. The arm64
codegen issue looks especially problematic, it can lead to very hard to debug
episodic memory corruptions. I don't rule out possibility of bad codegen on x86
as well for slightly different code.
I confirmed it on the latest HEAD 3feb84a36f5128dd0f6e9c65314609a9ce4372bc, but
the issue looks old, I think I can repro it back to clang-7 (don't have older
versions).
I tentatively set release-blocker because of the arm64 implications. Feel free
to downgrade.

Here is the code:

struct Foo {
  unsigned *ptr = nullptr;
  bool cond = true;
  unsigned a1 = 0;
  unsigned a2 = 0;
  unsigned foo();
};

unsigned Foo::foo() {
  unsigned oldest_snapshot;
  if (!ptr) {
    oldest_snapshot = cond
                          ? __atomic_load_n(&a1, __ATOMIC_ACQUIRE)
                          : __atomic_load_n(&a2, __ATOMIC_ACQUIRE);
  } else {
    oldest_snapshot = *ptr;
  }
  return oldest_snapshot;
}

x86 performance issue:

$ clang++ test.cc -O1 -g -c && objdump -d test.o

0000000000000000 <_ZN3Foo3fooEv>:
   0:   48 8b 07                mov    (%rdi),%rax
   3:   48 85 c0                test   %rax,%rax
   6:   74 03                   je     b <_ZN3Foo3fooEv+0xb>
   8:   8b 00                   mov    (%rax),%eax
   a:   c3                      retq   
   b:   80 7f 08 00             cmpb   $0x0,0x8(%rdi)
   f:   74 09                   je     1a <_ZN3Foo3fooEv+0x1a>
  11:   8b 47 0c                mov    0xc(%rdi),%eax
  14:   48 83 c7 0c             add    $0xc,%rdi
  18:   eb 07                   jmp    21 <_ZN3Foo3fooEv+0x21>
  1a:   8b 47 10                mov    0x10(%rdi),%eax
  1d:   48 83 c7 10             add    $0x10,%rdi
  21:   48 89 f8                mov    %rdi,%rax
  24:   8b 00                   mov    (%rax),%eax
  26:   c3                      retq   


1a-21 is the atomic load, which result is discarded
then 24 is a duplicate non-atomic load
instructions 1a-21 are pointless

arm64 issue (I only confirmed with slightly older compiler version):

0000000000000000 <_ZN3Foo3fooEv>:
   0:   f9400008        ldr     x8, [x0]
   4:   b4000068        cbz     x8, 10 <_ZN3Foo3fooEv+0x10>
   8:   b9400100        ldr     w0, [x8]
   c:   d65f03c0        ret
  10:   39402008        ldrb    w8, [x0, #8]
  14:   34000068        cbz     w8, 20 <_ZN3Foo3fooEv+0x20>
  18:   91003008        add     x8, x0, #0xc
  1c:   14000002        b       24 <_ZN3Foo3fooEv+0x24>
  20:   91004008        add     x8, x0, #0x10
  24:   88dffd1f        ldar    wzr, [x8]
  28:   b9400100        ldr     w0, [x8]
  2c:   d65f03c0        ret

The same load duplication, but you can see that the first one is
atomic-acquire, but the second one misses the acquire part. This can lead to
arbitrary memory corruptions.

Finally, false positive with -fsanitize=thread -O1:

00000000004d5010 <_ZN3Foo3fooEv>:
  4d5010:       41 56                   push   %r14
  4d5012:       53                      push   %rbx
  4d5013:       50                      push   %rax
  4d5014:       49 89 fe                mov    %rdi,%r14
  4d5017:       48 8b 7c 24 18          mov    0x18(%rsp),%rdi
  4d501c:       e8 bf 24 ff ff          callq  4c74e0 <__tsan_func_entry>
  4d5021:       4c 89 f7                mov    %r14,%rdi
  4d5024:       e8 f7 b4 fe ff          callq  4c0520 <__tsan_read8>
  4d5029:       49 8b 1e                mov    (%r14),%rbx
  4d502c:       48 85 db                test   %rbx,%rbx
  4d502f:       75 3c                   jne    4d506d <_ZN3Foo3fooEv+0x5d>
  4d5031:       49 8d 7e 08             lea    0x8(%r14),%rdi
  4d5035:       e8 96 9b fe ff          callq  4bebd0 <__tsan_read1>
  4d503a:       41 80 7e 08 00          cmpb   $0x0,0x8(%r14)
  4d503f:       74 17                   je     4d5058 <_ZN3Foo3fooEv+0x48>
  4d5041:       49 8d 5e 0c             lea    0xc(%r14),%rbx
  4d5045:       48 89 df                mov    %rbx,%rdi
  4d5048:       be 02 00 00 00          mov    $0x2,%esi
  4d504d:       e8 3e 5f fd ff          callq  4aaf90 <__tsan_atomic32_load>
  4d5052:       41 8b 46 0c             mov    0xc(%r14),%eax
  4d5056:       eb 15                   jmp    4d506d <_ZN3Foo3fooEv+0x5d>
  4d5058:       49 8d 5e 10             lea    0x10(%r14),%rbx
  4d505c:       48 89 df                mov    %rbx,%rdi
  4d505f:       be 02 00 00 00          mov    $0x2,%esi
  4d5064:       e8 27 5f fd ff          callq  4aaf90 <__tsan_atomic32_load>
  4d5069:       41 8b 46 10             mov    0x10(%r14),%eax
  4d506d:       48 89 df                mov    %rbx,%rdi
  4d5070:       e8 2b ac fe ff          callq  4bfca0 <__tsan_read4>
  4d5075:       8b 1b                   mov    (%rbx),%ebx
  4d5077:       e8 04 25 ff ff          callq  4c7580 <__tsan_func_exit>
  4d507c:       89 d8                   mov    %ebx,%eax
  4d507e:       48 83 c4 08             add    $0x8,%rsp
  4d5082:       5b                      pop    %rbx
  4d5083:       41 5e                   pop    %r14
  4d5085:       c3                      retq

Again, atomic load with discarded result:

  4d5058:       49 8d 5e 10             lea    0x10(%r14),%rbx
  4d505c:       48 89 df                mov    %rbx,%rdi
  4d505f:       be 02 00 00 00          mov    $0x2,%esi
  4d5064:       e8 27 5f fd ff          callq  4aaf90 <__tsan_atomic32_load>

Followed by non-atomic duplicate load:

  4d5069:       41 8b 46 10             mov    0x10(%r14),%eax
  4d506d:       48 89 df                mov    %rbx,%rdi
  4d5070:       e8 2b ac fe ff          callq  4bfca0 <__tsan_read4>
  4d5075:       8b 1b                   mov    (%rbx),%ebx


On this code snippet it seems to happen only with -O1, but Manuel reports it
with -O3 as well:
https://github.com/google/sanitizers/issues/1398#issuecomment-826996680

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20210427/3b70ad42/attachment-0001.html>


More information about the llvm-bugs mailing list