[llvm-bugs] [Bug 50139] New: atomic load turned into non-atomic
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Apr 27 03:18:39 PDT 2021
https://bugs.llvm.org/show_bug.cgi?id=50139
Bug ID: 50139
Summary: atomic load turned into non-atomic
Product: clang
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: release blocker
Priority: P
Component: LLVM Codegen
Assignee: unassignedclangbugs at nondot.org
Reporter: dvyukov at google.com
CC: llvm-bugs at lists.llvm.org, neeilans at live.com,
richard-llvm at metafoo.co.uk
Originally reported by Manuel Pöter (@mpoeter) here:
https://github.com/google/sanitizers/issues/1398
If I am reading this correctly, this issues leads at least a bad miscompilation
in release build on arm64, to performance pessimization on amd64 and to false
positives from ThreadSanitizer, which initially brought it to light. The arm64
codegen issue looks especially problematic, it can lead to very hard to debug
episodic memory corruptions. I don't rule out possibility of bad codegen on x86
as well for slightly different code.
I confirmed it on the latest HEAD 3feb84a36f5128dd0f6e9c65314609a9ce4372bc, but
the issue looks old, I think I can repro it back to clang-7 (don't have older
versions).
I tentatively set release-blocker because of the arm64 implications. Feel free
to downgrade.
Here is the code:
struct Foo {
unsigned *ptr = nullptr;
bool cond = true;
unsigned a1 = 0;
unsigned a2 = 0;
unsigned foo();
};
unsigned Foo::foo() {
unsigned oldest_snapshot;
if (!ptr) {
oldest_snapshot = cond
? __atomic_load_n(&a1, __ATOMIC_ACQUIRE)
: __atomic_load_n(&a2, __ATOMIC_ACQUIRE);
} else {
oldest_snapshot = *ptr;
}
return oldest_snapshot;
}
x86 performance issue:
$ clang++ test.cc -O1 -g -c && objdump -d test.o
0000000000000000 <_ZN3Foo3fooEv>:
0: 48 8b 07 mov (%rdi),%rax
3: 48 85 c0 test %rax,%rax
6: 74 03 je b <_ZN3Foo3fooEv+0xb>
8: 8b 00 mov (%rax),%eax
a: c3 retq
b: 80 7f 08 00 cmpb $0x0,0x8(%rdi)
f: 74 09 je 1a <_ZN3Foo3fooEv+0x1a>
11: 8b 47 0c mov 0xc(%rdi),%eax
14: 48 83 c7 0c add $0xc,%rdi
18: eb 07 jmp 21 <_ZN3Foo3fooEv+0x21>
1a: 8b 47 10 mov 0x10(%rdi),%eax
1d: 48 83 c7 10 add $0x10,%rdi
21: 48 89 f8 mov %rdi,%rax
24: 8b 00 mov (%rax),%eax
26: c3 retq
1a-21 is the atomic load, which result is discarded
then 24 is a duplicate non-atomic load
instructions 1a-21 are pointless
arm64 issue (I only confirmed with slightly older compiler version):
0000000000000000 <_ZN3Foo3fooEv>:
0: f9400008 ldr x8, [x0]
4: b4000068 cbz x8, 10 <_ZN3Foo3fooEv+0x10>
8: b9400100 ldr w0, [x8]
c: d65f03c0 ret
10: 39402008 ldrb w8, [x0, #8]
14: 34000068 cbz w8, 20 <_ZN3Foo3fooEv+0x20>
18: 91003008 add x8, x0, #0xc
1c: 14000002 b 24 <_ZN3Foo3fooEv+0x24>
20: 91004008 add x8, x0, #0x10
24: 88dffd1f ldar wzr, [x8]
28: b9400100 ldr w0, [x8]
2c: d65f03c0 ret
The same load duplication, but you can see that the first one is
atomic-acquire, but the second one misses the acquire part. This can lead to
arbitrary memory corruptions.
Finally, false positive with -fsanitize=thread -O1:
00000000004d5010 <_ZN3Foo3fooEv>:
4d5010: 41 56 push %r14
4d5012: 53 push %rbx
4d5013: 50 push %rax
4d5014: 49 89 fe mov %rdi,%r14
4d5017: 48 8b 7c 24 18 mov 0x18(%rsp),%rdi
4d501c: e8 bf 24 ff ff callq 4c74e0 <__tsan_func_entry>
4d5021: 4c 89 f7 mov %r14,%rdi
4d5024: e8 f7 b4 fe ff callq 4c0520 <__tsan_read8>
4d5029: 49 8b 1e mov (%r14),%rbx
4d502c: 48 85 db test %rbx,%rbx
4d502f: 75 3c jne 4d506d <_ZN3Foo3fooEv+0x5d>
4d5031: 49 8d 7e 08 lea 0x8(%r14),%rdi
4d5035: e8 96 9b fe ff callq 4bebd0 <__tsan_read1>
4d503a: 41 80 7e 08 00 cmpb $0x0,0x8(%r14)
4d503f: 74 17 je 4d5058 <_ZN3Foo3fooEv+0x48>
4d5041: 49 8d 5e 0c lea 0xc(%r14),%rbx
4d5045: 48 89 df mov %rbx,%rdi
4d5048: be 02 00 00 00 mov $0x2,%esi
4d504d: e8 3e 5f fd ff callq 4aaf90 <__tsan_atomic32_load>
4d5052: 41 8b 46 0c mov 0xc(%r14),%eax
4d5056: eb 15 jmp 4d506d <_ZN3Foo3fooEv+0x5d>
4d5058: 49 8d 5e 10 lea 0x10(%r14),%rbx
4d505c: 48 89 df mov %rbx,%rdi
4d505f: be 02 00 00 00 mov $0x2,%esi
4d5064: e8 27 5f fd ff callq 4aaf90 <__tsan_atomic32_load>
4d5069: 41 8b 46 10 mov 0x10(%r14),%eax
4d506d: 48 89 df mov %rbx,%rdi
4d5070: e8 2b ac fe ff callq 4bfca0 <__tsan_read4>
4d5075: 8b 1b mov (%rbx),%ebx
4d5077: e8 04 25 ff ff callq 4c7580 <__tsan_func_exit>
4d507c: 89 d8 mov %ebx,%eax
4d507e: 48 83 c4 08 add $0x8,%rsp
4d5082: 5b pop %rbx
4d5083: 41 5e pop %r14
4d5085: c3 retq
Again, atomic load with discarded result:
4d5058: 49 8d 5e 10 lea 0x10(%r14),%rbx
4d505c: 48 89 df mov %rbx,%rdi
4d505f: be 02 00 00 00 mov $0x2,%esi
4d5064: e8 27 5f fd ff callq 4aaf90 <__tsan_atomic32_load>
Followed by non-atomic duplicate load:
4d5069: 41 8b 46 10 mov 0x10(%r14),%eax
4d506d: 48 89 df mov %rbx,%rdi
4d5070: e8 2b ac fe ff callq 4bfca0 <__tsan_read4>
4d5075: 8b 1b mov (%rbx),%ebx
On this code snippet it seems to happen only with -O1, but Manuel reports it
with -O3 as well:
https://github.com/google/sanitizers/issues/1398#issuecomment-826996680
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20210427/3b70ad42/attachment-0001.html>
More information about the llvm-bugs
mailing list