[PATCH] D56238: [TSan] Support Objective-C @synchronized with tagged pointers

Fri Jan 4 11:44:12 PST 2019

yln marked 2 inline comments as done.
yln added inline comments.

================
Comment at: lib/tsan/rtl/tsan_interceptors_mac.cc:302
+// have an associated memory allocation. The Obj-C runtime uses tagged pointers
+// to transparently optimize small objects.
+static bool IsTaggedObjCPointer(id obj) {
----------------
dvyukov wrote:
> I still wonder what does synchronized itself do to lock them?
> Since the optimization is transparent, it suggest that these things still have reference identity rather than values identity. But I fail to see how we respect this reference identity. Consider, we have two different objects that are small and converted to a tagger pointer with the same value (say, integer 1). Now we will use the same address for these 2 objects because they have the same value, so we think they are the same. Since we merge them we can get false deadlock reports and all kinds of bad stuff. But then I am confused how synchronized distinguish them.
Ah, I think I now understand your question. Sorry for not communicating more clearly.

The Obj-C runtime maintains a mapping: <pointer value -> lock>. So two pointers with the same value (pointer  bits) will map to the same lock, regardless whether or not they are backed by an allocation. So our implementation already replicates the behavior of the Obj-C runtime.

We considered reporting a blanket warning any time @synchronized is used with a tagged pointer, since it might have surprising behavior from the programmer's point of view. However, Devin made a convincing point that tagged pointers are only one special case of a larger issue here.
In general, it is bad practice to lock on objects that you don't own/allocate/control the lifetime of. Think of a cache for small numbers, or interned string literals. If you use @synchronized with, e.g., two numbers of the same value, with the expectation that they are different objects, then it doesn't matter if they are tagged pointers or two pointers pointing to the same object. In both cases your assumption is broken. So this would need a more general warning to make sure programmers explicitly manage locks.

Summary:
@synchronized locks on pointer values. We already replicate this behavior (I hope ;)

================
Comment at: lib/tsan/rtl/tsan_interceptors_mac.cc:319
+  if (h.created()) {
+    *h = (uptr) InternalAlloc(/*size=*/1);
+  }
----------------
dvyukov wrote:
> It's safer/better to use a user allocation. Internal allocations may not be covered by shadow in future (not sure it has not in all configurations). And mutexes have limited functionality in non-app mem.
With `user_alloc` TSan reports a race in the `objc-synchronize-tagged.mm` test. I do not completely understand what the cause is: the address/allocation is supposed to be racy by nature since it represents a mutex, which itself is used to establish synchronization. Do you have any insights here?
Plain old `malloc` seems to work. 

Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56238/new/

https://reviews.llvm.org/D56238