[llvm-commits] ThreadSanitizer, first patch. Please review.

Tue Feb 7 12:40:44 PST 2012

On Mon, Feb 6, 2012 at 7:35 PM, Nick Lewycky <nlewycky at google.com> wrote:

> On 6 February 2012 19:10, Kostya Serebryany <kcc at google.com> wrote:
>
>> Thanks for the review!
>> Please take another look.
>>
>
> Wrong patch attached?
>
>
>>
>> On Mon, Feb 6, 2012 at 2:35 PM, Nick Lewycky <nlewycky at google.com> wrote:
>>
>>> On 30 January 2012 09:45, Kostya Serebryany <kcc at google.com> wrote:
>>>
>>>> Any feedback?
>>>
>>>
>>> --- test/Instrumentation/ThreadSanitizer/tsan_basic.ll (revision 0)
>>> +++ test/Instrumentation/ThreadSanitizer/tsan_basic.ll (revision 0)
>>> @@ -0,0 +1,17 @@
>>> +; RUN: opt < %s -tsan -S | FileCheck %s
>>> +
>>> +target datalayout =
>>> "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
>>> +target triple = "x86_64-unknown-linux-gnu"
>>> +
>>> +define i32 @read_4_bytes(i32* %a) {
>>> +entry:
>>> +  %tmp1 = load i32* %a, align 4
>>> +  ret i32 %tmp1
>>> +}
>>> +; CHECK: @read_4_bytes
>>> +; CHECK-NOT: ret i32
>>> +; CHECK: __tsan_func_entry
>>> +; CHECK: __tsan_read4
>>> +; CHECK: __tsan_func_exit
>>> +; CHECK: ret i32
>>> +; CHECK: __tsan_init
>>>
>>> I'm not a fan of this pile of CHECK statements. You could actually write
>>> out the IR that you expect to get out and preserve the order using
>>> CHECK-NEXT. Even if you don't want the matches to be too specific, it would
>>> still be clearer to see "CHECK-NEXT: call {{.*}} @__tsan_func_entry".
>>>
>> done
>>
>>
>>>
>>> Index: lib/Transforms/Instrumentation/ThreadSanitizer.cpp
>>> ===================================================================
>>> --- lib/Transforms/Instrumentation/ThreadSanitizer.cpp (revision 0)
>>> +++ lib/Transforms/Instrumentation/ThreadSanitizer.cpp (revision 0)
>>> @@ -0,0 +1,149 @@
>>> +//===-- ThreadSanitizer.cpp - race detector ---------------------*- C++
>>> -*-===//
>>>
>>> Hey hey, this ruler is 81 characters long. :)
>>>
>>
>> Hm? My editor says 80 (81 counting the '+' in the patch).
>>
>
> Sorry, false alarm. (Not sure how that happened; my editor said "82", but
> I wasn't using my usual editor at the time.)
>
>
>>
>>
>>
>>>
>>> +bool ThreadSanitizer::runOnModule(Module &M) {
>>> +  bool Res = false;
>>> +  CurrentModule = &M;
>>> +  TD = getAnalysisIfAvailable<TargetData>();
>>> +  if (!TD)
>>> +    return false;
>>> +
>>> +  for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F) {
>>> +    if (F->isDeclaration()) continue;
>>> +    Res |= handleFunction(M, *F);
>>> +  }
>>> +  // We instrumented at least one function. Insert a call to
>>> __tsan_init().
>>> +  if (Res) {
>>> +    IRBuilder<> IRB(M.getContext());
>>> +    Value *TsanInit = M.getOrInsertFunction("__tsan_init",
>>> +                                            IRB.getVoidTy(), NULL);
>>> +    appendToGlobalCtors(M, cast<Function>(TsanInit), 0);
>>> +  }
>>> +  return Res;
>>> +}
>>>
>>> You could write this as a FunctionPass; create the __tsan_... methods in
>>> doInitialization() (and store the necessary pointers in your object so that
>>> you needn't look them up each time you instrument a function), then do the
>>> instrumentation per-function, then in doFinalization either add the call to
>>> __tsan_init or clean up declarations you created in doInitialization().
>>>
>>
>> Rewrote to use FunctionPass.
>> However I have questions:
>>    - Suppose I want to count the number of transformed instructions
>> in runOnFunction and store the result in the ThreadSanitizer object. Do I
>> need to use locking or atomics (OMG)?
>>
>
> Yes, you would.
>
>
>>    - You suggest to call M->getOrInsertFunction in doInitialization. Bu
>> that will require 5x2+2=12 class members that will decrease the readability.
>>
>
> Feel free to have an array or whatnot to collapse them :)
>
>
>>     Also, there is no guarantee that all these objects will ever be used,
>> so this is a kind of pessimization.
>>     Besides, if getOrInsertFunction is slow, isn't it better to make it
>> faster, than to manually optimize calls?
>>     We can do a lazy init for these objects, but then again what is the
>> preferred way to do lazy init in the FunctionPass to avoid races?
>>
>
> The trouble is that a FunctionPass isn't supposed to be creating or
> deleting Function (or other Globals). This is part of the threading model.
> See http://llvm.org/docs/WritingAnLLVMPass.html#FunctionPass .
>
> If you find these restrictions too hard to live with you can go back to
> using a ModulePass. in reality, we have plenty FunctionPasses which do
> manipulate global state against the rules, but I'm trying to make TSan
> perfect.
>

"make TSan perfect" sounds good :)
PTAL.

http://codereview.appspot.com/5545054/

--kcc

>
>
>>
>>> +      else if (isa<CallInst>(BI))
>>> +        HasCalls = true;
>>>
>>>
>>
>>
>>> What about invokes?
>>>
>>>
>> Yea, sure. Done.
>>
>>
>>> +  // Instrument memory accesses.
>>> +  for (size_t i = 0, n = LoadsAndStores.size(); i < n; ++i) {
>>> +    Res |= instrumentLoadOrStore(LoadsAndStores[i]);
>>> +  }
>>>
>>> Why not just call instrumentLoadOrStore as you encounter them? It seems
>>> that the vector is just overhead.
>>>
>>
>> *in future* we will need to analyze this array and remove some of its
>> elements.
>> Now it just looks a bit cleaner to me.
>>
>>
>> --kcc
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120207/cf2196df/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: issue5545054_17001.diff
Type: text/x-patch
Size: 9967 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120207/cf2196df/attachment.bin>