[LLVMdev] Proposed Enhancement to AddressSanitizer: Initialization Order

Kostya Serebryany kcc at google.com
Tue Jun 26 03:43:16 PDT 2012


+llvmdev, -llvm-dev

On Tue, Jun 26, 2012 at 2:28 PM, Kostya Serebryany <kcc at google.com> wrote:

> Hi Reid,
>
> On Tue, Jun 26, 2012 at 4:30 AM, Reid Watson <reidw at google.com> wrote:
>
>> Hello,
>>
>> I'm starting work on a project to detect initialization order problems
>> in C++ files using AddressSanitizer.
>> The extension in question will hopefully result in AddressSanitizer
>> being able to detect initializers which read an undefined value from a
>> static or global variable defined in another TU.
>> I'm currently working on this as a patch to AddressSanitizer, but I'm
>> open to suggestions as to what the proper way to implement this
>> extension would be.
>>
>> One of the simplest examples is the program below. Its output is
>> unspecified because it depends on the order in which the TUs'
>> initializers run, and this is fairly easy to observe.
>>
>> When compiled as:
>> $ clang++ file_1.cpp file_2.cpp main.cpp
>> $ ./a.out
>> x: 2
>> y: 1
>>
>> However, when compiled as:
>> $ clang++ file_2.cpp file_1.cpp main.cpp
>> $ ./a.out
>> x: 1
>> y: 2
>>
>> //file_1.cpp
>> extern int y;
>> int x = y + 1;
>>
>> //file_2.cpp
>> extern int x;
>> int y = x + 1;
>>
>> //main.cpp
>> #include <iostream>
>> extern int x,y;
>>
>> int main(){
>>   std::cout << "x: " << x << std::endl;
>>   std::cout << "y: " << y << std::endl;
>> }
>>
>> Here's a sketch of the detection algorithm:
>> For each TU:
>>     1. Before each TU's initializers run, conditionally poison the
>>        shadow memory of the global variables
>>         - Each global variable is poisoned unless it was defined in
>>           that TU
>>         - Additional information is added to struct __asan_global to
>>           identify which TU a global was defined in
>>
>
> This could be tricky.
> First, we don't want to poison the linker-initialized globals because they
> are always initialized regardless of the TU order.
>
> Second, suppose we have 3 TUs, t1, t2, and t3, each with a global (g1, g2,
> and g3) that has an initializer.
> When we are running the initializers in t2, we need to poison g1 and g3,
> but so far we have seen only g1.
> I don't know of any good and portable way to get g3.
>
> One solution is to run the binary twice: once with the default order of TU
> initializers, and a second time with the reversed order (not sure if that's
> easy).
>
> We probably don't want to do all that when initializing globals in a
> dlopen-ed library, or in any situation where we have multiple threads.
>
>
>
>>     2. Instrument all reads and writes in global initializers
>>
>
> This has been fixed today (thanks Nick!)
>
>
>>     3. After each TU's initializers run, we unpoison the shadow memory
>> for all global variables
>>
>
> Once we know what globals we need to poison, un-poisoning them is trivial.
>
>
>>
>> Note that once main has started running, AddressSanitizer will run
>> normally.  This will result in AddressSanitizer catching all
>> reads/writes to global variables defined in other TUs.
>> We run all of AddressSanitizer after initialization because we cannot
>> know prior to the completion of initialization which functions will be
>> called from initializers.
>>
>> I'd welcome any feedback on this proposal!
>>
>
> Sounds cool, make it happen!
>
> --kcc
>
>
>>
>> All the best,
>> Reid
>>
>
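To make the three steps of the sketch concrete, here is a minimal user-space simulation of the poison / check / unpoison cycle. It is only an illustration under stated assumptions: GlobalInfo, InitOrderChecker and replay_example are hypothetical names, the real struct __asan_global has different fields, and the model assumes every global is already registered before any TU's initializers run, which is exactly the part that is hard to do portably (the g3 problem above).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-global record; NOT the real
// struct __asan_global layout.
struct GlobalInfo {
    std::string name;
    int defining_tu;     // assumed extra field: the TU that defines the global
    bool dynamic_init;   // false models a linker-initialized global
    bool poisoned;
};

struct InitOrderChecker {
    std::vector<GlobalInfo> globals;
    std::vector<std::string> reports;

    // Step 1: poison every dynamically initialized global that is
    // defined outside the TU whose initializers are about to run.
    void before_tu_init(int tu) {
        for (GlobalInfo& g : globals)
            g.poisoned = g.dynamic_init && g.defining_tu != tu;
    }

    // Step 2: an instrumented read or write checks the poison state.
    void on_access(int from_tu, const std::string& name) {
        bool found = false;
        for (const GlobalInfo& g : globals) {
            if (g.name != name) continue;
            found = true;
            if (g.poisoned)
                reports.push_back("TU " + std::to_string(from_tu) +
                                  " accessed " + name +
                                  " during initialization");
        }
        assert(found && "access to an unregistered global");
        (void)found;  // silence unused-variable warnings under NDEBUG
    }

    // Step 3: unpoison everything once the TU's initializers are done.
    void after_tu_init() {
        for (GlobalInfo& g : globals) g.poisoned = false;
    }
};

// Replays the file_1.cpp / file_2.cpp example, assuming TU 1 (file_1)
// happens to be initialized first.
inline std::vector<std::string> replay_example() {
    InitOrderChecker c;
    c.globals = {{"x", 1, true, false}, {"y", 2, true, false}};

    c.before_tu_init(1);
    c.on_access(1, "y");  // int x = y + 1; reads y ...
    c.on_access(1, "x");  // ... and writes x
    c.after_tu_init();

    c.before_tu_init(2);
    c.on_access(2, "x");  // int y = x + 1; reads x ...
    c.on_access(2, "y");  // ... and writes y
    c.after_tu_init();
    return c.reports;
}
```

Calling replay_example() from main yields one report per cross-TU access. In this model the scheme as sketched also reports TU 2's read of x even though x has already been initialized by then, because every global defined outside the running TU is poisoned.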
>
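The "run it twice" idea can be illustrated without touching the linker. The sketch below models each TU's dynamic initializer as a function over a small state, runs the initializers in both orders, and compares the results. State, run_initializers and order_dependent are hypothetical names for illustration only; a real implementation would have to reorder the actual initializers, which this does not attempt.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// x and y are zero-initialized before any dynamic initializer runs,
// as they are for globals in the real program.
struct State {
    int x = 0;
    int y = 0;
};

using Initializer = std::function<void(State&)>;

// Runs the given per-TU initializers in the given order.
inline State run_initializers(const std::vector<Initializer>& order) {
    assert(!order.empty());
    State s;
    for (const Initializer& init : order) init(s);
    return s;
}

// Returns true if the final values differ between the two orders.
inline bool order_dependent() {
    Initializer file_1 = [](State& s) { s.x = s.y + 1; };  // int x = y + 1;
    Initializer file_2 = [](State& s) { s.y = s.x + 1; };  // int y = x + 1;

    State forward = run_initializers({file_1, file_2});
    State reversed = run_initializers({file_2, file_1});
    return forward.x != reversed.x || forward.y != reversed.y;
}
```

Running file_1's initializer first gives x = 1, y = 2, and running file_2's first gives x = 2, y = 1, which are the two outputs shown in the example. Which command line produces which pair depends on the order the toolchain actually runs the initializers in.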