[LLVMdev] Proposed Enhancement to AddressSanitizer: Initialization Order

Kostya Serebryany kcc at google.com
Tue Jun 26 04:15:22 PDT 2012


On Tue, Jun 26, 2012 at 2:43 PM, Kostya Serebryany <kcc at google.com> wrote:

> +llvmdev, -llvm-dev
>
>
> On Tue, Jun 26, 2012 at 2:28 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>> Hi Reid,
>>
>> On Tue, Jun 26, 2012 at 4:30 AM, Reid Watson <reidw at google.com> wrote:
>>
>>> Hello,
>>>
>>> I'm starting work on a project to detect initialization order problems
>>> in C++ files using AddressSanitizer.
>>> The extension in question will hopefully result in AddressSanitizer
>>> being able to detect initializers which read an undefined value from a
>>> static or global variable defined in another TU.
>>> I'm currently working on this as a patch to AddressSanitizer, but I'm
>>> open to suggestions as to what the proper way to implement this
>>> extension would be.
>>>
>>> One of the simplest examples is the program below. Its output depends
>>> on the order in which the TUs are initialized, which is unspecified,
>>> and the effect is fairly easy to observe.
>>>
>>> When compiled as:
>>> $ clang++ file_1.cpp file_2.cpp main.cpp
>>> $ ./a.out
>>> x: 2
>>> y: 1
>>>
>>> However, when compiled as:
>>> $ clang++ file_2.cpp file_1.cpp main.cpp
>>> $ ./a.out
>>> x: 1
>>> y: 2
>>>
>>> //file_1.cpp
>>> extern int y;
>>> int x = y + 1;
>>>
>>> //file_2.cpp
>>> extern int x;
>>> int y = x + 1;
>>>
>>> //main.cpp
>>> #include <iostream>
>>> extern int x,y;
>>>
>>> int main(){
>>>   std::cout << "x: " << x << std::endl;
>>>   std::cout << "y: " << y << std::endl;
>>> }
>>>
>>> Here's a sketch of the detection algorithm:
>>> For each TU:
>>>     1. Before each TU's initializers run, conditionally poison the
>>> global variable shadow memory
>>>         -Each global variable is poisoned, unless it was defined in that
>>> TU
>>>         -Additional information is added to struct __asan_global to
>>> identify which TU a global was defined in
>>>
>>
>> This could be tricky.
>> First, we don't want to poison the linker-initialized globals because
>> they are always initialized regardless of the TU order.
>>
>> Second, suppose we have 3 TUs, t1, t2, and t3, each with a global (g1, g2
>> and g3) that has an initializer.
>> When we are running initializers in t2, we need to poison g1 and g3, but
>> so far we have seen only g1.
>> I don't know of any good, portable way to get g3.
>>
>> One solution is to run the binary twice: once with the default order of
>> TU initializers, and a second time with the order reversed (not sure if
>> that's easy).
>>
>
Or it might be a bit simpler...
Currently, asan creates an unnamed linker-initialized global array for all
instrumented globals in a given module.

% cat glob.cc
int foo();
int bar();
int AAA = foo();
int BBB = bar();

% clang -O2 -faddress-sanitizer -S -o - -emit-llvm  glob.cc
...
@2 = private global [2 x { i64, i64, i64, i64 }] [
  { i64, i64, i64, i64 } {
    i64 ptrtoint ({ i32, [60 x i8] }* @AAA to i64), i64 4, i64 64,
    i64 ptrtoint ([14 x i8]* @0 to i64) },
  { i64, i64, i64, i64 } {
    i64 ptrtoint ({ i32, [60 x i8] }* @BBB to i64), i64 4, i64 64,
    i64 ptrtoint ([14 x i8]* @1 to i64) }]
...
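
To make the layout of that array easier to read, here is a rough C++ mirror
of one record. This is only a sketch: the names GlobalDescriptor, PaddedInt
and Describe, and all the field names, are my guesses from the values in the
dump (4 looks like the size of the int, 64 like the size including padding),
not definitions taken from the asan sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of one { i64, i64, i64, i64 } record from the dump.
// The field names are guesses from the values, not copied from asan itself.
struct GlobalDescriptor {
  std::uint64_t address;            // ptrtoint of the padded global (@AAA, @BBB)
  std::uint64_t size;               // 4 in the dump: the size of the int itself
  std::uint64_t size_with_redzone;  // 64 in the dump: 4 data + 60 padding bytes
  std::uint64_t name;               // ptrtoint of a [14 x i8] constant (@0, @1)
};

// The padded layout the dump uses for an instrumented int: { i32, [60 x i8] }.
struct PaddedInt {
  std::int32_t value;
  std::uint8_t redzone[60];
};

static_assert(sizeof(GlobalDescriptor) == 32, "four 64-bit fields");
static_assert(sizeof(PaddedInt) == 64, "4 data bytes plus 60 redzone bytes");

// Builds the record that would describe `global` under the assumptions above.
GlobalDescriptor Describe(const PaddedInt* global, const char* name) {
  assert(global != nullptr);
  GlobalDescriptor d;
  d.address = reinterpret_cast<std::uintptr_t>(global);
  d.size = sizeof(global->value);
  d.size_with_redzone = sizeof(*global);
  d.name = reinterpret_cast<std::uintptr_t>(name);
  return d;
}
```

Calling Describe(&some_padded_int, "AAA") yields size 4 and
size_with_redzone 64, matching the two constants in each record of the dump.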

If we make this array discoverable by other modules (using appending
linkage?), the problem is solved.
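
Roughly what I have in mind, as a toy model rather than real runtime code:
assume the merged list of all instrumented globals is available and each
entry records its defining TU. The type TrackedGlobal, the functions below,
and the bool standing in for shadow memory are all made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical entry in the merged, program-wide list of globals.
struct TrackedGlobal {
  std::string name;
  int defining_tu;  // index of the TU that defines this global
  bool poisoned;    // stand-in for the real shadow-memory state
};

// Before a TU's initializers run: poison every global defined elsewhere.
void PoisonForeignGlobals(std::vector<TrackedGlobal>& globals, int current_tu) {
  assert(current_tu >= 0);
  for (TrackedGlobal& g : globals) {
    g.poisoned = (g.defining_tu != current_tu);
  }
}

// After a TU's initializers run: unpoison everything again.
void UnpoisonAllGlobals(std::vector<TrackedGlobal>& globals) {
  for (TrackedGlobal& g : globals) {
    g.poisoned = false;
  }
}

// Names of the globals whose access would currently be reported.
std::vector<std::string> PoisonedNames(const std::vector<TrackedGlobal>& globals) {
  std::vector<std::string> names;
  for (const TrackedGlobal& g : globals) {
    if (g.poisoned) names.push_back(g.name);
  }
  return names;
}
```

With g1, g2 and g3 defined in TUs 0, 1 and 2, calling
PoisonForeignGlobals(globals, 1) leaves g1 and g3 poisoned, including g3,
which t2 could not have known about without the merged list.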


--kcc







>
>> We probably don't want to do all that when initializing globals in a
>> dlopen-ed library, or in any situation when we have multiple threads.
>>
>>
>>
>>>     2. Instrument all reads and writes in global initializers
>>>
>>
>> This has been fixed today (thanks Nick!)
>>
>>
>>>     3. After each TU's initializers run, we unpoison the shadow memory
>>> for all global variables
>>>
>>
>> Once we know what globals we need to poison, un-poisoning them is
>> trivial.
>>
>>
>>>
>>> Note that once main has started running, AddressSanitizer will run
>>> normally.  This will result in AddressSanitizer catching all
>>> reads/writes to global variables defined in other TUs.
>>> We run all of AddressSanitizer after initialization because we cannot
>>> know prior to the completion of initialization which functions will be
>>> called from initializers.
>>>
>>> I'd welcome any feedback on this proposal!
>>>
>>
>> Sounds cool, make it happen!
>>
>> --kcc
>>
>>
>>>
>>> All the best,
>>> Reid
>>>
>>
>>
>