[llvm-commits] [PATCH] AddressSanitizer Initialization Order Checking

Mon Jul 23 14:17:47 PDT 2012

Hello again,

I've updated according to most of the solutions, and restructured a lot of
the poisoning code.
Performance looks great, but there's still a bit of work to convert my
hand-written tests to the ASan output_tests format.
I'm hoping to get this patch accepted with initialization order off by
default, then submit a separate patch to include all of my test cases.

Details on previous issues follow, patch is attached, and rietveld issues
are at
llvm: http://codereview.appspot.com/6432065/
compiler-rt: http://codereview.appspot.com/6419070/

Thanks again,
Reid

On Thu, Jul 19, 2012 at 2:28 AM, Kostya Serebryany <kcc at google.com> wrote:

>
>
> On Thu, Jul 19, 2012 at 3:16 AM, Reid Watson <reidw at google.com> wrote:
>
>> Thanks again for reviewing this -- I've implemented the suggested
>> changes, uploaded to Rietveld(http://codereview.appspot.com/6425049),
>>
> Thanks! (although you've uploaded only half of the patch).
>
> Comments on the LLVM part:
>
> +static bool HasDynamicInitializer(GlobalVariable *G) {
>
>
>
> I am not sure how reliable this function is.
> Can we get this information from clang in a form of metadata?
> I don't insists to fix this now, but this will need a fix eventually.
>
> +    if (CurrentFunction->getName().find("__cxx_global_var_init") == 0) {
>
>
> Shouldn't this be _GLOBAL__I_a?
>
>
_GLOBAL__I_a will call each of the __cxx_global_var_init methods.
This seems to work for now, but yes, a more robust method of checking this
would be nice, and I'll look into this.

> +  //   size_t poison_during_initialization;
>
>
> I'd call this has_dynamic_initializer
>
>
> Please end the comments with a period.
>
>
Done.

> Comments on the compiler-rt part:
>
> +  ScopedLock lock(&mu_for_globals);
>
>
> +  if (n == 0)
>
>
> +    return;
>
> Move if (n == 0)  before the lock.
>
>
>
> +  while (l->g->beg != last->beg) {
>
> Do you really need these 3 loops?
> At the point where you call this, the globals of the current TU are in the
> head of the loop.
> So, you probably can do just one loop starting from 'first'.
>
> +  while (l) {
>
>
> +    if (l->g->poison_during_initialization)
>
>
> I am worried about the performance of this thing.
> I would prefer to have two different lists of globals, one containing only
> those globals that "has_dynamic_initializer".
> You can do this change later if you prefer, but before enabling the option
> by default.
>

I've restructured the way poisoning works in the patch attached to this
email, simplifying this code and improving performance quite a bit.

>
> Same for tests: you will need to add tests, not necessary in the first
> commit, but before flipping the asan-initialization-order flag.
> I will ask you for at least these kinds of tests:
>  - llvm-style tests in the form of .ll file with CHECK statements.
>  - "output" style tests (see projects/compiler-rt/lib/asan/output_tests)
> with the actual bugs.
>  - stress test: a shell script that auto-generates a program with N C++
> modules each containing M1 globals with dynamic init and M2 globals with
> static init.
>    I want to see how O(N*M1) looks like in practice. We should not see
> O(N*(M1+M2)).
>
>
I'm still in the process of converting my own test cases into the ASan
output_test format.
Almost all of my tests require multiple files, so I may need to tweak
test_output.sh a little bit.
As far as performance is concerned, things look great.
I'm working on finishing up the stress test, and will include it with the
testing patch.

>
>> and attached the new patches to this email.
>>
>> I left in the extra check which forces access to global scalar
>> variables to be implemented.  It remains necessary to catch the first
>> case in my previous email,
>
>
> Please remind me that case.
> I think we need something like this:
>
> if (ClOpt && ClOptGlobals)
>    if(GlobalVariable *G = dyn_cast<GlobalVariable>(Addr)
> && !HasDynamicInitializer(G)) { ...
>
> where HasDynamicInitializer is fast (e.g. reads metadata)
>
>
Yes, the decision to restrict what we catch to only variables with dynamic
initialization solves this.
The initial implementation attempted to catch cases where two initializers
in different modules modified a global variable in a third distinct module.
The order of modification to that global would be unspecified.
An example of this sort of bug would be calling strerror, (or a user
defined function which operated in a similar manner) inside of two
initializers in different TUs.
However, catching this error required a fair bit of complexity
(synchronization could be use, so we would effectively be stuck trying to
write a less useful version of ThreadSanitizer) and introduced a number of
false positives.

>
>
>> and all initialization order problems in
>> which the value being initialized is some simple type.  If the
>> performance hit is unacceptable though,
>
>
> Performance drop might be acceptable, but I don't want to disable a useful
> optimization for no reason.
>
> --kcc
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120723/90a453cc/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asan-initialization-order-llvm.patch
Type: application/octet-stream
Size: 14570 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120723/90a453cc/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asan-initialization-order-compilerrt.patch
Type: application/octet-stream
Size: 8090 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120723/90a453cc/attachment-0001.obj>