[cfe-dev] [LLVMdev] Address sanitizer regression test failures for PPC64 targets

Samuel F Antao sfantao at us.ibm.com
Wed Oct 1 16:13:03 PDT 2014


Hi Kostya,

Thanks for looking into this! Currently the test calls the destructor ~C()
repeatedly, as if it were in an infinite loop, and it does not return.

I'll try to explain how I understand the problem. Just to make sure we
are on the same page, I am attaching the instrumented IR that I obtain by
running:

clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer
-fno-omit-frame-pointer -fno-optimize-sibling-calls -g -m64 -O3
/home/sfantao/llvm-trunk/llvm-svn.src/projects/compiler-rt/test/asan/TestCases/Linux/new_array_cookie_test.cc
-o debug.ir -S -emit-llvm

In this code %x (i32*) and %9 (i64*) both alias %call. When 10 is stored to
%x, the way this is reflected in a load from %9 (the ASAN calls use this
pointer instead of %x) differs depending on the endianness. Assuming that %9
and %x are 0x00, the memory layout before and after the store in big-endian
will be

Addr  - Before - After
0x00    0xZZ   0x00
0x01    0xZZ   0x00
0x02    0xZZ   0x00
0x03    0xZZ   0x0A
0x04    0xZZ   0xZZ
0x05    0xZZ   0xZZ
0x06    0xZZ   0xZZ
0x07    0xZZ   0xZZ

When a load using %9 is done, I get 0x0000000AZZZZZZZZ. In a little-endian
machine I would get 0xZZZZZZZZ0000000A instead, which is probably what you
would expect. Then, when the destructor is called, whatever is decoding the
size of 'buffer' loads the wrong information (possibly zero or a very large
number, causing the infinite loop).
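
To make the effect concrete, here is a minimal sketch that simulates the
4-byte store followed by the 8-byte load under both byte orders. It does not
depend on the host's endianness because the byte order is passed explicitly.
The helper names (store32, load64, cookie_after_store) are made up for this
illustration, and I am assuming the remaining cookie bytes are zero.

```cpp
#include <cstdint>

// Hypothetical helper: write a 32-bit value into buf[0..3] in the given byte order.
void store32(uint8_t *buf, uint32_t v, bool big_endian) {
  for (int i = 0; i < 4; ++i) {
    int shift = big_endian ? 8 * (3 - i) : 8 * i;
    buf[i] = static_cast<uint8_t>(v >> shift);
  }
}

// Hypothetical helper: read a 64-bit value from buf[0..7] in the given byte order.
uint64_t load64(const uint8_t *buf, bool big_endian) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    int shift = big_endian ? 8 * (7 - i) : 8 * i;
    v |= static_cast<uint64_t>(buf[i]) << shift;
  }
  return v;
}

// Models 'store i32 10' through %x followed by an i64 load through %9,
// with both pointers referring to the same address.
// Assumption: the other four cookie bytes are zero.
uint64_t cookie_after_store(bool big_endian) {
  uint8_t mem[8] = {0};
  store32(mem, 10, big_endian);
  return load64(mem, big_endian);
}
```

With these assumptions, cookie_after_store(false) gives 10, while
cookie_after_store(true) gives 10 << 32, i.e. 0x0000000A00000000, which
would look like a huge element count to whatever reads the cookie.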

Any hint on how to fix this? I understand some other information is being
encoded in the pointers, so it is hard for me to understand whether fixing
this for %x would have bad implications in other components of the
sanitizer.
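
For reference, this is how I read the check that precedes the 4-byte store in
the IR quoted below: the address is shifted right by 3 and added to the
constant 2199023255552 (which is 1 << 41) to find a shadow byte, and the
access is reported when that byte is non-zero and the last byte touched falls
at or beyond it. The sketch below only models that arithmetic; the names
(kShadowOffset, shadow_address, store4_is_reported) are mine, and the shadow
byte is passed in as a parameter instead of being read from real shadow
memory.

```cpp
#include <cstdint>

// Constant taken from the quoted IR: 2199023255552 == 1 << 41.
const uint64_t kShadowOffset = 2199023255552ULL;

// Address of the shadow byte describing the 8-byte granule that contains addr
// (mirrors the lshr/add pair in the IR).
uint64_t shadow_address(uint64_t addr) {
  return (addr >> 3) + kShadowOffset;
}

// Mirrors the two branches before the store: returns true when the IR would
// reach the call to __asan_report_store4 for the given shadow byte value.
bool store4_is_reported(uint64_t addr, int8_t shadow) {
  if (shadow == 0) {
    return false;  // fast path: shadow byte is zero, store proceeds
  }
  // Offset of the last byte written by a 4-byte store within its granule.
  int8_t last = static_cast<int8_t>((addr & 7) + 3);
  return last >= shadow;  // signed comparison, as in 'icmp sge i8'
}
```

For example, store4_is_reported(0x1000, 0) is false, while a shadow value of
3 at the same address makes it true, since the store would touch the byte at
offset 3.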

Let me know if you'd like me to provide more information.

Thanks again!
Samuel

2014-10-01 14:28 GMT-04:00 Kostya Serebryany <kcc at google.com>:

>
>
> On Mon, Sep 8, 2014 at 7:00 PM, Samuel F Antao <sfantao at us.ibm.com> wrote:
>
>> Alexey, Alexander,
>>
>> Thanks for the suggestions. I tried removing the flag SA_NODEFER but it
>> didn't do any good... I have been digging into the problem with the
>> null_deref test today but I was unable to clearly identify the problem. I
>> suspect that it was a bug with the calling convention/unwinding that
>> led to the flags() pointer getting corrupted. It is also possible that it
>> was related to endianness issues caused by some bug in the pointer
>> arithmetic inserted by the sanitizer code (there are many type and bit
>> casts, which makes it hard to follow the references). I decided to upgrade
>> the compiler I was using to build clang, which made the problem with this
>> testcase go away (!).
>>
>> Nevertheless, I still got problems in other testcases that may be
>> related to the problem I was getting before. E.g., in the
>> new_array_cookie_test I am getting an infinite loop in the destructor of
>> the array (delete [] operator). I noticed that the references passed to
>> __asan_poison_cxx_array_cookie and __asan_load_cxx_array_cookie were
>> pointing to values differing in the 4 most significant bytes, which made me
>> suspect that the problem is related to endianness. I am reproducing part
>> of the IR generated for this test:
>>
> [I am sorry, I've missed this thread. Don't hesitate to ping me if I don't
> respond in 1-2 days. ]
>
> This is a new test for new functionality, currently present in clang's
> asan, not in GCC.
> We never tried it on big-endian machines.
>
>
>>
>>   store i64 %0, i64* %9, align 8, !dbg !35, !nosanitize !2
>>   call void @__asan_poison_cxx_array_cookie(i64* %9), !dbg !35
>>   %10 = getelementptr inbounds i8* %call, i64 8, !dbg !35
>>   %11 = bitcast i8* %10 to %struct.C*, !dbg !35
>>   call void @llvm.dbg.value(metadata !{%struct.C* %11}, i64 0, metadata !23), !dbg !36
>>   %x = bitcast i8* %call to i32*, !dbg !37
>>   %12 = ptrtoint i32* %x to i64, !dbg !37
>>   %13 = lshr i64 %12, 3, !dbg !37
>>   %14 = add i64 %13, 2199023255552, !dbg !37
>>   %15 = inttoptr i64 %14 to i8*, !dbg !37
>>   %16 = load i8* %15, !dbg !37
>>   %17 = icmp ne i8 %16, 0, !dbg !37
>>   br i1 %17, label %18, label %24, !dbg !37, !prof !38
>>
>> ; <label>:18                                      ; preds = %entry
>>   %19 = and i64 %12, 7, !dbg !37
>>   %20 = add i64 %19, 3, !dbg !37
>>   %21 = trunc i64 %20 to i8, !dbg !37
>>   %22 = icmp sge i8 %21, %16, !dbg !37
>>   br i1 %22, label %23, label %24
>>
>> ; <label>:23                                      ; preds = %18
>>   call void @__asan_report_store4(i64 %12), !dbg !37
>>   call void asm sideeffect "", ""()
>>   unreachable
>>
>> ; <label>:24                                      ; preds = %18, %entry
>>   store i32 10, i32* %x, align 4, !dbg !37, !tbaa !39
>>   %25 = call i64 @__asan_load_cxx_array_cookie(i64* %9), !dbg !44
>>
>> In this code, %9 and %x alias but have different types (i64* and i32*),
>> which makes the code in 'store i32 10, i32* %x, align 4, !dbg !37, !tbaa
>> !39' produce different results on machines with different endianness. On
>> a big-endian machine the value 10 is written to the 4 most-significant
>> bytes of the memory referenced by %9.
>>
>
> How does the test behave on PPC?
>
>
> --kcc
>
>>
>>
>> As I mentioned before, I don't know the sanitizer implementation well so
>> it is possible I may be missing something. Can anyone shed some light on
>> this?
>>
>> Thanks again!
>> Samuel
>>
>> From: Alexander Potapenko <glider at google.com>
>> To: Alexey Samsonov <vonosmas at gmail.com>
>> Cc: Samuel F Antao/Watson/IBM at IBMUS, Clang Developers List <
>> cfe-dev at cs.uiuc.edu>, LLVM Dev <llvmdev at cs.uiuc.edu>
>> Date: 09/05/2014 02:06 AM
>> Subject: Re: [cfe-dev] Address sanitizer regression test failures for
>> PPC64 targets
>> ------------------------------
>>
>>
>>
>> Note that I've set the SA_NODEFER flag for the SEGV handler in the
>> ASan runtime only a couple of days ago.
>> Not sure that could've affected this test though; without that flag
>> the second SEGV would've simply crashed the program. But you can try
>> removing the flag from
>> compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix_libcdep.cc and
>> see if that makes any difference.
>>
>> HTH,
>> Alex
>>
>> On Fri, Sep 5, 2014 at 5:26 AM, Alexey Samsonov <vonosmas at gmail.com> wrote:
>> > +Bill Schmidt
>> >
>> > On Thu, Sep 4, 2014 at 5:39 PM, Samuel F Antao <sfantao at us.ibm.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I have been experiencing failures of the address sanitizer regression
>> >> tests for a PPC64 target (Power7 machine). My understanding is that most
>> >> of the failures are related to the fact that the stack is not being
>> >> dumped.
>> >>
>> >> I tried to understand what might be wrong and started by looking into
>> >> the null_deref.cc test, as it hangs during the test run. I observe that
>> >> after the detection of the faulty memory access it receives a SEGV after
>> >> entering ReportSIGSEGV(), more precisely when it gets to
>> >> __intercept_strlen() and tries to access flags()->replace_str. The
>> >> caller of __intercept_strlen() is get_cie_encoding() from libgcc
>> >> (version 4.8.2 on my system).
>> >>
>> >> As I am not familiar with the sanitizer implementation, I was wondering
>> >> if this is an expected failure for PPC targets due to some incomplete
>> >> implementation, an unexpected bug, or some misconfiguration in the
>> >> Clang/LLVM build for PPC targets.
>> >>
>> >> Has anyone experienced a similar issue?
>> >
>> >
>> > Sanitizer used to work on PPC at some point, but currently it fails on
>> > most of the tests from "check-asan" test suite on the PowerPC buildbot
>> > (http://lab.llvm.org:8011/builders/sanitizer-ppc64-linux1).
>> > I can't really diagnose the issue from your description. flags() is just
>> > a pointer to a global variable, so I don't see why access to
>> > flags()->replace_str will segfault.
>> >
>> >>
>> >>
>> >>
>> >> Thanks in advance!
>> >> Samuel
>> >>
>> >>
>> >> _______________________________________________
>> >> cfe-dev mailing list
>> >> cfe-dev at cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> >>
>> >
>> >
>> >
>> > --
>> > Alexey Samsonov
>> > vonosmas at gmail.com
>> >
>>
>>
>>
>> --
>> Alexander Potapenko
>> Software Engineer
>> Google Moscow
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug.ir
Type: application/octet-stream
Size: 17477 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20141001/7fed606a/attachment.obj>

