[cfe-dev] [LLVMdev] Address sanitizer regression test failures for PPC64 targets
Kostya Serebryany
kcc at google.com
Wed Oct 1 16:35:37 PDT 2014
On Wed, Oct 1, 2014 at 4:13 PM, Samuel F Antao <sfantao at us.ibm.com> wrote:
> Hi Kostya,
>
> Thanks for looking into this! Currently the test calls the destructor
> ~C() over and over, as if it were in an infinite loop, and it never
> returns.
>
Aha, I see the problem!
The test has an intentional bug that asan is supposed to find.
So, by default and with ASAN_OPTIONS=poison_array_cookie=1 the test should
never reach the point where it calls ~C.
Is this true for you on PPC?
But then, the test is also executed
with ASAN_OPTIONS=poison_array_cookie=0, i.e. the bug is not detected and
the execution goes further and tries to execute the DTOR.
And then yes, on a little-endian machine the DTOR will get executed 10
times, while on a big-endian one it will get executed a near-infinite
number of times.
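
To make the difference concrete, here is a minimal C++ sketch that
simulates the overwrite with explicit byte orders instead of relying on
the host. The helper names (Store32, Store64, Load64,
CookieAfterOverwrite) are hypothetical and not part of ASan, and the
original element count of 1 is an assumption; the sketch only models an
8-byte cookie whose first 4 bytes are overwritten with 10.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of ASan: store and load integers with an
// explicit byte order so both cases can be observed on any host.
static void Store64(unsigned char *p, uint64_t v, bool big_endian) {
  for (int i = 0; i < 8; ++i) {
    int shift = big_endian ? 8 * (7 - i) : 8 * i;
    p[i] = static_cast<unsigned char>(v >> shift);
  }
}

static void Store32(unsigned char *p, uint32_t v, bool big_endian) {
  for (int i = 0; i < 4; ++i) {
    int shift = big_endian ? 8 * (3 - i) : 8 * i;
    p[i] = static_cast<unsigned char>(v >> shift);
  }
}

static uint64_t Load64(const unsigned char *p, bool big_endian) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    // Visit bytes from most significant to least significant.
    int idx = big_endian ? i : 7 - i;
    v = (v << 8) | p[idx];
  }
  return v;
}

// Element count that delete[] would read back after the first 4 bytes of
// an 8-byte cookie holding `count` are overwritten with `value`.
uint64_t CookieAfterOverwrite(uint64_t count, uint32_t value,
                              bool big_endian) {
  unsigned char cookie[8];
  Store64(cookie, count, big_endian);   // what operator new[] records
  Store32(cookie, value, big_endian);   // the buggy 4-byte store
  return Load64(cookie, big_endian);    // what delete[] reads
}
```

Under these assumptions CookieAfterOverwrite(1, 10, false) gives 10,
while CookieAfterOverwrite(1, 10, true) gives (10 << 32) | 1, i.e.
42949672961 destructor calls.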
Does r218841 help?
--kcc
>
> I'll try to explain how I understand the problem. Just to make sure we
> are on the same page, I am attaching the instrumented IR that I obtain by
> running:
>
> clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer
> -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -m64 -O3
> /home/sfantao/llvm-trunk/llvm-svn.src/projects/compiler-rt/test/asan/TestCases/Linux/new_array_cookie_test.cc
> -o debug.ir -S -emit-llvm
>
> In this code %x (i32*) and %9 (i64*) both alias %call. When 10 is stored
> through %x, the value seen by a load through %9 (the ASAN calls use this
> pointer instead of %x) depends on the endianness. Assuming that %9 and %x
> both point to address 0x00, the memory layout before and after the 4-byte
> store on a big-endian machine will be
>
> Addr - Before - After
> 0x00 0xZZ 0x00
> 0x01 0xZZ 0x00
> 0x02 0xZZ 0x00
> 0x03 0xZZ 0x0A
> 0x04 0xZZ 0xZZ
> 0x05 0xZZ 0xZZ
> 0x06 0xZZ 0xZZ
> 0x07 0xZZ 0xZZ
>
> When a load using %9 is done, I get 0x0000000AZZZZZZZZ. On a little-endian
> machine I would get 0xZZZZZZZZ0000000A instead, which is probably what you
> would expect. Then, when the destructor is called, whatever decodes the
> size of 'buffer' loads the wrong information (a very large number, causing
> the seemingly infinite loop).
>
> Any hint on how to fix this? I understand some other information is being
> encoded in the pointers, so it is hard for me to understand whether fixing
> this for %x would have bad implications in other components of the
> sanitizer.
>
> Let me know if you'd like me to provide more information.
>
> Thanks again!
> Samuel
>
> 2014-10-01 14:28 GMT-04:00 Kostya Serebryany <kcc at google.com>:
>
>
>>
>> On Mon, Sep 8, 2014 at 7:00 PM, Samuel F Antao <sfantao at us.ibm.com>
>> wrote:
>>
>>> Alexey, Alexander,
>>>
>>> Thanks for the suggestions. I tried removing the SA_NODEFER flag but it
>>> didn't help. I have been digging into the problem with the null_deref
>>> test today but I was unable to clearly identify it. I suspect that it
>>> was a bug in the calling convention/unwinding that led to the flags()
>>> pointer getting corrupted. It is also possible that it was related to
>>> endianness issues caused by some bug in the pointer arithmetic inserted
>>> by the sanitizer code (there are many type and bit casts, which makes it
>>> hard to follow the references). I decided to upgrade the compiler I was
>>> using to build clang, which made the problem with this test case go
>>> away (!).
>>>
>>> Nevertheless, I still get problems in other test cases that may be
>>> related to the problem I was getting before. E.g., in the
>>> new_array_cookie_test I am getting an infinite loop in the destructor of
>>> the array (delete [] operator). I noticed that the references passed to
>>> __asan_poison_cxx_array_cookie and __asan_load_cxx_array_cookie were
>>> pointing to values differing in the 4 most significant bytes, which made
>>> me suspect that the problem is related to endianness. I am reproducing
>>> part of the IR generated for this test:
>>>
>> [I am sorry, I've missed this thread. Don't hesitate to ping me if I
>> don't respond in 1-2 days. ]
>>
>> This is a new test for new functionality, currently present in clang's
>> asan, not in GCC.
>> We never tried it on big-endian machines.
>>
>>
>>>
>>> store i64 %0, i64* %9, align 8, !dbg !35, !nosanitize !2
>>> call void @__asan_poison_cxx_array_cookie(i64* %9), !dbg !35
>>> %10 = getelementptr inbounds i8* %call, i64 8, !dbg !35
>>> %11 = bitcast i8* %10 to %struct.C*, !dbg !35
>>> call void @llvm.dbg.value(metadata !{%struct.C* %11}, i64 0, metadata !23), !dbg !36
>>> %x = bitcast i8* %call to i32*, !dbg !37
>>> %12 = ptrtoint i32* %x to i64, !dbg !37
>>> %13 = lshr i64 %12, 3, !dbg !37
>>> %14 = add i64 %13, 2199023255552, !dbg !37
>>> %15 = inttoptr i64 %14 to i8*, !dbg !37
>>> %16 = load i8* %15, !dbg !37
>>> %17 = icmp ne i8 %16, 0, !dbg !37
>>> br i1 %17, label %18, label %24, !dbg !37, !prof !38
>>>
>>> ; <label>:18 ; preds = %entry
>>> %19 = and i64 %12, 7, !dbg !37
>>> %20 = add i64 %19, 3, !dbg !37
>>> %21 = trunc i64 %20 to i8, !dbg !37
>>> %22 = icmp sge i8 %21, %16, !dbg !37
>>> br i1 %22, label %23, label %24
>>>
>>> ; <label>:23 ; preds = %18
>>> call void @__asan_report_store4(i64 %12), !dbg !37
>>> call void asm sideeffect "", ""()
>>> unreachable
>>>
>>> ; <label>:24 ; preds = %18, %entry
>>> store i32 10, i32* %x, align 4, !dbg !37, !tbaa !39
>>> %25 = call i64 @__asan_load_cxx_array_cookie(i64* %9), !dbg !44
>>>
>>> In this code, %9 and %x alias but have different types (i64* and i32*),
>>> so 'store i32 10, i32* %x, align 4, !dbg !37, !tbaa !39' produces
>>> different results on machines with different endianness. On a
>>> big-endian machine the value 10 is written to the 4 most-significant
>>> bytes of the memory referenced by %9.
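
For readers following the IR quoted above, the inlined check that
precedes the 'store i32 10' can be read roughly as the C++ sketch below.
The shadow offset (2199023255552, i.e. 1 << 41), the shift by 3, and the
signed comparison are taken from the quoted IR; the function and constant
names are hypothetical, and the shadow byte is passed in as a parameter
rather than loaded from real shadow memory.

```cpp
#include <cstdint>

// Constants as they appear in the quoted IR (lshr by 3, add 2199023255552).
constexpr uint64_t kShadowOffset = 1ULL << 41;
constexpr unsigned kShadowScale = 3;

// Address of the shadow byte that describes the 8-byte granule of `addr`.
uint64_t ShadowAddress(uint64_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}

// Decide whether an access of `access_size` bytes (at most 8) at `addr`
// should be reported, given the value of its shadow byte. This mirrors
// the 'icmp ne', 'and 7', 'add 3', and 'icmp sge' sequence in the IR,
// where 3 is access_size - 1 for a 4-byte store.
bool ShouldReport(uint64_t addr, int8_t shadow_value, unsigned access_size) {
  if (shadow_value == 0)
    return false;  // the whole granule is addressable
  // Offset of the last accessed byte within the granule.
  int8_t last_byte = static_cast<int8_t>((addr & 7) + access_size - 1);
  // Signed compare: negative shadow values (poisoned granules) always
  // report; a value k in 1..7 means only the first k bytes are valid.
  return last_byte >= shadow_value;
}
```

When ShouldReport returns true, the IR takes the branch that calls
__asan_report_store4; otherwise execution falls through to the store.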
>>>
>>
>> How does the test behave on PPC?
>>
>>
>> --kcc
>>
>>>
>>>
>>> As I mentioned before, I don't know the sanitizer implementation well so
>>> it is possible I may be missing something. Can anyone shed some light on
>>> this?
>>>
>>> Thanks again!
>>> Samuel
>>>
>>> From: Alexander Potapenko <glider at google.com>
>>> To: Alexey Samsonov <vonosmas at gmail.com>
>>> Cc: Samuel F Antao/Watson/IBM at IBMUS, Clang Developers List <
>>> cfe-dev at cs.uiuc.edu>, LLVM Dev <llvmdev at cs.uiuc.edu>
>>> Date: 09/05/2014 02:06 AM
>>> Subject: Re: [cfe-dev] Address sanitizer regression test failures for
>>> PPC64 targets
>>> ------------------------------
>>>
>>>
>>>
>>> Note that I've set the SA_NODEFER flag for the SEGV handler in the
>>> ASan runtime only a couple of days ago.
>>> Not sure that could've affected this test though; without that flag
>>> the second SEGV would've simply crashed the program. But you can try
>>> removing the flag from
>>> compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix_libcdep.cc and
>>> see if that makes any difference.
>>>
>>> HTH,
>>> Alex
>>>
>>> On Fri, Sep 5, 2014 at 5:26 AM, Alexey Samsonov <vonosmas at gmail.com>
>>> wrote:
>>> > +Bill Schmidt
>>> >
>>> > On Thu, Sep 4, 2014 at 5:39 PM, Samuel F Antao <sfantao at us.ibm.com> wrote:
>>> >>
>>> >> Hi all,
>>> >>
>>> >> I have been experiencing failures of the address sanitizer regression
>>> >> tests for a PPC64 target (Power7 machine). My understanding is that
>>> >> most of the failures are related to the fact that the stack is not
>>> >> being dumped.
>>> >>
>>> >> I tried to understand what might be wrong and started by looking into
>>> >> the null_deref.cc test, as it hangs during the test run. I observe
>>> >> that after the detection of the faulty memory access it receives a
>>> >> SEGV after entering ReportSIGSEGV(), more precisely when it gets to
>>> >> __intercept_strlen() and tries to access flags()->replace_str. The
>>> >> caller of __intercept_strlen() is get_cie_encoding() from libgcc
>>> >> (version 4.8.2 on my system).
>>> >>
>>> >> As I am not familiar with the sanitizer implementation, I was
>>> >> wondering if this is an expected failure for PPC targets due to some
>>> >> incomplete implementation, an unexpected bug, or some
>>> >> misconfiguration in the Clang/LLVM build for PPC targets.
>>> >>
>>> >> Has anyone experienced a similar issue?
>>> >
>>> >
>>> > Sanitizer used to work on PPC at some point, but currently it fails on
>>> > most of the tests from the "check-asan" test suite on the PowerPC
>>> > buildbot (http://lab.llvm.org:8011/builders/sanitizer-ppc64-linux1).
>>> > I can't really diagnose the issue from your description. flags() is
>>> > just a pointer to a global variable, so I don't see why access to
>>> > flags()->replace_str will segfault.
>>> >
>>> >>
>>> >>
>>> >>
>>> >> Thanks in advance!
>>> >> Samuel
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> cfe-dev mailing list
>>> >> cfe-dev at cs.uiuc.edu
>>> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Alexey Samsonov
>>> > vonosmas at gmail.com
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Alexander Potapenko
>>> Software Engineer
>>> Google Moscow
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>
>>
>>
>