[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

Haoran Xu via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 22 00:12:47 PDT 2020


Hi David,

I just tried creduce, but it produced code with completely undefined
behavior (out-of-range accesses, etc.).
The "interestingness" criterion I used was "times out at -O1 but finishes
within 20s at -O0". However, it turns out that the undefined behavior in the
creduce-generated code makes it satisfy that criterion only by accident.
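
One possible way to keep a reduction from drifting into out-of-range accesses
is to route every tape access through checked helpers, so a bad access aborts
instead of being silently undefined. The sketch below is only an illustration:
the Tape type, the helper names, and the 30000-cell size are assumptions, not
part of the attached program. The added checks may also change what the
optimizer does, so the original problem might no longer reproduce with them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tape size; the real program may use a different one. */
enum { TAPE_SIZE = 30000 };

/* Hypothetical wrapper: the tape plus the current position as an index. */
typedef struct {
    uint8_t cells[TAPE_SIZE];
    size_t pos;
} Tape;

/* Checked equivalent of ++ptr: aborts instead of walking off the end. */
void tape_right(Tape *t) {
    assert(t->pos + 1 < TAPE_SIZE);
    ++t->pos;
}

/* Checked equivalent of --ptr: aborts instead of walking off the start. */
void tape_left(Tape *t) {
    assert(t->pos > 0);
    --t->pos;
}

/* Checked equivalent of dereferencing ptr. */
uint8_t *tape_cell(Tape *t) {
    assert(t->pos < TAPE_SIZE);
    return &t->cells[t->pos];
}
```

With helpers like these, the criterion could additionally require that the -O0
build exits normally rather than aborting. Note that the checks disappear if
NDEBUG is defined.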

Best,
Haoran

Haoran Xu <haoranxu510 at gmail.com> wrote on Wed, Oct 21, 2020 at 10:32 PM:

> I was just able to determine the offending IR code before and after the
> transformation. I'm now almost certain it's a bug in LLVM.
>
> Before transformation, we have the following IR (I renamed all %xxx for
> brevity):
>
>> %1 = load i8, i8* %0, align 1
>> %2 = add i8 %1, -1
>> store i8 %2, i8* %0, align 1
>>
> The above IR is inside a loop, so the value stored at %0 can be different
> on each iteration.
>
> The optimization pass changed the IR above to the following:
>
>> store i8 %3, i8* %0, align 1
>>
> where %3 is defined by
>
>> %4 = load i8, i8* %0, align 1
>> %3 = add i8 %4, -1
>>
> in an earlier piece of IR.
>
> Apparently the pass treated %3 as the same value as %2 and applied CSE,
> without realizing that the contents of %0 may have been changed by the loop.
>
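To make the description above concrete, here is a C-level sketch of the two
IR shapes. It is an illustration only: the function names and the iteration
cap are hypothetical, and it assumes the "earlier piece of IR" ran once before
the loop, which the message does not state explicitly. The first function
reloads the cell on every iteration, as the original IR does. The second
stores a value computed once from an earlier load, as the transformed IR is
described as doing.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the IR before the transformation: load, decrement, store,
   all inside the loop. Returns the number of iterations executed. */
unsigned steps_reloading(uint8_t start, unsigned cap) {
    uint8_t mem = start;
    uint8_t *cell = &mem;                      /* plays the role of %0 */
    unsigned steps = 0;
    while (*cell && steps < cap) {
        uint8_t loaded = *cell;                /* %1 = load i8, i8* %0 */
        uint8_t dec = (uint8_t)(loaded - 1);   /* %2 = add i8 %1, -1   */
        *cell = dec;                           /* store i8 %2, i8* %0  */
        ++steps;
    }
    assert(steps == cap || *cell == 0);
    return steps;
}

/* Mirrors the described result: the loop stores a value derived from a
   load done earlier (assumed here: once, before the loop).
   The cap exists only so this sketch terminates. */
unsigned steps_stale(uint8_t start, unsigned cap) {
    uint8_t mem = start;
    uint8_t *cell = &mem;                      /* plays the role of %0 */
    uint8_t loaded = *cell;                    /* %4 = load i8, i8* %0 */
    uint8_t stale = (uint8_t)(loaded - 1);     /* %3 = add i8 %4, -1   */
    unsigned steps = 0;
    while (*cell && steps < cap) {
        *cell = stale;                         /* store i8 %3, i8* %0  */
        ++steps;
    }
    return steps;
}
```

For a starting value of 3, the reloading version finishes after 3 iterations,
while the stale version keeps storing 2 and only stops at the cap, which
matches the infinite loop seen in the report.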
>
> David Blaikie <dblaikie at gmail.com> wrote on Wed, Oct 21, 2020 at 10:18 PM:
>
>> Might be worth running the c source file through creduce or similar to
>> narrow it down a bit that way too.
>>
>> On Wed, Oct 21, 2020 at 9:12 PM Haoran Xu via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> A further bisect using opt's -opt-bisect-limit option shows that the
>>> following pass is causing the issue:
>>>
>>>> BISECT: running pass (39) Early CSE w/ MemorySSA on function (main)
>>>>
>>>
>>>
>>> Haoran Xu <haoranxu510 at gmail.com> wrote on Wed, Oct 21, 2020 at 9:00 PM:
>>>
>>>> I did a simple bisect on clang version, and it seems like clang 8.0.0
>>>> works correctly, but clang 9.0.0 failed to compile the code correctly.
>>>> https://godbolt.org/z/676Grr  <- if you change the clang version to
>>>> 8.0.0, you will see the expected output in 'output' section.
>>>> I don't have the ability to bisect on the clang git history. I would
>>>> greatly appreciate it if anyone is willing to do that.
>>>>
>>>> Thanks!
>>>>
>>>> Haoran Xu <haoranxu510 at gmail.com> wrote on Wed, Oct 21, 2020 at 8:47 PM:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm really amazed to find out that under -O3, a simple piece of C code
>>>>> generated from a brainfxxk-to-C transpiler is miscompiled.
>>>>> As one probably knows, the C code transpiled from brainfxxk only
>>>>> contains 3 kinds of statements:
>>>>>
>>>>>> (1) ++(*ptr) / --(*ptr)
>>>>>> (2) ++ptr / --ptr
>>>>>> (3) while (*ptr) { ... }
>>>>>>
>>>>> where ptr is a uint8_t*.
>>>>> So it seems very clear to me that the code contains no undefined
>>>>> behavior (the pointer is uint8_t* and unsigned integer overflow is not UB).
>>>>>
>>>>> After further investigation, it seems like clang compiled this loop:
>>>>>
>>>>>> while (*ptr) {
>>>>>>  --(*ptr);
>>>>>>  ++ptr;
>>>>>>  ++(*ptr);
>>>>>>  --ptr;
>>>>>> }
>>>>>>
>>>>>  to an unconditional infinite loop under -O3, resulting in the bug.
>>>>> The code snippet above seems completely benign to me.
>>>>>
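As a sanity check on what the loop quoted above is expected to do, here is a
small self-contained sketch. The function name and the two-cell tape are
hypothetical and chosen only for illustration. Each iteration decrements the
current cell and increments its right neighbour, so the loop should run
exactly as many times as the initial value of the current cell and then stop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: runs the loop from the report on a two-cell tape
   and returns the final value of the second cell. */
uint8_t move_cell(uint8_t first, uint8_t second) {
    uint8_t tape[2] = { first, second };
    uint8_t *ptr = tape;
    size_t iterations = 0;

    while (*ptr) {
        --(*ptr);   /* take one from the current cell */
        ++ptr;
        ++(*ptr);   /* add one to the neighbour (wraps modulo 256) */
        --ptr;
        ++iterations;
    }

    /* *ptr strictly decreases, so the loop runs exactly `first` times. */
    assert(iterations == first);
    assert(tape[0] == 0);
    return tape[1];
}
```

Calling move_cell(3, 4) yields 7, and move_cell(255, 1) yields 0 because the
uint8_t addition wraps. In every case the loop terminates.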
>>>>> I attached the offending program. With
>>>>>
>>>>>> clang a.c -O0
>>>>>>
>>>>> it worked fine (it should print out an ASCII-art picture of the Mandelbrot
>>>>> fractal). However, with -O1 or -O3, it goes into an infinite loop (in the
>>>>> code snippet above) after printing out a few characters.
>>>>>
>>>>> I also tried UndefinedBehaviorSanitizer. Strangely, when compiling
>>>>> using
>>>>>
>>>>>> clang a.c -O3  -fsanitize=undefined
>>>>>>
>>>>> the code worked again, with no infinite loop, and no undefined
>>>>> behavior reported.
>>>>>
>>>>> So it seems to me to be an LLVM optimizer bug. I would greatly appreciate
>>>>> it if anyone is willing to investigate.
>>>>>
>>>>> Best,
>>>>> Haoran
>>>>>
>>>>>