[LLVMdev] trunk's optimizer generates slower code than 3.5

Jack Howarth howarth.mailing.lists at gmail.com
Sat Feb 14 09:44:50 PST 2015


Filed as http://llvm.org/bugs/show_bug.cgi?id=22589.

On Sat, Feb 14, 2015 at 12:31 PM, Jack Howarth
<howarth.mailing.lists at gmail.com> wrote:
> Oops. I misspoke. The 22% performance regression is in fact eliminated
> in current llvm/clang trunk. Hopefully this is due to a single fix
> that can be back ported rather than some large change in the code.
>
> On Sat, Feb 14, 2015 at 12:18 PM, Jack Howarth
> <howarth.mailing.lists at gmail.com> wrote:
>> The same 22% performance regression also exists in current llvm/clang
>> trunk for the SciMark2 Sparse matmult benchmark.
>>
>> On Sat, Feb 14, 2015 at 12:11 PM, Jack Howarth
>> <howarth.mailing.lists at gmail.com> wrote:
>>>     Using the SciMark 2.0 code from
>>> http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the
>>> same...
>>>
>>> make CFLAGS="-O3 -march=native"
>>>
>>> I am able to reproduce the 22% performance regression in the run time
>>> of the Sparse matmult benchmark.
>>> For 10 runs of the scimark2 benchmark, I get 998.439+/-0.4828 with
>>> the release llvm clang 3.5.1 compiler
>>> and  1217.363+/-1.1004 for the current clang 3.6svn from 3.6 branch. Not good.
>>>                    Jack
>>>
>>> On Sat, Feb 14, 2015 at 11:19 AM, Jack Howarth
>>> <howarth.mailing.lists at gmail.com> wrote:
>>>>    Do any of the build-bots routinely run the SciMark v2.0 benchmark?
>>>> If so, might not an examination of those logs reveal the commit range
>>>> at which the optimizations in that benchmark degraded?
>>>>             Jack
>>>>
>>>> On Sat, Feb 14, 2015 at 11:13 AM, Jack Howarth
>>>> <howarth.mailing.lists at gmail.com> wrote:
>>>>>     The regressions in the performance of generated code, introduced
>>>>> by the llvm 3.6 release, don't seem to be limited to this "8 queens
>>>>> puzzle" solver test case. See...
>>>>>
>>>>> http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1
>>>>>
>>>>> where a big hit in the performance of the Sparse Matrix Multiply test
>>>>> of the SciMark v2.0 benchmark was observed as well as others.
>>>>>     Do you really want to release 3.6 with this level of performance regression?
>>>>>             Jack
>>>>>
>>>>> On Fri, Feb 13, 2015 at 2:47 PM, Jack Howarth
>>>>> <howarth.mailing.lists at gmail.com> wrote:
>>>>>> Also confirmed with the llvm 3.5.1 release and the llvm 3.6 release
>>>>>> branch on x86_64-apple-darwin14...
>>>>>>
>>>>>> % clang-3.5 -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
>>>>>> -fno-exceptions -o 8 8.c
>>>>>> % time ./8 9
>>>>>> 352 solutions
>>>>>> 3.603u 0.002s 0:03.60 100.0% 0+0k 0+0io 2pf+0w
>>>>>> % time ./8 10
>>>>>> 724 solutions
>>>>>> 104.217u 0.059s 1:44.30 99.9% 0+0k 0+0io 2pf+0w
>>>>>>
>>>>>> % clang-3.6 -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
>>>>>> -fno-exceptions -o 8 8.c
>>>>>> % time ./8 9
>>>>>> 352 solutions
>>>>>> 4.050u 0.001s 0:04.05 100.0% 0+0k 0+0io 2pf+0w
>>>>>> % time ./8 10
>>>>>> 724 solutions
>>>>>> 114.808u 0.041s 1:54.86 99.9% 0+0k 0+0io 2pf+0w
>>>>>>
>>>>>> On Fri, Feb 13, 2015 at 3:37 AM, 191919 <191919 at gmail.com> wrote:
>>>>>>> I submitted the problem report to clang's bugzilla but no one seems to
>>>>>>> care so I have to send it to the mailing list.
>>>>>>>
>>>>>>> clang 3.7 svn (trunk 229055 at the time I reported this problem)
>>>>>>> generates slower code than 3.5 (Apple LLVM version 6.0
>>>>>>> (clang-600.0.56) (based on LLVM 3.5svn)) for the following code.
>>>>>>>
>>>>>>> It is an "8 queens puzzle" solver written as an educational example. As
>>>>>>> compiled by both clang 3.5 and 3.7, it gave the correct answer, but
>>>>>>> clang 3.5 generates code which runs 20% faster than 3.6/3.7.
>>>>>>>
>>>>>>> ##########################################
>>>>>>> # clang 3.5 which comes with Xcode 6.1.1
>>>>>>> ##########################################
>>>>>>> $ clang -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
>>>>>>> -fno-exceptions -o 8 8.c
>>>>>>> $ time ./8 9    # 9 queens
>>>>>>> 352 solutions
>>>>>>> ./8 9  1.63s user 0.00s system 99% cpu 1.632 total
>>>>>>> $ time ./8 10   # 10 queens
>>>>>>> 724 solutions
>>>>>>> ./8 10  45.11s user 0.01s system 99% cpu 45.121 total
>>>>>>>
>>>>>>> ##########################################
>>>>>>> # clang 3.7 svn trunk
>>>>>>> ##########################################
>>>>>>> $ /opt/bin/clang -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
>>>>>>> -fno-exceptions -o 8 8.c
>>>>>>> $ time ./8 9    # 9 queens
>>>>>>> 352 solutions
>>>>>>> ./8 9  2.07s user 0.00s system 99% cpu 2.078 total
>>>>>>> $ time ./8 10   # 10 queens
>>>>>>> 724 solutions
>>>>>>> ./8 10  56.63s user 0.02s system 99% cpu 56.650 total
>>>>>>>
>>>>>>> The source code is below. I also attached the executable files, as well
>>>>>>> as the assembly listings for clang 3.5 and 3.6 produced by IDA.
>>>>>>>
>>>>>>> The performance is even worse when compiling as 32-bit code while
>>>>>>> gcc-4.9.2 is not affected.
>>>>>>>
>>>>>>> ########## clang-3.5
>>>>>>> $ clang -m32 -O3 -fomit-frame-pointer -fno-stack-protector
>>>>>>> -fno-exceptions -o 8 8.c
>>>>>>> $ time ./8 9
>>>>>>> 352 solutions
>>>>>>> ./8 9  1.95s user 0.00s system 99% cpu 1.950 total
>>>>>>>
>>>>>>> ########## clang-3.7
>>>>>>> $ /opt/bin/clang -m32 -O3 -fomit-frame-pointer -fno-stack-protector
>>>>>>> -fno-exceptions -o 8 8.c
>>>>>>> $ time ./8 9
>>>>>>> 352 solutions
>>>>>>> ./8 9  2.48s user 0.00s system 99% cpu 2.480 total
>>>>>>>
>>>>>>> ######### gcc-4.9.2
>>>>>>> $ /opt/bin/gcc -m32 -O3 -fomit-frame-pointer -fno-stack-protector
>>>>>>> -fno-exceptions -o 8 8.c
>>>>>>> $ time ./8 9
>>>>>>> 352 solutions
>>>>>>> ./8 9  1.44s user 0.00s system 99% cpu 1.442 total
>>>>>>>
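To put the user times quoted in this thread on a common scale, here is a
minimal C sketch that converts a pair of timings into a percentage
slowdown. The helper name `slowdown_pct` is made up for illustration; the
inputs are assumed to be the "user" seconds reported by `time`.

```c
#include <assert.h>

/* Percentage by which `candidate` seconds exceeds `baseline` seconds.
 * Hypothetical helper for illustration: a positive result means the
 * candidate build is slower, a negative one means it is faster. */
static double slowdown_pct(double baseline, double candidate)
{
    assert(baseline > 0.0);   /* a zero baseline would divide by zero */
    return (candidate / baseline - 1.0) * 100.0;
}
```

With the 10 queens user times from the original report,
`slowdown_pct(45.11, 56.63)` comes out at roughly 25.5, and the
x86_64-apple-darwin14 numbers (104.217 vs 114.808) give roughly 10.2.
Note that "3.5 is 20% faster" and "3.7 is 25% slower" describe the same
pair of timings; the only difference is which build is taken as the
baseline.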
>>>>>>>
>>>>>>> ```
>>>>>>> #include <stdio.h>
>>>>>>> #include <stdlib.h>
>>>>>>>
>>>>>>> static inline int validate(int* a, int d)
>>>>>>> {
>>>>>>>         int i, j, x;
>>>>>>>         for (i = 0; i < d; ++i)
>>>>>>>         {
>>>>>>>                 for (j = i+1, x = 1; j < d; ++j, ++x)
>>>>>>>                 {
>>>>>>>                         const int d = a[i] - a[j];
>>>>>>>                         if (d == 0 || d == -x || d == x) return 0;
>>>>>>>                 }
>>>>>>>         }
>>>>>>>         return 1;
>>>>>>> }
>>>>>>>
>>>>>>> static inline int solve(int d)
>>>>>>> {
>>>>>>>         int r = 0;
>>>>>>>         int* a = (int*) calloc(sizeof(int), d+1);
>>>>>>>         int p = d - 1;
>>>>>>>
>>>>>>>         for (;;)
>>>>>>>         {
>>>>>>>                 a[p]++;
>>>>>>>
>>>>>>>                 if (a[p] > d-1)
>>>>>>>                 {
>>>>>>>                         int bp = p - 1;
>>>>>>>                         while (bp >= 0)
>>>>>>>                         {
>>>>>>>                                 a[bp]++;
>>>>>>>                                 if (a[bp] <= d-1) break;
>>>>>>>                                 a[bp] = 0;
>>>>>>>                                 --bp;
>>>>>>>                         }
>>>>>>>                         if (bp < 0)
>>>>>>>                                 break;
>>>>>>>                         a[p] = 0;
>>>>>>>                 }
>>>>>>>                 if (validate(a, d))
>>>>>>>                 {
>>>>>>>                         ++r;
>>>>>>>                 }
>>>>>>>         }
>>>>>>>
>>>>>>>         free(a);
>>>>>>>         return r;
>>>>>>> }
>>>>>>>
>>>>>>> int main(int argc, char** argv)
>>>>>>> {
>>>>>>>     if (argc != 2) return -1;
>>>>>>>     int r = solve((int) strtol(argv[1], NULL, 10));
>>>>>>>     printf("%d solutions\n", r);
>>>>>>> }
>>>>>>> ```
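For anyone who wants to experiment with the test case, here is a lightly
tidied variant of the solver above, offered only as a sketch. It renames
the inner `d` in `validate` so that it no longer shadows the board-size
parameter, passes `calloc` its arguments in the documented (count, size)
order (the original swaps them, which requests the same number of bytes),
and checks the allocation. The names `n`, `diff` and `count` are not from
the original. This variant has not been checked against either compiler,
so it may or may not show the same slowdown; the source above remains the
reference for the bug report.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if no two queens share a row or a diagonal, else 0.
 * a[i] is the row of the queen placed in column i. */
static int validate(const int *a, int n)
{
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1, x = 1; j < n; ++j, ++x) {
            /* renamed from `d` so it does not shadow the board size */
            const int diff = a[i] - a[j];
            if (diff == 0 || diff == -x || diff == x)
                return 0;
        }
    }
    return 1;
}

/* Brute-force count: steps through the placements like a base-n odometer
 * and validates each one, exactly as the original loop does.
 * Returns -1 if the allocation fails (an addition in this sketch). */
static int solve(int n)
{
    assert(n >= 1);                 /* n == 0 would index a[-1] */
    int count = 0;
    int *a = calloc((size_t)n + 1, sizeof *a);   /* (count, size) order */
    if (a == NULL)
        return -1;
    const int p = n - 1;            /* least significant "digit" */

    for (;;) {
        a[p]++;
        if (a[p] > n - 1) {
            /* carry into the more significant digits */
            int bp = p - 1;
            while (bp >= 0) {
                a[bp]++;
                if (a[bp] <= n - 1)
                    break;
                a[bp] = 0;
                --bp;
            }
            if (bp < 0)             /* carried out of the top: done */
                break;
            a[p] = 0;
        }
        if (validate(a, n))
            ++count;
    }

    free(a);
    return count;
}
```

Since the search itself is unchanged, calling `solve(9)` and `solve(10)`
from a `main` like the original one should report the same 352 and 724
solutions as the runs above.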
>>>>>>>
>>>>>>> clang 3.5's result:
>>>>>>>
>>>>>>> ```
>>>>>>>                 public _main
>>>>>>> _main           proc near
>>>>>>>
>>>>>>> var_48          = qword ptr -48h
>>>>>>> var_40          = qword ptr -40h
>>>>>>> var_34          = dword ptr -34h
>>>>>>>
>>>>>>>                 push    rbp
>>>>>>>                 push    r15
>>>>>>>                 push    r14
>>>>>>>                 push    r13
>>>>>>>                 push    r12
>>>>>>>                 push    rbx
>>>>>>>                 sub     rsp, 18h
>>>>>>>                 mov     ebx, 0FFFFFFFFh
>>>>>>>                 cmp     edi, 2
>>>>>>>                 jnz     loc_100000F29
>>>>>>>                 mov     rdi, [rsi+8]    ; char *
>>>>>>>                 xor     r14d, r14d
>>>>>>>                 xor     esi, esi        ; char **
>>>>>>>                 mov     edx, 0Ah        ; int
>>>>>>>                 call    _strtol
>>>>>>>                 mov     r15, rax
>>>>>>>                 shl     rax, 20h
>>>>>>>                 mov     rsi, offset __mh_execute_header
>>>>>>>                 add     rsi, rax
>>>>>>>                 sar     rsi, 20h        ; size_t
>>>>>>>                 mov     edi, 4          ; size_t
>>>>>>>                 call    _calloc
>>>>>>>                 lea     edx, [r15-1]
>>>>>>>                 movsxd  r8, edx
>>>>>>>                 mov     ecx, r15d
>>>>>>>                 add     ecx, 0FFFFFFFEh
>>>>>>>                 js      loc_100000DFA
>>>>>>>                 test    r15d, r15d
>>>>>>>                 mov     r11d, [rax+r8*4]
>>>>>>>                 jle     loc_100000EAE
>>>>>>>                 mov     ecx, r15d
>>>>>>>                 add     ecx, 0FFFFFFFEh
>>>>>>>                 mov     [rsp+48h+var_34], ecx
>>>>>>>                 movsxd  rcx, ecx
>>>>>>>                 lea     rcx, [rax+rcx*4]
>>>>>>>                 mov     [rsp+48h+var_40], rcx
>>>>>>>                 lea     rcx, [rax+4]
>>>>>>>                 mov     [rsp+48h+var_48], rcx
>>>>>>>                 xor     r14d, r14d
>>>>>>>                 jmp     short loc_100000D33
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 10h
>>>>>>>
>>>>>>> loc_100000D30:                          ; CODE XREF: _main+129 j
>>>>>>>                                         ; _main+131 j ...
>>>>>>>                 add     r14d, ebx
>>>>>>>
>>>>>>> loc_100000D33:                          ; CODE XREF: _main+92 j
>>>>>>>                 cmp     r11d, edx
>>>>>>>                 lea     edi, [r11+1]
>>>>>>>                 mov     [rax+r8*4], edi
>>>>>>>                 mov     rcx, [rsp+48h+var_40]
>>>>>>>                 mov     esi, [rsp+48h+var_34]
>>>>>>>                 mov     r11d, edi
>>>>>>>                 jl      short loc_100000D84
>>>>>>>                 nop     dword ptr [rax+00h]
>>>>>>>
>>>>>>> loc_100000D50:                          ; CODE XREF: _main+DA j
>>>>>>>                 mov     edi, [rcx]
>>>>>>>                 lea     ebp, [rdi+1]
>>>>>>>                 mov     [rcx], ebp
>>>>>>>                 cmp     edi, edx
>>>>>>>                 jl      short loc_100000D71
>>>>>>>                 mov     dword ptr [rcx], 0
>>>>>>>                 add     rcx, 0FFFFFFFFFFFFFFFCh
>>>>>>>                 test    esi, esi
>>>>>>>                 lea     esi, [rsi-1]
>>>>>>>                 jg      short loc_100000D50
>>>>>>>                 jmp     loc_100000F0E
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000D71:                          ; CODE XREF: _main+C9 j
>>>>>>>                 test    esi, esi
>>>>>>>                 js      loc_100000F0E
>>>>>>>                 mov     dword ptr [rax+r8*4], 0
>>>>>>>                 xor     r11d, r11d
>>>>>>>
>>>>>>> loc_100000D84:                          ; CODE XREF: _main+BA j
>>>>>>>                 cmp     r15d, 1
>>>>>>>                 mov     esi, 0
>>>>>>>                 mov     r9, [rsp+48h+var_48]
>>>>>>>                 mov     r12d, 1
>>>>>>>                 jle     short loc_100000DF0
>>>>>>>
>>>>>>> loc_100000D99:                          ; CODE XREF: _main+15E j
>>>>>>>                 mov     r10d, [rax+rsi*4]
>>>>>>>                 mov     ecx, 0FFFFFFFFh
>>>>>>>                 mov     edi, 1
>>>>>>>                 mov     r13, r9
>>>>>>>                 nop     word ptr [rax+rax+00h]
>>>>>>>
>>>>>>> loc_100000DB0:                          ; CODE XREF: _main+14F j
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 mov     ebp, r10d
>>>>>>>                 sub     ebp, [r13+0]
>>>>>>>                 jz      loc_100000D30
>>>>>>>                 cmp     ecx, ebp
>>>>>>>                 jz      loc_100000D30
>>>>>>>                 cmp     edi, ebp
>>>>>>>                 jz      loc_100000D30
>>>>>>>                 add     r13, 4
>>>>>>>                 inc     rdi
>>>>>>>                 dec     ecx
>>>>>>>                 mov     ebx, edi
>>>>>>>                 add     ebx, esi
>>>>>>>                 cmp     ebx, r15d
>>>>>>>                 jl      short loc_100000DB0
>>>>>>>                 inc     r12
>>>>>>>                 add     r9, 4
>>>>>>>                 inc     rsi
>>>>>>>                 cmp     r12d, r15d
>>>>>>>                 jl      short loc_100000D99
>>>>>>>
>>>>>>> loc_100000DF0:                          ; CODE XREF: _main+107 j
>>>>>>>                 mov     ebx, 1
>>>>>>>                 jmp     loc_100000D30
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000DFA:                          ; CODE XREF: _main+5E j
>>>>>>>                 mov     ecx, [rax+r8*4]
>>>>>>>                 lea     r9d, [rcx+1]
>>>>>>>                 mov     [rax+r8*4], r9d
>>>>>>>                 cmp     ecx, r8d
>>>>>>>                 jge     loc_100000F0E
>>>>>>>                 lea     r12, [rax+4]
>>>>>>>                 xor     r14d, r14d
>>>>>>>                 db      2Eh
>>>>>>>                 nop     word ptr [rax+rax+00000000h]
>>>>>>>
>>>>>>> loc_100000E20:                          ; CODE XREF: _main+216 j
>>>>>>>                 test    r15d, r15d
>>>>>>>                 setle   cl
>>>>>>>                 cmp     r15d, 2
>>>>>>>                 jl      short loc_100000E90
>>>>>>>                 test    cl, cl
>>>>>>>                 mov     r13d, 0
>>>>>>>                 mov     r11, r12
>>>>>>>                 mov     r10d, 1
>>>>>>>                 jnz     short loc_100000E90
>>>>>>>
>>>>>>> loc_100000E3F:                          ; CODE XREF: _main+1F0 j
>>>>>>>                 mov     edi, [rax+r13*4]
>>>>>>>                 mov     edx, 0FFFFFFFFh
>>>>>>>                 mov     ecx, 1
>>>>>>>                 mov     rsi, r11
>>>>>>>
>>>>>>> loc_100000E50:                          ; CODE XREF: _main+1E1 j
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 mov     ebp, edi
>>>>>>>                 sub     ebp, [rsi]
>>>>>>>                 jz      short loc_100000E95
>>>>>>>                 cmp     edx, ebp
>>>>>>>                 jz      short loc_100000E95
>>>>>>>                 cmp     ecx, ebp
>>>>>>>                 jz      short loc_100000E95
>>>>>>>                 add     rsi, 4
>>>>>>>                 inc     rcx
>>>>>>>                 dec     edx
>>>>>>>                 mov     ebx, ecx
>>>>>>>                 add     ebx, r13d
>>>>>>>                 cmp     ebx, r15d
>>>>>>>                 jl      short loc_100000E50
>>>>>>>                 inc     r10
>>>>>>>                 add     r11, 4
>>>>>>>                 inc     r13
>>>>>>>                 cmp     r10d, r15d
>>>>>>>                 jl      short loc_100000E3F
>>>>>>>                 db      66h, 66h, 66h, 66h, 2Eh
>>>>>>>                 nop     word ptr [rax+rax+00000000h]
>>>>>>>
>>>>>>> loc_100000E90:                          ; CODE XREF: _main+19A j
>>>>>>>                                         ; _main+1AD j
>>>>>>>                 mov     ebx, 1
>>>>>>>
>>>>>>> loc_100000E95:                          ; CODE XREF: _main+1C6 j
>>>>>>>                                         ; _main+1CA j ...
>>>>>>>                 add     r14d, ebx
>>>>>>>                 cmp     r9d, r8d
>>>>>>>                 lea     ecx, [r9+1]
>>>>>>>                 mov     [rax+r8*4], ecx
>>>>>>>                 mov     r9d, ecx
>>>>>>>                 jl      loc_100000E20
>>>>>>>                 jmp     short loc_100000F0E
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000EAE:                          ; CODE XREF: _main+6B j
>>>>>>>                 add     r15d, 0FFFFFFFEh
>>>>>>>                 movsxd  rcx, r15d
>>>>>>>                 lea     rcx, [rax+rcx*4]
>>>>>>>                 xor     r14d, r14d
>>>>>>>                 jmp     short loc_100000EC6
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 20h
>>>>>>>
>>>>>>> loc_100000EC0:                          ; CODE XREF: _main+247 j
>>>>>>>                                         ; _main+27C j
>>>>>>>                 inc     r14d
>>>>>>>                 mov     r11d, ebp
>>>>>>>
>>>>>>> loc_100000EC6:                          ; CODE XREF: _main+22C j
>>>>>>>                 lea     ebp, [r11+1]
>>>>>>>                 mov     [rax+r8*4], ebp
>>>>>>>                 cmp     r11d, r8d
>>>>>>>                 mov     rsi, rcx
>>>>>>>                 mov     edi, r15d
>>>>>>>                 jl      short loc_100000EC0
>>>>>>>                 nop     dword ptr [rax+00000000h]
>>>>>>>
>>>>>>> loc_100000EE0:                          ; CODE XREF: _main+26A j
>>>>>>>                 mov     ebp, [rsi]
>>>>>>>                 lea     ebx, [rbp+1]
>>>>>>>                 mov     [rsi], ebx
>>>>>>>                 cmp     ebp, edx
>>>>>>>                 jl      short loc_100000EFE
>>>>>>>                 mov     dword ptr [rsi], 0
>>>>>>>                 add     rsi, 0FFFFFFFFFFFFFFFCh
>>>>>>>                 test    edi, edi
>>>>>>>                 lea     edi, [rdi-1]
>>>>>>>                 jg      short loc_100000EE0
>>>>>>>                 jmp     short loc_100000F0E
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000EFE:                          ; CODE XREF: _main+259 j
>>>>>>>                 test    edi, edi
>>>>>>>                 js      short loc_100000F0E
>>>>>>>                 mov     dword ptr [rax+r8*4], 0
>>>>>>>                 xor     ebp, ebp
>>>>>>>                 jmp     short loc_100000EC0
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000F0E:                          ; CODE XREF: _main+DC j
>>>>>>>                                         ; _main+E3 j ...
>>>>>>>                 mov     rdi, rax        ; void *
>>>>>>>                 call    _free
>>>>>>>                 lea     rdi, aDSolutions ; "%d solutions\n"
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 xor     eax, eax
>>>>>>>                 mov     esi, r14d
>>>>>>>                 call    _printf
>>>>>>>
>>>>>>> loc_100000F29:                          ; CODE XREF: _main+16 j
>>>>>>>                 mov     eax, ebx
>>>>>>>                 add     rsp, 18h
>>>>>>>                 pop     rbx
>>>>>>>                 pop     r12
>>>>>>>                 pop     r13
>>>>>>>                 pop     r14
>>>>>>>                 pop     r15
>>>>>>>                 pop     rbp
>>>>>>>                 retn
>>>>>>> _main           endp
>>>>>>> ```
>>>>>>>
>>>>>>> clang 3.6's result:
>>>>>>>
>>>>>>> ```
>>>>>>>                 public _main
>>>>>>> _main           proc near
>>>>>>>
>>>>>>> var_60          = qword ptr -60h
>>>>>>> var_58          = qword ptr -58h
>>>>>>> var_50          = qword ptr -50h
>>>>>>> var_48          = qword ptr -48h
>>>>>>> var_40          = qword ptr -40h
>>>>>>> var_38          = qword ptr -38h
>>>>>>>
>>>>>>>                 push    rbp
>>>>>>>                 push    r15
>>>>>>>                 push    r14
>>>>>>>                 push    r13
>>>>>>>                 push    r12
>>>>>>>                 push    rbx
>>>>>>>                 sub     rsp, 38h
>>>>>>>                 mov     ebx, 0FFFFFFFFh
>>>>>>>                 cmp     edi, 2
>>>>>>>                 jnz     loc_100000F23
>>>>>>>                 mov     rbx, offset __mh_execute_header
>>>>>>>                 mov     rdi, [rsi+8]    ; char *
>>>>>>>                 xor     r13d, r13d
>>>>>>>                 xor     esi, esi        ; char **
>>>>>>>                 mov     edx, 0Ah        ; int
>>>>>>>                 call    _strtol
>>>>>>>                 mov     r14, rax
>>>>>>>                 shl     rax, 20h
>>>>>>>                 mov     [rsp+68h+var_38], rax
>>>>>>>                 lea     rsi, [rax+rbx]
>>>>>>>                 sar     rsi, 20h        ; size_t
>>>>>>>                 mov     edi, 4          ; size_t
>>>>>>>                 call    _calloc
>>>>>>>                 lea     r11d, [r14-1]
>>>>>>>                 movsxd  r12, r11d
>>>>>>>                 mov     [rsp+68h+var_40], r12
>>>>>>>                 movsxd  rcx, r14d
>>>>>>>                 mov     [rsp+68h+var_50], rcx
>>>>>>>                 add     ecx, 0FFFFFFFEh
>>>>>>>                 js      loc_100000E1A
>>>>>>>                 mov     ecx, r14d
>>>>>>>                 add     ecx, 0FFFFFFFEh
>>>>>>>                 movsxd  rcx, ecx
>>>>>>>                 inc     rcx
>>>>>>>                 mov     [rsp+68h+var_58], rcx
>>>>>>>                 mov     rcx, rax
>>>>>>>                 add     rcx, 4
>>>>>>>                 mov     [rsp+68h+var_60], rcx
>>>>>>>                 xor     ebp, ebp
>>>>>>>                 jmp     short loc_100000D17
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 10h
>>>>>>>
>>>>>>> loc_100000D10:                          ; CODE XREF: _main+15B j
>>>>>>>                                         ; _main+163 j ...
>>>>>>>                 mov     rbp, [rsp+68h+var_48]
>>>>>>>                 add     ebp, edi
>>>>>>>
>>>>>>> loc_100000D17:                          ; CODE XREF: _main+93 j
>>>>>>>                 cmp     r13d, r11d
>>>>>>>                 lea     edx, [r13+1]
>>>>>>>                 mov     [rax+r12*4], edx
>>>>>>>                 mov     rcx, [rsp+68h+var_58]
>>>>>>>                 mov     r13d, edx
>>>>>>>                 jl      short loc_100000D6B
>>>>>>>                 nop     dword ptr [rax+00h]
>>>>>>>
>>>>>>> loc_100000D30:                          ; CODE XREF: _main+DE j
>>>>>>>                 mov     edx, [rax+rcx*4-4]
>>>>>>>                 lea     esi, [rdx+1]
>>>>>>>                 mov     [rax+rcx*4-4], esi
>>>>>>>                 cmp     edx, r11d
>>>>>>>                 jl      short loc_100000D60
>>>>>>>                 mov     dword ptr [rax+rcx*4-4], 0
>>>>>>>                 dec     rcx
>>>>>>>                 test    rcx, rcx
>>>>>>>                 jg      short loc_100000D30
>>>>>>>                 jmp     loc_100000F09
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 20h
>>>>>>>
>>>>>>> loc_100000D60:                          ; CODE XREF: _main+CE j
>>>>>>>                 mov     dword ptr [rax+r12*4], 0
>>>>>>>                 xor     r13d, r13d
>>>>>>>
>>>>>>> loc_100000D6B:                          ; CODE XREF: _main+BA j
>>>>>>>                 mov     [rsp+68h+var_48], rbp
>>>>>>>                 test    r14d, r14d
>>>>>>>                 setle   cl
>>>>>>>                 mov     rdx, offset __mh_execute_header
>>>>>>>                 lea     rdx, [rdx+1]
>>>>>>>                 cmp     [rsp+68h+var_38], rdx
>>>>>>>                 jl      loc_100000E10
>>>>>>>                 test    cl, cl
>>>>>>>                 mov     edx, 0
>>>>>>>                 mov     r10, [rsp+68h+var_60]
>>>>>>>                 mov     r9d, 1
>>>>>>>                 jnz     short loc_100000E10
>>>>>>>
>>>>>>> loc_100000DA3:                          ; CODE XREF: _main+195 j
>>>>>>>                 mov     esi, [rax+rdx*4]
>>>>>>>                 mov     r15d, 0FFFFFFFFh
>>>>>>>                 mov     r8d, 1
>>>>>>>                 mov     rcx, r10
>>>>>>>                 db      66h, 66h, 2Eh
>>>>>>>                 nop     dword ptr [rax+rax+00000000h]
>>>>>>>
>>>>>>> loc_100000DC0:                          ; CODE XREF: _main+184 j
>>>>>>>                 mov     ebx, [rcx]
>>>>>>>                 mov     ebp, esi
>>>>>>>                 sub     ebp, ebx
>>>>>>>                 xor     edi, edi
>>>>>>>                 cmp     r8d, ebp
>>>>>>>                 jz      loc_100000D10
>>>>>>>                 cmp     esi, ebx
>>>>>>>                 jz      loc_100000D10
>>>>>>>                 cmp     r15d, ebp
>>>>>>>                 jz      loc_100000D10
>>>>>>>                 add     rcx, 4
>>>>>>>                 inc     r8
>>>>>>>                 dec     r15d
>>>>>>>                 mov     edi, r8d
>>>>>>>                 add     edi, edx
>>>>>>>                 cmp     edi, r14d
>>>>>>>                 jl      short loc_100000DC0
>>>>>>>                 inc     r9
>>>>>>>                 add     r10, 4
>>>>>>>                 inc     rdx
>>>>>>>                 cmp     r9, [rsp+68h+var_50]
>>>>>>>                 jl      short loc_100000DA3
>>>>>>>                 nop     word ptr [rax+rax+00000000h]
>>>>>>>
>>>>>>> loc_100000E10:                          ; CODE XREF: _main+119 j
>>>>>>>                                         ; _main+131 j
>>>>>>>                 mov     edi, 1
>>>>>>>                 jmp     loc_100000D10
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000E1A:                          ; CODE XREF: _main+6E j
>>>>>>>                 test    r14d, r14d
>>>>>>>                 jle     loc_100000F00
>>>>>>>                 mov     dword ptr [rax+r12*4], 1
>>>>>>>                 xor     ebp, ebp
>>>>>>>                 cmp     r14d, 2
>>>>>>>                 jl      loc_100000F09
>>>>>>>                 mov     rcx, rax
>>>>>>>                 add     rcx, 4
>>>>>>>                 mov     [rsp+68h+var_48], rcx
>>>>>>>                 xor     ebp, ebp
>>>>>>>                 mov     r15d, 1
>>>>>>>                 nop     dword ptr [rax+rax+00h]
>>>>>>>
>>>>>>> loc_100000E50:                          ; CODE XREF: _main+288 j
>>>>>>>                 mov     rbx, rbp
>>>>>>>                 mov     rcx, offset __mh_execute_header
>>>>>>>                 cmp     [rsp+68h+var_38], rcx
>>>>>>>                 mov     edx, 0
>>>>>>>                 mov     r13, [rsp+68h+var_48]
>>>>>>>                 mov     r8d, 1
>>>>>>>                 mov     r9d, 1
>>>>>>>                 jle     short loc_100000EE0
>>>>>>>
>>>>>>> loc_100000E7A:                          ; CODE XREF: _main+25A j
>>>>>>>                 mov     r12d, [rax+rdx*4]
>>>>>>>                 mov     edi, 0FFFFFFFFh
>>>>>>>                 mov     ecx, 1
>>>>>>>                 mov     rsi, r13
>>>>>>>                 nop     dword ptr [rax+rax+00h]
>>>>>>>
>>>>>>> loc_100000E90:                          ; CODE XREF: _main+249 j
>>>>>>>                 mov     r10d, [rsi]
>>>>>>>                 mov     ebp, r12d
>>>>>>>                 sub     ebp, r10d
>>>>>>>                 xor     r9d, r9d
>>>>>>>                 cmp     ecx, ebp
>>>>>>>                 jz      short loc_100000EE0
>>>>>>>                 cmp     r12d, r10d
>>>>>>>                 jz      short loc_100000EE0
>>>>>>>                 cmp     edi, ebp
>>>>>>>                 jz      short loc_100000EE0
>>>>>>>                 add     rsi, 4
>>>>>>>                 inc     rcx
>>>>>>>                 dec     edi
>>>>>>>                 mov     ebp, ecx
>>>>>>>                 add     ebp, edx
>>>>>>>                 cmp     ebp, r14d
>>>>>>>                 jl      short loc_100000E90
>>>>>>>                 inc     r8
>>>>>>>                 add     r13, 4
>>>>>>>                 inc     rdx
>>>>>>>                 cmp     r8, [rsp+68h+var_50]
>>>>>>>                 jl      short loc_100000E7A
>>>>>>>                 mov     r9d, 1
>>>>>>>                 db      66h, 66h, 66h, 66h, 2Eh
>>>>>>>                 nop     word ptr [rax+rax+00000000h]
>>>>>>>
>>>>>>> loc_100000EE0:                          ; CODE XREF: _main+208 j
>>>>>>>                                         ; _main+22E j ...
>>>>>>>                 mov     rbp, rbx
>>>>>>>                 add     ebp, r9d
>>>>>>>                 cmp     r15d, r11d
>>>>>>>                 lea     ecx, [r15+1]
>>>>>>>                 mov     rdx, [rsp+68h+var_40]
>>>>>>>                 mov     [rax+rdx*4], ecx
>>>>>>>                 mov     r15d, ecx
>>>>>>>                 jl      loc_100000E50
>>>>>>>                 jmp     short loc_100000F09
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000F00:                          ; CODE XREF: _main+1AD j
>>>>>>>                 xor     ebp, ebp
>>>>>>>                 test    r11d, r11d
>>>>>>>                 cmovns  ebp, r11d
>>>>>>>
>>>>>>> loc_100000F09:                          ; CODE XREF: _main+E0 j
>>>>>>>                                         ; _main+1C1 j ...
>>>>>>>                 mov     rdi, rax        ; void *
>>>>>>>                 call    _free
>>>>>>>                 lea     rdi, aDSolutions ; "%d solutions\n"
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 xor     eax, eax
>>>>>>>                 mov     esi, ebp
>>>>>>>                 call    _printf
>>>>>>>
>>>>>>> loc_100000F23:                          ; CODE XREF: _main+16 j
>>>>>>>                 mov     eax, ebx
>>>>>>>                 add     rsp, 38h
>>>>>>>                 pop     rbx
>>>>>>>                 pop     r12
>>>>>>>                 pop     r13
>>>>>>>                 pop     r14
>>>>>>>                 pop     r15
>>>>>>>                 pop     rbp
>>>>>>>                 retn
>>>>>>> _main           endp
>>>>>>> ```
>>>>>>>
>>>>>>> gcc-4.9.2's result:
>>>>>>> ```
>>>>>>>
>>>>>>> _main           proc near
>>>>>>>
>>>>>>> var_48          = qword ptr -48h
>>>>>>> var_40          = dword ptr -40h
>>>>>>> var_3C          = dword ptr -3Ch
>>>>>>>
>>>>>>>                 cmp     edi, 2
>>>>>>>                 jz      short loc_100000D69
>>>>>>>                 or      eax, 0FFFFFFFFh
>>>>>>>                 retn
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000D69:                          ; CODE XREF: _main+3 j
>>>>>>>                 push    r15
>>>>>>>                 mov     edx, 0Ah        ; int
>>>>>>>                 push    r14
>>>>>>>                 push    r13
>>>>>>>                 push    r12
>>>>>>>                 push    rbp
>>>>>>>                 push    rbx
>>>>>>>                 sub     rsp, 18h
>>>>>>>                 mov     rdi, [rsi+8]    ; char *
>>>>>>>                 xor     esi, esi        ; char **
>>>>>>>                 call    _strtol
>>>>>>>                 mov     edi, 4          ; size_t
>>>>>>>                 lea     esi, [rax+1]
>>>>>>>                 mov     r14, rax
>>>>>>>                 mov     ebx, eax
>>>>>>>                 lea     r15d, [r14-2]
>>>>>>>                 movsxd  rsi, esi        ; size_t
>>>>>>>                 call    _calloc
>>>>>>>                 mov     [rsp+48h+var_3C], 0
>>>>>>>                 mov     rdi, rax        ; void *
>>>>>>>                 lea     eax, [r14-1]
>>>>>>>                 cdqe
>>>>>>>                 lea     r13, [rdi+rax*4]
>>>>>>>                 movsxd  rax, r15d
>>>>>>>                 mov     ebp, [r13+0]
>>>>>>>                 shl     rax, 2
>>>>>>>                 lea     r12, [rdi+rax]
>>>>>>>                 lea     rax, [rdi+rax-4]
>>>>>>>                 mov     [rsp+48h+var_48], rax
>>>>>>>                 mov     eax, r14d
>>>>>>>                 lea     r14d, [r14+1]
>>>>>>>                 nop     word ptr [rax+rax+00h]
>>>>>>>                 nop     word ptr [rax+rax+00h]
>>>>>>>
>>>>>>> loc_100000DE0:                          ; CODE XREF: _main+12B j
>>>>>>>                                         ; _main+155 j ...
>>>>>>>                 add     ebp, 1
>>>>>>>                 cmp     ebx, ebp
>>>>>>>                 mov     [r13+0], ebp
>>>>>>>                 jg      short loc_100000E62
>>>>>>>                 test    r15d, r15d
>>>>>>>                 js      short loc_100000E33
>>>>>>>                 mov     ecx, [r12]
>>>>>>>                 lea     edx, [rcx+1]
>>>>>>>                 cmp     ebx, edx
>>>>>>>                 mov     [r12], edx
>>>>>>>                 jg      short loc_100000E58
>>>>>>>                 mov     r8, r12
>>>>>>>                 mov     rcx, [rsp+48h+var_48]
>>>>>>>                 mov     esi, r15d
>>>>>>>                 jmp     short loc_100000E24
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 10h
>>>>>>>
>>>>>>> loc_100000E10:                          ; CODE XREF: _main+D1 j
>>>>>>>                 mov     edx, [rcx]
>>>>>>>                 sub     r8, 4
>>>>>>>                 sub     rcx, 4
>>>>>>>                 add     edx, 1
>>>>>>>                 mov     [rcx+4], edx
>>>>>>>                 cmp     ebx, edx
>>>>>>>                 jg      short loc_100000E58
>>>>>>>
>>>>>>> loc_100000E24:                          ; CODE XREF: _main+A9 j
>>>>>>>                 sub     esi, 1
>>>>>>>                 mov     dword ptr [r8], 0
>>>>>>>                 cmp     esi, 0FFFFFFFFh
>>>>>>>                 jnz     short loc_100000E10
>>>>>>>
>>>>>>> loc_100000E33:                          ; CODE XREF: _main+8E j
>>>>>>>                 call    _free
>>>>>>>                 mov     esi, [rsp+48h+var_3C]
>>>>>>>                 add     rsp, 18h
>>>>>>>                 xor     eax, eax
>>>>>>>                 pop     rbx
>>>>>>>                 lea     rdi, aDSolutions ; "%d solutions\n"
>>>>>>>                 pop     rbp
>>>>>>>                 pop     r12
>>>>>>>                 pop     r13
>>>>>>>                 pop     r14
>>>>>>>                 pop     r15
>>>>>>>                 jmp     _printf
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000E58:                          ; CODE XREF: _main+9D j
>>>>>>>                                         ; _main+C2 j
>>>>>>>                 mov     dword ptr [r13+0], 0
>>>>>>>                 xor     ebp, ebp
>>>>>>>
>>>>>>> loc_100000E62:                          ; CODE XREF: _main+89 j
>>>>>>>                 test    ebx, ebx
>>>>>>>                 jle     loc_100000EE6
>>>>>>>                 lea     r11, [rdi+8]
>>>>>>>                 xor     r10d, r10d
>>>>>>>
>>>>>>> loc_100000E71:                          ; CODE XREF: _main+184 j
>>>>>>>                 add     r10d, 1
>>>>>>>                 cmp     r10d, eax
>>>>>>>                 jz      short loc_100000EE6
>>>>>>>                 mov     r8d, [r11-8]
>>>>>>>                 mov     edx, r8d
>>>>>>>                 sub     edx, [r11-4]
>>>>>>>                 add     edx, 1
>>>>>>>                 cmp     edx, 2
>>>>>>>                 jbe     loc_100000DE0
>>>>>>>                 mov     r9d, r14d
>>>>>>>                 mov     rcx, r11
>>>>>>>                 mov     edx, 1
>>>>>>>                 mov     [rsp+48h+var_40], r10d
>>>>>>>                 sub     r9d, r10d
>>>>>>>                 jmp     short loc_100000ED3
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>                 align 10h
>>>>>>>
>>>>>>> loc_100000EB0:                          ; CODE XREF: _main+179 j
>>>>>>>                 mov     esi, r8d
>>>>>>>                 sub     esi, [rcx]
>>>>>>>                 jz      loc_100000DE0
>>>>>>>                 mov     r10d, esi
>>>>>>>                 add     rcx, 4
>>>>>>>                 add     r10d, edx
>>>>>>>                 jz      loc_100000DE0
>>>>>>>                 cmp     esi, edx
>>>>>>>                 jz      loc_100000DE0
>>>>>>>
>>>>>>> loc_100000ED3:                          ; CODE XREF: _main+144 j
>>>>>>>                 add     edx, 1
>>>>>>>                 cmp     edx, r9d
>>>>>>>                 jnz     short loc_100000EB0
>>>>>>>                 mov     r10d, [rsp+48h+var_40]
>>>>>>>                 add     r11, 4
>>>>>>>                 jmp     short loc_100000E71
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_100000EE6:                          ; CODE XREF: _main+104 j
>>>>>>>                                         ; _main+118 j
>>>>>>>                 add     [rsp+48h+var_3C], 1
>>>>>>>                 jmp     loc_100000DE0
>>>>>>> _main           endp
>>>>>>> ```
>>>>>>>
>>>>>>> MSVC 10.0's result:
>>>>>>>
>>>>>>> ```
>>>>>>>
>>>>>>> _main           proc near               ; CODE XREF: ___tmainCRTStartup+106 p
>>>>>>>
>>>>>>> var_80          = dword ptr -80h
>>>>>>> var_7C          = dword ptr -7Ch
>>>>>>> var_78          = dword ptr -78h
>>>>>>> var_74          = dword ptr -74h
>>>>>>> var_70          = dword ptr -70h
>>>>>>> var_6C          = dword ptr -6Ch
>>>>>>> var_68          = dword ptr -68h
>>>>>>> var_64          = dword ptr -64h
>>>>>>> var_60          = dword ptr -60h
>>>>>>> var_5C          = dword ptr -5Ch
>>>>>>> argc            = dword ptr  8
>>>>>>> argv            = dword ptr  0Ch
>>>>>>> envp            = dword ptr  10h
>>>>>>>
>>>>>>>                 push    ebp
>>>>>>>                 mov     ebp, esp
>>>>>>>                 and     esp, 0FFFFFF80h
>>>>>>>                 push    esi
>>>>>>>                 push    edi
>>>>>>>                 push    ebx
>>>>>>>                 sub     esp, 74h
>>>>>>>                 push    3
>>>>>>>                 call    sub_4080F0
>>>>>>>                 add     esp, 4
>>>>>>>                 stmxcsr [esp+80h+var_80]
>>>>>>>                 or      [esp+80h+var_80], 8000h
>>>>>>>                 ldmxcsr [esp+80h+var_80]
>>>>>>>                 cmp     [ebp+argc], 2
>>>>>>>                 jz      short loc_40103A
>>>>>>>                 mov     eax, 0FFFFFFFFh
>>>>>>>                 add     esp, 74h
>>>>>>>                 pop     ebx
>>>>>>>                 pop     edi
>>>>>>>                 pop     esi
>>>>>>>                 mov     esp, ebp
>>>>>>>                 pop     ebp
>>>>>>>                 retn
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_40103A:                             ; CODE XREF: _main+29 j
>>>>>>>                 call    ds:GetTickCount
>>>>>>>                 mov     esi, eax
>>>>>>>                 mov     eax, [ebp+argv]
>>>>>>>                 push    dword ptr [eax+4] ; char *
>>>>>>>                 call    _atoi
>>>>>>>                 mov     edi, eax
>>>>>>>                 lea     eax, [edi+1]
>>>>>>>                 push    eax             ; size_t
>>>>>>>                 push    4               ; size_t
>>>>>>>                 call    _calloc
>>>>>>>                 add     esp, 0Ch
>>>>>>>                 mov     ecx, [eax+edi*4-4]
>>>>>>>                 lea     edx, [edi-1]
>>>>>>>                 mov     [esp+80h+var_6C], ecx
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 mov     [esp+80h+var_7C], ebx
>>>>>>>                 lea     ecx, [eax+edi*4]
>>>>>>>                 mov     [esp+80h+var_74], ecx
>>>>>>>                 lea     ecx, [edi-2]
>>>>>>>                 mov     [esp+80h+var_70], ecx
>>>>>>>                 mov     [esp+80h+var_60], edx
>>>>>>>                 mov     [esp+80h+var_80], esi
>>>>>>>                 mov     ecx, [esp+80h+var_6C]
>>>>>>>
>>>>>>> loc_401087:                             ; CODE XREF: _main+142 j
>>>>>>>                                         ; _main+193 j
>>>>>>>                 mov     edx, [esp+80h+var_60]
>>>>>>>                 inc     ecx
>>>>>>>                 mov     [eax+edi*4-4], ecx
>>>>>>>                 cmp     edi, [eax+edx*4]
>>>>>>>                 jg      short loc_4010DC
>>>>>>>                 mov     esi, [esp+80h+var_70]
>>>>>>>                 test    esi, esi
>>>>>>>                 js      short loc_4010CE
>>>>>>>                 xor     edx, edx
>>>>>>>                 mov     [esp+80h+var_78], eax
>>>>>>>                 xor     ebx, ebx
>>>>>>>                 mov     eax, [esp+80h+var_74]
>>>>>>>
>>>>>>> loc_4010A9:                             ; CODE XREF: _main+C8 j
>>>>>>>                 mov     ecx, [eax+ebx*4-8]
>>>>>>>                 inc     ecx
>>>>>>>                 cmp     ecx, edi
>>>>>>>                 jl      loc_40117A
>>>>>>>                 inc     edx
>>>>>>>                 lea     esi, [ebx+edi-3]
>>>>>>>                 mov     dword ptr [eax+ebx*4-8], 0
>>>>>>>                 dec     ebx
>>>>>>>                 cmp     edx, [esp+80h+var_60]
>>>>>>>                 jb      short loc_4010A9
>>>>>>>                 mov     eax, [esp+80h+var_78]
>>>>>>>
>>>>>>> loc_4010CE:                             ; CODE XREF: _main+9B j
>>>>>>>                                         ; _main+186 j
>>>>>>>                 test    esi, esi
>>>>>>>                 jl      short loc_401147
>>>>>>>                 mov     dword ptr [eax+edi*4-4], 0
>>>>>>>                 xor     ecx, ecx
>>>>>>>
>>>>>>> loc_4010DC:                             ; CODE XREF: _main+93 j
>>>>>>>                 test    edi, edi
>>>>>>>                 jle     short loc_40113E
>>>>>>>                 mov     [esp+80h+var_6C], ecx
>>>>>>>                 xor     edx, edx
>>>>>>>                 mov     [esp+80h+var_5C], edi
>>>>>>>
>>>>>>> loc_4010EA:                             ; CODE XREF: _main+132 j
>>>>>>>                 lea     ecx, [edx+1]
>>>>>>>                 mov     ebx, ecx
>>>>>>>                 mov     esi, ebx
>>>>>>>                 cmp     ecx, [esp+80h+var_5C]
>>>>>>>                 jge     short loc_401130
>>>>>>>                 mov     edx, [eax+edx*4]
>>>>>>>                 mov     edi, 1
>>>>>>>                 mov     [esp+80h+var_64], esi
>>>>>>>                 mov     [esp+80h+var_68], ecx
>>>>>>>
>>>>>>> loc_401107:                             ; CODE XREF: _main+122 j
>>>>>>>                 mov     esi, [eax+ebx*4]
>>>>>>>                 cmp     edx, esi
>>>>>>>                 jz      short loc_40118B
>>>>>>>                 sub     esi, edx
>>>>>>>                 mov     ecx, esi
>>>>>>>                 neg     ecx
>>>>>>>                 cmp     edi, ecx
>>>>>>>                 jz      short loc_40118B
>>>>>>>                 cmp     esi, edi
>>>>>>>                 jz      short loc_40118B
>>>>>>>                 inc     ebx
>>>>>>>                 inc     edi
>>>>>>>                 cmp     ebx, [esp+80h+var_5C]
>>>>>>>                 jl      short loc_401107
>>>>>>>                 mov     ecx, [esp+80h+var_68]
>>>>>>>                 mov     esi, [esp+80h+var_64]
>>>>>>>                 cmp     ecx, [esp+80h+var_5C]
>>>>>>>
>>>>>>> loc_401130:                             ; CODE XREF: _main+F5 j
>>>>>>>                 mov     edx, esi
>>>>>>>                 jl      short loc_4010EA
>>>>>>>                 xchg    ax, ax
>>>>>>>                 mov     ecx, [esp+80h+var_6C]
>>>>>>>                 mov     edi, [esp+80h+var_5C]
>>>>>>>
>>>>>>> loc_40113E:                             ; CODE XREF: _main+DE j
>>>>>>>                 inc     [esp+80h+var_7C]
>>>>>>>                 jmp     loc_401087
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_401147:                             ; CODE XREF: _main+D0 j
>>>>>>>                 mov     ebx, [esp+80h+var_7C]
>>>>>>>                 mov     esi, [esp+80h+var_80]
>>>>>>>                 push    eax             ; void *
>>>>>>>                 call    _free
>>>>>>>                 add     esp, 4
>>>>>>>                 call    ds:GetTickCount
>>>>>>>                 sub     eax, esi
>>>>>>>                 push    eax
>>>>>>>                 push    ebx
>>>>>>>                 push    offset aDSolutionsInDM ; "%d solutions in %d msecs.\n"
>>>>>>>                 call    _printf
>>>>>>>                 xor     eax, eax
>>>>>>>                 add     esp, 80h
>>>>>>>                 pop     ebx
>>>>>>>                 pop     edi
>>>>>>>                 pop     esi
>>>>>>>                 mov     esp, ebp
>>>>>>>                 pop     ebp
>>>>>>>                 retn
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_40117A:                             ; CODE XREF: _main+B0 j
>>>>>>>                 mov     edx, [esp+80h+var_74]
>>>>>>>                 mov     eax, [esp+80h+var_78]
>>>>>>>                 mov     [edx+ebx*4-8], ecx
>>>>>>>                 jmp     loc_4010CE
>>>>>>> ; ---------------------------------------------------------------------------
>>>>>>>
>>>>>>> loc_40118B:                             ; CODE XREF: _main+10C j
>>>>>>>                                         ; _main+116 j ...
>>>>>>>                 mov     ecx, [esp+80h+var_6C]
>>>>>>>                 mov     edi, [esp+80h+var_5C]
>>>>>>>                 jmp     loc_401087
>>>>>>> _main           endp
>>>>>>> ```


