[llvm-bugs] [Bug 30950] New: CFG simplify sinking causes regression for a small testcase

Tue Nov 8 11:51:14 PST 2016

https://llvm.org/bugs/show_bug.cgi?id=30950

            Bug ID: 30950
           Summary: CFG simplify sinking causes regression for a small
                    testcase
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Transformation Utilities
          Assignee: unassignedbugs at nondot.org
          Reporter: wmi at google.com
                CC: james.molloy at arm.com, llvm-bugs at lists.llvm.org
    Classification: Unclassified

------------------------------------------------------------------------
r282453 | rsmith | 2016-09-26 16:49:47 -0700 (Mon, 26 Sep 2016) | 4 lines

P0145R3 (C++17 evaluation order tweaks): consistently emit the LHS of array
subscripting before the RHS, regardless of which is the base and which is the
index.

------------------------------------------------------------------------ 

--------------------- 1.c -----------------------
#define MAX(x, y)   (((x) > (y)) ? (x) : (y))
long foo(long *maxarray, long k, long size) {
  return MAX(maxarray[k], maxarray[k+size+1]);
}
-------------------------------------------------

For the testcase 1.c above, with and without r282453 llvm generates different
code.

code generated by r282452:
        .cfi_startproc
# BB#0:                                 # %entry
        movq    (%rdi,%rsi,8), %rcx
        addq    %rsi, %rdx
        movq    8(%rdi,%rdx,8), %rax
        cmpq    %rax, %rcx          
        cmovgeq %rcx, %rax  =====> select from the vals of two array accesses.
        retq
.Lfunc_end0:
        .size   foo, .Lfunc_end0-foo
        .cfi_endproc

code generated by r282453:
        .cfi_startproc
# BB#0:                                 # %entry
        movq    (%rdi,%rsi,8), %rax
        leaq    (%rsi,%rdx), %rcx
        leaq    1(%rsi,%rdx), %rdx
        cmpq    8(%rdi,%rcx,8), %rax
        cmovgq  %rsi, %rdx ====> select from the indexes of two array accesses.
        movq    (%rdi,%rdx,8), %rax  
        retq
.Lfunc_end0:
        .size   foo, .Lfunc_end0-foo
        .cfi_endproc

However, the cause of the change above is CFG simplify sinking. The algorithm
to decide InstructionsToSink in CFG simplify sinking depends on the instruction
sequence, i.e. during the reverse traversal of BB, once an unsinkable
instruction is seen, the traversal will be stopped and remaining sinkable
instruction will not be added into InstructionsToSink. That is why when layout
is changed by r282453, CFG simplify sinking may generate different code. 

How the different sinking causes the final code to be different:

Without r282453, CFG simplify sinking generates a select which will select from
the addresses of two array accesses: %arrayidx and %arrayidx2.

define i64 @foo(i64* %maxarray, i64 %k, i64 %size) local_unnamed_addr #0 {
entry:
  %arrayidx = getelementptr inbounds i64, i64* %maxarray, i64 %k
  %0 = load i64, i64* %arrayidx, align 8, !tbaa !1
  %add = add nsw i64 %k, %size
  %add1 = add nsw i64 %add, 1
  %arrayidx2 = getelementptr inbounds i64, i64* %maxarray, i64 %add1
  %1 = load i64, i64* %arrayidx2, align 8, !tbaa !1
  %cmp = icmp sgt i64 %0, %1
  %arrayidx.arrayidx2 = select i1 %cmp, i64* %arrayidx, i64* %arrayidx2
  %2 = load i64, i64* %arrayidx.arrayidx2, align 8, !tbaa !1
  ret i64 %2
}

For the IR above, Instcombine can change load(select %arrayidx, %arrayidx2) to
select(%0, %1) and we can remove the last load.

With r282453, CFG simplify sinking generates a select which will select from
the indexes of two array accesses: %k, %add1.

define i64 @foo(i64* %maxarray, i64 %k, i64 %size) local_unnamed_addr #0 {
entry:
  %arrayidx = getelementptr inbounds i64, i64* %maxarray, i64 %k
  %0 = load i64, i64* %arrayidx, align 8, !tbaa !1
  %add = add nsw i64 %k, %size
  %add1 = add nsw i64 %add, 1
  %arrayidx2 = getelementptr inbounds i64, i64* %maxarray, i64 %add1
  %1 = load i64, i64* %arrayidx2, align 8, !tbaa !1
  %cmp = icmp sgt i64 %0, %1
  %k.add1 = select i1 %cmp, i64 %k, i64 %add1
  %arrayidx6 = getelementptr inbounds i64, i64* %maxarray, i64 %k.add1
  %2 = load i64, i64* %arrayidx6, align 8, !tbaa !1
  ret i64 %2
}

For the IR above, Instcombine can not do the load(select) => select(load)
transformation because the select may be deeply embedded inside of an array
expr like load(...select...), so we cannot eliminate the last load.

In summary, two things to be addressed here:
1. It is better for cfg simplify sinking to generate consistent code for
unrelated changes.
2. Either limit cfg simplify sinking to stop sinking at load address, or extend
instcombine to handle the case load(...select(index1, index2)...) and transform
it to select(load1, load2).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20161108/23eaa061/attachment.html>