<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Possible bug in lowering of conditional branch"
   href="https://bugs.llvm.org/show_bug.cgi?id=35662">35662</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Possible bug in lowering of conditional branch
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: SystemZ
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>paulsson@linux.vnet.ibm.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=19551" name="attach_19551" title="reduced testcase">attachment 19551</a> <a href="attachment.cgi?id=19551&action=edit" title="reduced testcase">[details]</a></span>
reduced testcase

Csmith generated a program which, after reduction, prints the same checksum
when built with gcc -O0/-O1 and with clang, except when -mllvm
-disable-basicaa is passed to clang.

The difference disappeared when I disabled the dag-combiner, so I reduced the
program further with bugpoint, for as long as the dag-combiner still produced
a different checksum.

In this final minimal .ll program (attached), I found that the wrong checksum
is related to one DAG:

Initial selection DAG: %bb.12 'main:'
SelectionDAG has 36 nodes:
  t0: ch = EntryToken
  t3: i64 = Constant<0>
    t5: ch = store<ST4[@f](tbaa=<0x16dd5768>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
  t7: i32,ch = load<LD4[@h](tbaa=<0x16dd5768>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @h> 0, undef:i64
    t9: ch = store<ST4[@e](tbaa=<0x16dd5768>)> t7:1, t7, GlobalAddress:i64<i32*
@e> 0, undef:i64
  t11: i32,ch = load<LD4[@i](tbaa=<0x16dd5768>)(dereferenceable)> t9,
GlobalAddress:i64<i32* @i> 0, undef:i64
      t14: i1 = setcc t11, Constant:i32<1>, setlt:ch
    t15: i32 = zero_extend t14
  t17: ch = store<ST4[@d](tbaa=<0x16dd5768>)> t11:1, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
  t23: ch = ValueType:i32
        t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
      t31: ch = TokenFactor t28, t17
          t22: i32,ch = CopyFromReg t0, Register:i32 %0
            t18: i32,ch = load<LD4[@f](tbaa=<0x16dd5768>)(dereferenceable)>
t17, GlobalAddress:i64<i32* @f> 0, undef:i64
          t20: i32 = sdiv t18, Constant:i32<64>
        t25: i1 = setcc t22, t20, setult:ch
      t30: i1 = xor t25, Constant:i1<-1>
    t33: ch = brcond t31, t30, BasicBlock:ch< 0x16ec2e40>
  t35: ch = br t33, BasicBlock:ch< 0x16ec2d80>

One important observation is that in the good program, the jump (t33) across
the block that loads the immediate 8 (the wrong output) is always taken.

In a *correct* compilation, the DAG gets transformed into:

Combining: t20: i32 = sdiv t18, Constant:i32<64>
 ... into: t42: i32 = sra t40, Constant:i64<6>

Replacing.1 t18: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)>
t17, GlobalAddress:i64<i32* @f> 0, undef:i64
With: t43: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @f> 0, undef:i64
 and 1 other values

Replacing.1 t43: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @f> 0, undef:i64
With: t1: i32 = Constant<8>
 and 1 other values

Combining: t37: i32 = sra Constant:i32<8>, Constant:i64<31>
 ... into: t26: i32 = Constant<0>

Combining: t20: i32 = sdiv t18, Constant:i32<64>
 ... into: t42: i32 = sra t40, Constant:i64<6>
Combining: t42: i32 = sra Constant:i32<8>, Constant:i64<6>
 ... into: t26: i32 = Constant<0>

Optimized lowered selection DAG: %bb.12 'main:'
SelectionDAG has 26 nodes:
  t0: ch = EntryToken
  t50: i32,ch = load<LD4[@i](tbaa=<0x464825f8>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @i> 0, undef:i64
  t55: i32,ch = load<LD4[@h](tbaa=<0x464825f8>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @h> 0, undef:i64
        t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
            t14: i1 = setcc t50, Constant:i32<1>, setlt:ch
          t15: i32 = zero_extend t14
        t46: ch = store<ST4[@d](tbaa=<0x464825f8>)> t0, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
        t48: ch = store<ST4[@e](tbaa=<0x464825f8>)> t0, t55,
GlobalAddress:i64<i32* @e> 0, undef:i64
        t5: ch = store<ST4[@f](tbaa=<0x464825f8>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
      t57: ch = TokenFactor t28, t46, t50:1, t48, t5, t55:1
    t33: ch = brcond t57, Constant:i1<-1>, BasicBlock:ch< 0x4656eda0>
  t35: ch = br t33, BasicBlock:ch< 0x4656ece0>

After constant propagation, the t33 brcond condition is always true.
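
As a sanity check of that folding, here is a minimal Python sketch (not taken
from LLVM; the helper names are made up for illustration). It models t20, t25
and t30 from the initial DAG, assuming the load of @f is forwarded to the
stored constant 8 as in the "Replacing" steps above:

```python
MASK32 = 0xFFFFFFFF


def sdiv_i32(a, b):
    """Signed division truncating toward zero, like the DAG's sdiv."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def brcond_condition(t22):
    """Model of t30 = xor(setult(t22, sdiv(t18, 64)), -1)."""
    t18 = 8                                  # value stored to @f by t5
    t20 = sdiv_i32(t18, 64)                  # 8 / 64 folds to 0
    t25 = (t22 & MASK32) < (t20 & MASK32)    # setult: unsigned less-than
    return not t25                           # xor with i1 -1 is a negation


# A few i32 bit patterns for t22, including ones with the sign bit set.
samples = [0, 1, 8, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
results = [brcond_condition(v) for v in samples]
print(results)
```

Every sample gives a true condition, since no unsigned value is below 0. That
matches the Constant:i1<-1> operand of t33 in the optimized DAG.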

In an *incorrect* compilation:

Combining: t30: i1 = xor t25, Constant:i1<-1>
 ... into: t37: i1 = setcc t22, t20, setuge:ch

Combining: t33: ch = brcond t31, t37, BasicBlock:ch< 0x16ec2e40>
 ... into: t38: ch = br_cc t31, setuge:ch, t22, t20, BasicBlock:ch< 0x16ec2e40>

SelectionDAG has 28 nodes:
  t0: ch = EntryToken
  t52: i32,ch = load<LD4[@i](tbaa=<0x16dd5768>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @i> 0, undef:i64
  t56: i32,ch = load<LD4[@h](tbaa=<0x16dd5768>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @h> 0, undef:i64
        t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
            t14: i1 = setcc t52, Constant:i32<1>, setlt:ch
          t15: i32 = zero_extend t14
        t48: ch = store<ST4[@d](tbaa=<0x16dd5768>)> t0, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
        t50: ch = store<ST4[@e](tbaa=<0x16dd5768>)> t0, t56,
GlobalAddress:i64<i32* @e> 0, undef:i64
        t5: ch = store<ST4[@f](tbaa=<0x16dd5768>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
      t58: ch = TokenFactor t28, t48, t52:1, t50, t5, t56:1
      t22: i32,ch = CopyFromReg t0, Register:i32 %0
    t38: ch = br_cc t58, setuge:ch, t22, Constant:i32<0>, BasicBlock:ch<
0x16ec2e40>
  t35: ch = br t38, BasicBlock:ch< 0x16ec2d80>

It is still clear that the t38 br_cc will always be taken, since t22 is always
UGE 0 (every unsigned value is). So the problem should be somewhere in the
following steps relating to this conditional branch.

Legalized selection DAG: %bb.12 'main:'
...
        t22: i32,ch = CopyFromReg t0, Register:i32 %0
      t61: glue = SystemZISD::ICMP t22, Constant:i32<0>, Constant:i32<1>
    t64: ch = SystemZISD::BR_CCMASK t58, Constant:i32<14>, Constant:i32<10>,
BasicBlock:ch< 0x16ec2e40>, t61
  t35: ch = br t64, BasicBlock:ch< 0x16ec2d80>


Selected selection DAG: %bb.12 'main:'
...
        t22: i32,ch = CopyFromReg t0, Register:i32 %0
      t61: i32,glue = CLFIMux t22, TargetConstant:i64<0>
    t64: ch = BRC TargetConstant:i32<14>, TargetConstant:i32<10>,
BasicBlock:ch< 0x16ec2e40>, t58, t61:1
  t35: ch = J BasicBlock:ch< 0x16ec2d80>, t64

----> 
*** MachineFunction at end of ISel ***
%bb.12: derived from LLVM BB %28
...
        CLFIMux %0, 0, implicit-def %cc; GRX32Bit:%0
        BRC 14, 10, %bb.14, implicit %cc
        J %bb.13
    Successors according to CFG: %bb.13(0x40000000 / 0x80000000 = 50.00%)
%bb.14(0x40000000 / 0x80000000 = 50.00%)

%bb.13: derived from LLVM BB %36   //////////////// 8 is the wrong value!
    Predecessors according to CFG: %bb.12
        %95:gr32bit = LHIMux 8; GR32Bit:%95
        STRL %95, @b; mem:ST4[@b](tbaa=!2) GR32Bit:%95
    Successors according to CFG: %bb.14(?%)

%bb.14: derived from LLVM BB %37

----> %0 gets spilled, so

        renamable %r2l = L %r15d, 164, %noreg; mem:LD4[FixedStack1]
        CLFI killed renamable %r2l, 0, implicit-def %cc
        BRC 14, 10, %bb.22, implicit %cc


----> Load-and-test transformation

        renamable %r2l = LT %r15d, 164, %noreg, implicit-def %cc;
mem:LD4[FixedStack1]
        BRC 14, 10, %bb.22, implicit killed %cc
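
To compare the CLFI/BRC sequence with the LT/BRC sequence, here is a small
Python model of the condition codes. It is only a sketch, and it assumes the
following z/Architecture conventions: a logical compare sets CC0 for equal, CC1
for low and CC2 for high; load-and-test sets CC0 for zero, CC1 for negative and
CC2 for positive; in the BRC mask, bit value 8 selects CC0, 4 selects CC1, 2
selects CC2 and 1 selects CC3. The function names are hypothetical:

```python
def to_signed32(v):
    """Reinterpret a 32-bit pattern as a signed integer."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def cc_logical_compare(a, b):
    """Assumed CC after an unsigned 32-bit compare such as CLFI."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    if a == b:
        return 0
    return 1 if a < b else 2


def cc_load_and_test(v):
    """Assumed CC after load-and-test (LT), which looks at the signed value."""
    s = to_signed32(v)
    if s == 0:
        return 0
    return 1 if s < 0 else 2


def branch_taken(cc, mask):
    """BRC takes the branch when the mask bit for the current CC is set."""
    return bool(mask & (8 >> cc))


MASK_HE = 10  # CC0 or CC2, printed as "jhe" in the assembly output

samples = [0, 1, 8, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
differing = [
    v for v in samples
    if branch_taken(cc_logical_compare(v, 0), MASK_HE)
    != branch_taken(cc_load_and_test(v), MASK_HE)
]
print([hex(v) for v in differing])
```

Under these assumptions the two sequences agree for values with the sign bit
clear and differ when it is set: the unsigned compare against 0 never reports
"low", while load-and-test reports a negative value as CC1, which mask 10 does
not select. Whether %0 has its sign bit set at run time is not visible in the
dumps above.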

----> assembly output

        lt      %r2, 164(%r15)          # 4-byte Folded Reload
        lhi     %r1, 0
        jhe     .LBB0_22
# %bb.21:
        lhi     %r1, 8
....


Do you see anything that seems wrong here? Or is the problem somewhere else?

bin/clang -O0 -march=z13 ./tc_dagcomb_14.ll -o a.out; ./a.out
checksum = 0

bin/clang -O3 -march=z13 ./tc_dagcomb_14.ll -o a.out; ./a.out
checksum = 0

bin/clang -O3 -march=z13 ./tc_dagcomb_14.ll -o a.out -mllvm -disable-basicaa;
./a.out
checksum = 8</pre>
        </div>
      </p>


    </body>
</html>