[llvm-bugs] [Bug 35662] New: Possible bug in lowering of conditional branch
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Dec 14 08:29:06 PST 2017
https://bugs.llvm.org/show_bug.cgi?id=35662
Bug ID: 35662
Summary: Possible bug in lowering of conditional branch
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: SystemZ
Assignee: unassignedbugs at nondot.org
Reporter: paulsson at linux.vnet.ibm.com
CC: llvm-bugs at lists.llvm.org
Created attachment 19551
--> https://bugs.llvm.org/attachment.cgi?id=19551&action=edit
reduced testcase
Csmith generated a program which, after reduction, prints the same checksum
with gcc -O0/-O1 and with clang, except when -mllvm -disable-basicaa is passed
to clang.
The difference disappears when the DAG combiner is disabled. I reduced the test
case further with bugpoint, keeping only reductions for which the DAG combiner
still changed the checksum.
In the final minimal .ll program (attached), the wrong checksum can be traced
to one DAG:
Initial selection DAG: %bb.12 'main:'
SelectionDAG has 36 nodes:
t0: ch = EntryToken
t3: i64 = Constant<0>
t5: ch = store<ST4[@f](tbaa=<0x16dd5768>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
t7: i32,ch = load<LD4[@h](tbaa=<0x16dd5768>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @h> 0, undef:i64
t9: ch = store<ST4[@e](tbaa=<0x16dd5768>)> t7:1, t7, GlobalAddress:i64<i32*
@e> 0, undef:i64
t11: i32,ch = load<LD4[@i](tbaa=<0x16dd5768>)(dereferenceable)> t9,
GlobalAddress:i64<i32* @i> 0, undef:i64
t14: i1 = setcc t11, Constant:i32<1>, setlt:ch
t15: i32 = zero_extend t14
t17: ch = store<ST4[@d](tbaa=<0x16dd5768>)> t11:1, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
t23: ch = ValueType:i32
t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
t31: ch = TokenFactor t28, t17
t22: i32,ch = CopyFromReg t0, Register:i32 %0
t18: i32,ch = load<LD4[@f](tbaa=<0x16dd5768>)(dereferenceable)>
t17, GlobalAddress:i64<i32* @f> 0, undef:i64
t20: i32 = sdiv t18, Constant:i32<64>
t25: i1 = setcc t22, t20, setult:ch
t30: i1 = xor t25, Constant:i1<-1>
t33: ch = brcond t31, t30, BasicBlock:ch< 0x16ec2e40>
t35: ch = br t33, BasicBlock:ch< 0x16ec2d80>
One important observation is that in the good program, the branch at t33, which
jumps across the block that loads the immediate 8 (the wrong output), is always
taken.
In a *correct* compilation, the DAG gets transformed into:
Combining: t20: i32 = sdiv t18, Constant:i32<64>
... into: t42: i32 = sra t40, Constant:i64<6>
Replacing.1 t18: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)>
t17, GlobalAddress:i64<i32* @f> 0, undef:i64
With: t43: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @f> 0, undef:i64
and 1 other values
Replacing.1 t43: i32,ch = load<LD4[@f](tbaa=<0x464825f8>)(dereferenceable)> t5,
GlobalAddress:i64<i32* @f> 0, undef:i64
With: t1: i32 = Constant<8>
and 1 other values
Combining: t37: i32 = sra Constant:i32<8>, Constant:i64<31>
... into: t26: i32 = Constant<0>
Combining: t20: i32 = sdiv t18, Constant:i32<64>
... into: t42: i32 = sra t40, Constant:i64<6>
Combining: t42: i32 = sra Constant:i32<8>, Constant:i64<6>
... into: t26: i32 = Constant<0>
Optimized lowered selection DAG: %bb.12 'main:'
SelectionDAG has 26 nodes:
t0: ch = EntryToken
t50: i32,ch = load<LD4[@i](tbaa=<0x464825f8>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @i> 0, undef:i64
t55: i32,ch = load<LD4[@h](tbaa=<0x464825f8>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @h> 0, undef:i64
t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
t14: i1 = setcc t50, Constant:i32<1>, setlt:ch
t15: i32 = zero_extend t14
t46: ch = store<ST4[@d](tbaa=<0x464825f8>)> t0, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
t48: ch = store<ST4[@e](tbaa=<0x464825f8>)> t0, t55,
GlobalAddress:i64<i32* @e> 0, undef:i64
t5: ch = store<ST4[@f](tbaa=<0x464825f8>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
t57: ch = TokenFactor t28, t46, t50:1, t48, t5, t55:1
t33: ch = brcond t57, Constant:i1<-1>, BasicBlock:ch< 0x4656eda0>
t35: ch = br t33, BasicBlock:ch< 0x4656ece0>
After constant propagation, the condition of the t33 brcond is always true.
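The folding above can be checked with a small Python sketch. It is only a model
of the arithmetic, not of the combiner itself: the helper names (to_signed32,
sra32, srl32, sdiv_pow2, setult) are hypothetical, and the shift sequence used
for the power-of-two division is the usual sign-bias expansion, which I am
assuming matches the nodes between t37 and t42 that the dump does not show.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement i32."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sra32(x, n):
    """Arithmetic shift right on an i32 (models the 'sra' node)."""
    return to_signed32(x) >> n

def srl32(x, n):
    """Logical shift right on an i32 (models the 'srl' node)."""
    return (x & MASK32) >> n

def sdiv_pow2(x, log2):
    """Signed division by 2**log2 using shifts (assumed expansion)."""
    sign = sra32(x, 31)            # 0 for non-negative x, -1 otherwise
    bias = srl32(sign, 32 - log2)  # 0, or 2**log2 - 1 for negative x
    return sra32(x + bias, log2)

def setult(a, b):
    """Unsigned less-than on i32 bit patterns."""
    return (a & MASK32) < (b & MASK32)

f = 8                    # value stored to @f by t5
t20 = sdiv_pow2(f, 6)    # sdiv t18, 64 with t18 == 8

# t30 = xor (setult t22, t20), -1 for a range of possible t22 values
conds = [not setult(t22, t20)
         for t22 in (0, 1, 8, 0x7FFFFFFF, 0x80000000, MASK32)]
print(t20, all(conds))   # prints: 0 True
```

With t20 folded to 0, no unsigned value of t22 is below it, so the inverted
condition is true for every input, which agrees with the Constant:i1<-1> on t33.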
In an *incorrect* compilation:
Combining: t30: i1 = xor t25, Constant:i1<-1>
... into: t37: i1 = setcc t22, t20, setuge:ch
Combining: t33: ch = brcond t31, t37, BasicBlock:ch< 0x16ec2e40>
... into: t38: ch = br_cc t31, setuge:ch, t22, t20, BasicBlock:ch< 0x16ec2e40>
SelectionDAG has 28 nodes:
t0: ch = EntryToken
t52: i32,ch = load<LD4[@i](tbaa=<0x16dd5768>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @i> 0, undef:i64
t56: i32,ch = load<LD4[@h](tbaa=<0x16dd5768>)(dereferenceable)> t0,
GlobalAddress:i64<i32* @h> 0, undef:i64
t28: ch = CopyToReg t0, Register:i32 %88, Constant:i32<0>
t14: i1 = setcc t52, Constant:i32<1>, setlt:ch
t15: i32 = zero_extend t14
t48: ch = store<ST4[@d](tbaa=<0x16dd5768>)> t0, t15,
GlobalAddress:i64<i32* @d> 0, undef:i64
t50: ch = store<ST4[@e](tbaa=<0x16dd5768>)> t0, t56,
GlobalAddress:i64<i32* @e> 0, undef:i64
t5: ch = store<ST4[@f](tbaa=<0x16dd5768>)> t0, Constant:i32<8>,
GlobalAddress:i64<i32* @f> 0, undef:i64
t58: ch = TokenFactor t28, t48, t52:1, t50, t5, t56:1
t22: i32,ch = CopyFromReg t0, Register:i32 %0
t38: ch = br_cc t58, setuge:ch, t22, Constant:i32<0>, BasicBlock:ch<
0x16ec2e40>
t35: ch = br t38, BasicBlock:ch< 0x16ec2d80>
It is still obvious that the t38 br_cc will always be taken, since t22 is always
UGE 0 (unsigned greater than or equal to 0). So the problem should be somewhere
in the later steps that handle this conditional branch.
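As a sanity check on this step, here is a minimal Python sketch of the two
facts used: inverting setult gives setuge, and an unsigned compare against 0
with setuge holds for every 32-bit value. The function names and the sample
list are mine and purely illustrative.

```python
MASK32 = 0xFFFFFFFF

def setult(a, b):
    """Unsigned less-than on i32 bit patterns."""
    return (a & MASK32) < (b & MASK32)

def setuge(a, b):
    """Unsigned greater-or-equal on i32 bit patterns."""
    return (a & MASK32) >= (b & MASK32)

samples = [0, 1, 8, 0x7FFFFFFF, 0x80000000, MASK32]

# xor (setult a, b), -1  ==>  setuge a, b
inversion_ok = all((not setult(a, b)) == setuge(a, b)
                   for a in samples for b in samples)

# br_cc setuge t22, 0 is taken for every t22
always_taken = all(setuge(t22, 0) for t22 in samples)

print(inversion_ok, always_taken)   # prints: True True
```

So the rewrite from brcond/xor/setult to br_cc setuge looks semantically fine by
itself, at least for these sample values.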
Legalized selection DAG: %bb.12 'main:'
...
t22: i32,ch = CopyFromReg t0, Register:i32 %0
t61: glue = SystemZISD::ICMP t22, Constant:i32<0>, Constant:i32<1>
t64: ch = SystemZISD::BR_CCMASK t58, Constant:i32<14>, Constant:i32<10>,
BasicBlock:ch< 0x16ec2e40>, t61
t35: ch = br t64, BasicBlock:ch< 0x16ec2d80>
Selected selection DAG: %bb.12 'main:'
...
t22: i32,ch = CopyFromReg t0, Register:i32 %0
t61: i32,glue = CLFIMux t22, TargetConstant:i64<0>
t64: ch = BRC TargetConstant:i32<14>, TargetConstant:i32<10>,
BasicBlock:ch< 0x16ec2e40>, t58, t61:1
t35: ch = J BasicBlock:ch< 0x16ec2d80>, t64
---->
*** MachineFunction at end of ISel ***
%bb.12: derived from LLVM BB %28
...
CLFIMux %0, 0, implicit-def %cc; GRX32Bit:%0
BRC 14, 10, %bb.14, implicit %cc
J %bb.13
Successors according to CFG: %bb.13(0x40000000 / 0x80000000 = 50.00%)
%bb.14(0x40000000 / 0x80000000 = 50.00%)
%bb.13: derived from LLVM BB %36 //////////////// 8 is wrong value!
Predecessors according to CFG: %bb.12
%95:gr32bit = LHIMux 8; GR32Bit:%95
STRL %95, @b; mem:ST4[@b](tbaa=!2) GR32Bit:%95
Successors according to CFG: %bb.14(?%)
%bb.14: derived from LLVM BB %37
----> %0 gets spilled, so
renamable %r2l = L %r15d, 164, %noreg; mem:LD4[FixedStack1]
CLFI killed renamable %r2l, 0, implicit-def %cc
BRC 14, 10, %bb.22, implicit %cc
----> Load-and-test transformation
renamable %r2l = LT %r15d, 164, %noreg, implicit-def %cc;
mem:LD4[FixedStack1]
BRC 14, 10, %bb.22, implicit killed %cc
----> assembly output
lt %r2, 164(%r15) # 4-byte Folded Reload
lhi %r1, 0
jhe .LBB0_22
# %bb.21:
lhi %r1, 8
....
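To compare the two instruction sequences, here is a rough Python model of how I
understand the condition codes. The assumptions: CLFI is an unsigned compare
(CC0 equal, CC1 low, CC2 high), LT sets the CC from a signed test against zero
(CC0 zero, CC1 negative, CC2 positive), and a BRC mask bit of 8, 4, 2 or 1
selects CC0, CC1, CC2 or CC3. The function names are hypothetical and this is
not the backend's code.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement i32."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def cc_clfi(value, imm):
    """Assumed CC after an unsigned 32-bit compare with an immediate."""
    a, b = value & MASK32, imm & MASK32
    if a == b:
        return 0
    return 1 if a < b else 2

def cc_lt(value):
    """Assumed CC after load-and-test: signed comparison with zero."""
    s = to_signed32(value)
    if s == 0:
        return 0
    return 1 if s < 0 else 2

def branch_taken(cc_mask, cc):
    """Mask bit 8 selects CC0, 4 selects CC1, 2 selects CC2, 1 selects CC3."""
    return bool(cc_mask & (8 >> cc))

CCMASK_HE = 10  # the '10' operand of BRC above: CC0 or CC2

for v in (0, 8, 0x7FFFFFFF, 0x80000000, MASK32):
    clfi = branch_taken(CCMASK_HE, cc_clfi(v, 0))
    lt = branch_taken(CCMASK_HE, cc_lt(v))
    print(hex(v), clfi, lt)
```

If this model is right, the CLFI form branches for every value, while the LT
form falls through whenever the reloaded value has its sign bit set. That would
make the load-and-test step the first place where the two versions can differ.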
Do you see anything that seems wrong here? Or is the problem somewhere else?
bin/clang -O0 -march=z13 ./tc_dagcomb_14.ll -o a.out; ./a.out
checksum = 0
bin/clang -O3 -march=z13 ./tc_dagcomb_14.ll -o a.out; ./a.out
checksum = 0
bin/clang -O3 -march=z13 ./tc_dagcomb_14.ll -o a.out -mllvm -disable-basicaa; ./a.out
checksum = 8