[PATCH] D98905: [SystemZ] Reuse known zeros/ones after zero-extension of i1.
Jonas Paulsson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 18 16:02:53 PDT 2021
jonpa created this revision.
jonpa added a reviewer: uweigand.
Herald added a subscriber: hiraditya.
jonpa requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
This is an optimization for zero extensions of i1 values, which resulted from looking into the perl regression against GCC. I noticed a lot of LHI 0, LHI 1, LOC sequences, which GCC did not seem to emit.
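Basically, after an ICMP NE 0 or an ICMP EQ 1, the compared operand is known to hold a constant in one of the two outcomes (0 when NE 0 is false, 1 when EQ 1 is true). That constant is already the one needed as the result for that CC outcome, so there is no need to load it with LHI (LGHI).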
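To make the idea concrete, here is a rough C++ model of the i32 cases. It is only a sketch of the value reasoning, not code from the patch, and all function names are made up for illustration. The "baseline" functions mirror loading both constants, while the "reuse" functions keep the compared operand in the outcome where its value is already known.

```cpp
#include <cassert>
#include <cstdint>

// Model of the current output for zext(x != 0): both constants are
// materialized (load 0, then conditionally load 1).
uint32_t zext_ne0_baseline(uint32_t x) {
    uint32_t result = 0;
    if (x != 0)
        result = 1;
    return result;
}

// Model of the reuse: when x != 0 is false, x is already 0, so only the
// constant 1 has to be loaded conditionally.
uint32_t zext_ne0_reuse(uint32_t x) {
    uint32_t result = x;
    if (x != 0)
        result = 1;
    return result;
}

// Model of the current output for zext(x == 1).
uint32_t zext_eq1_baseline(uint32_t x) {
    uint32_t result = 0;
    if (x == 1)
        result = 1;
    return result;
}

// Model of the reuse: when x == 1 is true, x is already 1, so only the
// constant 0 has to be loaded, under the inverted condition.
uint32_t zext_eq1_reuse(uint32_t x) {
    uint32_t result = x;
    if (x != 1)
        result = 0;
    return result;
}

// Hypothetical self-check: both forms must agree for a given input.
void check_i32_forms(uint32_t x) {
    assert(zext_ne0_reuse(x) == zext_ne0_baseline(x));
    assert(zext_eq1_reuse(x) == zext_eq1_baseline(x));
}
```

Calling check_i32_forms() on 0, 1, and a few other values exercises both outcomes of each comparison.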
The i32 cases were quite straightforward, but doing the same for i64 took a bit more effort. Since we always return i32 from getSetCCResultType(), these cases needed different handling depending on the user. If the user is i64, I chose to promote the setcc result in combineZERO_EXTEND(). For an i32 user (of an i64 comparison), the reused operand is truncated instead.
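The two i64 situations can be modelled in C++ roughly as follows. This is a sketch of the value reasoning under my reading of the description, not the patch itself, and the function names are hypothetical. Truncating the reused operand is safe because the truncated value is only kept when the full 64-bit value compared equal to 0.

```cpp
#include <cassert>
#include <cstdint>

// i64 comparison with an i32 user: the reused operand is truncated.
// The truncated value only survives when x == 0, where truncation is exact.
uint32_t ne0_i64cmp_i32user(uint64_t x) {
    uint32_t result = static_cast<uint32_t>(x);
    if (x != 0)
        result = 1;
    return result;
}

// i64 user without promotion: the select happens in 32 bits and an explicit
// zero extension (the leftover llgfr) follows.
uint64_t ne0_i64user_unpromoted(uint64_t x) {
    uint32_t narrow = ne0_i64cmp_i32user(x);
    return static_cast<uint64_t>(narrow);
}

// i64 user with promotion: the select happens in 64 bits, the operand is
// reused directly and no separate zero extension is needed.
uint64_t ne0_i64user_promoted(uint64_t x) {
    uint64_t result = x;
    if (x != 0)
        result = 1;
    return result;
}

// Hypothetical self-check: all three forms must produce zext(x != 0).
void check_i64_forms(uint64_t x) {
    uint64_t expected = (x != 0) ? 1 : 0;
    assert(ne0_i64cmp_i32user(x) == expected);
    assert(ne0_i64user_unpromoted(x) == expected);
    assert(ne0_i64user_promoted(x) == expected);
}
```

An input such as 1 << 32 is a useful check here: its low 32 bits are zero, but the 64-bit comparison is still "not equal", so the result must be 1.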
For the i64 user case, a lot of llgfr instructions remained without the handling in combineZERO_EXTEND():
Optimized legalized selection DAG: %bb.0 'prototype_p:bb'
SelectionDAG has 15 nodes:
t0: ch = EntryToken
t5: i64,ch = load<(load 8 from `%0** undef`)> t0, undef:i64, undef:i64
t25: i32 = truncate t5
t24: i32 = SystemZISD::ICMP t5, Constant:i64<0>, TargetConstant:i32<0>
t29: i32 = SystemZISD::SELECT_CCMASK Constant:i32<1>, t25, TargetConstant:i32<14>, TargetConstant:i32<6>, t24
t20: i64 = zero_extend t29
t12: ch,glue = CopyToReg t0, Register:i64 $r2d, t20
t13: ch = SystemZISD::RET_FLAG t12, Register:i64 $r2d, t12:1
->
ltg %r0, 0(%r1)
lochilh %r0, 1
llgfr %r2, %r0
The general effects on SPEC'17 output (instruction : count before, count after, difference):
1. Just the i32/i32 cases of NE 0:
lhi : 225081 221486 -3595
lghi : 445509 444420 -1089
locghilh : 3717 2676 -1041
chsi : 57297 56385 -912
lt : 13672 14348 +676
lochilh : 8796 9424 +628
llgfr : 90010 90533 +523
risbgn : 137540 137980 +440
tmll : 53266 53693 +427
ltr : 6140 6550 +410
lgr : 849527 849890 +363
llc : 39671 39994 +323
locrlh : 1492 1807 +315
...
2. Also the i64 cases, compared to (1):
lghi : 444420 441828 -2592
lhi : 221486 219782 -1704
lr : 62223 62878 +655
cghsi : 32665 32175 -490
tmll : 53693 54181 +488
llgfr : 90533 90066 -467
ltg : 157760 158133 +373
risbgn : 137980 138334 +354
jne : 42684 42990 +306
lg : 982786 982931 +145
je : 335154 335281 +127
cije : 107363 107237 -126
lgfr : 91442 91565 +123
lgr : 849890 850006 +116
ltgr : 10951 11067 +116
...
3. Also EQ 1 (both i32 and i64), compared to (2):
lochilh : 9492 9787 +295
lhi : 219782 219567 -215
lochie : 14183 13975 -208
chi : 53350 53448 +98
chsi : 56337 56259 -78
lr : 62878 62950 +72
tmll : 54181 54243 +62
lghi : 441828 441770 -58
locghie : 7174 7116 -58
risbgn : 138334 138383 +49
...
In total, master <> (3):
lhi : 225081 219567 -5514
lghi : 445509 441770 -3739
locghilh : 3717 2673 -1044
chsi : 57297 56259 -1038
lochilh : 8796 9787 +991
tmll : 53266 54243 +977
risbgn : 137540 138383 +843
lr : 62152 62950 +798
lt : 13672 14343 +671
jne : 42430 43028 +598
cghsi : 32644 32170 -474
lgr : 849527 849994 +467
ltr : 6140 6557 +417
...
I see some more LR instructions, which I think appear when the reused constant also has another user. I did not manage to avoid these cases when working with the DAGs (local only), so some kind of pseudo-expander might be more powerful here. That probably requires more work, and I am not sure that trading an LHI for an LR is bad, since the comparison does not clobber the register...
There are fewer comparisons with memory - the value is now loaded, compared and reused (see fun9 below).
Does this seem like a good idea to try?
New tests, master <> patched (skipped functions identical):
fun0:                     fun0:
  chi %r2, 0                chi %r2, 0
  lhi %r2, 0            <
  lochilh %r2, 1            lochilh %r2, 1
                        >
  br %r14                   br %r14
fun4:                     fun4:
  cghi %r2, 0               cghi %r2, 0
  lhi %r2, 0            <
  lochilh %r2, 1            lochilh %r2, 1
                        >
  br %r14                   br %r14
fun5:                     fun5:
  cghsi 0(%r1), 0       |   ltg %r0, 0(%r1)
  lghi %r0, 0           <
  locghilh %r0, 1           locghilh %r0, 1
  stg %r0, 0(%r1)           stg %r0, 0(%r1)
  br %r14                   br %r14
fun6:                     fun6:
  chi %r2, 1                chi %r2, 1
  lhi %r2, 0            |   lochilh %r2, 0
  lochie %r2, 1         |
  br %r14                   br %r14
fun8:                     fun8:
  cghi %r2, 1               cghi %r2, 1
  lhi %r2, 0            |   lochilh %r2, 0
  lochie %r2, 1         |
  br %r14                   br %r14
fun9:                     fun9:
  cghsi 0(%r1), 1       |   lg %r0, 0(%r1)
  lghi %r0, 0           |   cghi %r0, 1
  locghie %r0, 1        |   locghilh %r0, 0
  stg %r0, 0(%r1)           stg %r0, 0(%r1)
  br %r14                   br %r14
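Read as source, fun9 above appears to load an i64, compare it with 1, and store the zero-extended result back to the same address. That reading is inferred from the assembly alone, so the following C++ is only a hypothetical model with invented names. The comments map each step to the instructions in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Master-like shape: compare against memory, then materialize 0 or 1.
void store_eq1_baseline(uint64_t *p) {
    uint64_t result = 0;   // lghi 0
    if (*p == 1)           // cghsi (compare with memory)
        result = 1;        // locghie 1
    *p = result;           // stg
}

// Patched-like shape: load once, compare the register, reuse the loaded
// value when it is already 1 and only load 0 otherwise.
void store_eq1_reuse(uint64_t *p) {
    uint64_t v = *p;       // lg
    if (v != 1)            // cghi
        v = 0;             // locghilh 0
    *p = v;                // stg
}

// Hypothetical self-check: both shapes must leave the same value in memory.
void check_store_forms(uint64_t initial) {
    uint64_t a = initial;
    uint64_t b = initial;
    store_eq1_baseline(&a);
    store_eq1_reuse(&b);
    assert(a == b);
}
```

The reuse shape needs the value in a register anyway, which is why the memory comparison is replaced by a load followed by a register comparison.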
https://reviews.llvm.org/D98905
Files:
llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
llvm/lib/Target/SystemZ/SystemZISelLowering.h
llvm/test/CodeGen/SystemZ/int-cmp-59.ll
llvm/test/CodeGen/SystemZ/setcc-05.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D98905.331705.patch
Type: text/x-patch
Size: 10588 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210318/576c486d/attachment.bin>