[llvm-dev] Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
Jonas Paulsson via llvm-dev
llvm-dev at lists.llvm.org
Mon Feb 11 19:17:46 PST 2019
Hi Sanjay,
> If I'm seeing it correctly, (part of?) the fold you're looking for is
> here:
> https://reviews.llvm.org/rL350006
>
> ...but it's restricted to pre-legalization.
> I don't remember exactly what the problem was with allowing that fold
> post-legalization, but maybe you can loosen that restriction?
>
Thanks! I tried simply removing the !LegalOperations condition
(DAGCombiner.cpp:10056), and that indeed solved my problem. Doing this on
SystemZ (for all of the opcodes) did not affect SPEC much. Opcode
counts (trunk on the left, patched in the middle, difference on the right):
aghi         :  38759   38742   -17
ahi          :  34921   34936   +15
risbgn       :  37104   37092   -12
nill         :   2172    2183   +11
lr           :  29731   29735    +4
sr           :   6055    6059    +4
srk          :   3743    3741    -2
lhi          :  89566   89568    +2
risblg       :   6528    6529    +1
la           : 192375  192374    -1
Spill|Reload : 189670  189670    +0
So, to me it seems this could be the default on SystemZ at least.
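
For reference, here is a minimal sketch of the arithmetic involved, in Python rather than LLVM code. It models the DAG nodes from my original mail with masked integers and compares them against what I would expect after folding, assuming the truncate is moved above the first add so that the constants -32 and -16 merge into a single -48. The helper names (ctlz64, unfolded, folded, ctlz16) are made up for illustration and do not correspond to anything in DAGCombiner.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def ctlz64(x: int) -> int:
    """Count leading zeros of a 64-bit value (64 for x == 0)."""
    return 64 - x.bit_length()


def unfolded(arg: int) -> int:
    """Mirror the DAG from the original mail node by node."""
    t10 = arg & 0xFFFF          # t10: and t2, 65535
    t16 = t10                   # t16: zero_extend to i64
    t17 = ctlz64(t16)           # t17: ctlz
    t22 = (t17 - 32) & MASK64   # t22: add t17, -32 (wrapping i64)
    t20 = t22 & MASK32          # t20: truncate to i32
    t15 = (t20 - 16) & MASK32   # t15: add t20, -16 (wrapping i32)
    return t15


def folded(arg: int) -> int:
    """Assumed result of the fold: one add of -48 after the truncate."""
    t17 = ctlz64(arg & 0xFFFF)
    return ((t17 & MASK32) - 48) & MASK32


def ctlz16(arg: int) -> int:
    """Reference semantics of @llvm.ctlz.i16 with is_zero_undef = false."""
    return 16 - (arg & 0xFFFF).bit_length()


if __name__ == "__main__":
    # Exhaustively compare all i16 inputs.
    ok = all(unfolded(a) == folded(a) == ctlz16(a) for a in range(1 << 16))
    print("all i16 inputs agree:", ok)
```

Since the adds wrap and truncation only drops high bits, the two forms agree modulo 2^32 for any input, so overflow should not matter for correctness here.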
/Jonas
> On Fri, Feb 8, 2019 at 10:20 AM Jonas Paulsson via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> SystemZ supports @llvm.ctlz.i64() natively with a single instruction
> (FLOGR), and the narrower versions of the intrinsic are promoted to
> i64.
>
> For some reason, this leads to unfolded additions of constants, as
> shown below:
>
> This function:
>
> define i16 @fun(i16 %arg) {
>   %1 = tail call i16 @llvm.ctlz.i16(i16 %arg, i1 false)
>   ret i16 %1
> }
>
> declare i16 @llvm.ctlz.i16(i16, i1)
>
> gives this optimized DAG as input to instruction selection:
>
> SelectionDAG has 15 nodes:
> t0: ch = EntryToken
> t2: i32,ch = CopyFromReg t0, Register:i32 %0
> t10: i32 = and t2, Constant:i32<65535>
> t16: i64 = zero_extend t10
> t17: i64 = ctlz t16
> t22: i64 = add t17, Constant:i64<-32>
> t20: i32 = truncate t22
> t15: i32 = add t20, Constant:i32<-16>
> t7: ch,glue = CopyToReg t0, Register:i32 $r2l, t15
> t8: ch = SystemZISD::RET_FLAG t7, Register:i32 $r2l, t7:1
>
> It seems that SelectionDAG::computeKnownBits() has a case for ISD::CTLZ,
> and it seems to figure out that the high bits of t17 are zero, as
> expected.
>
> t17 is guaranteed to have a value between 48 and 64 (the zero-extended
> i16 input has at least 48 leading zeros), so there cannot be any
> overflow here, although I am not sure whether that is the problem.
> Should DAGCombiner::visitADD() handle this, or perhaps
> visitTRUNCATE()?
>
> Thanks for any help,
>
> Jonas
>
>