[PATCH] D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0

Wed Nov 3 07:31:47 PDT 2021

fzhinkin added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:3517
+  // Reduce all values using OR.
+  for (size_t Index = 0; Index + 1 < Result.size(); Index += 2) {
+    SDValue NewOr = DAG.getNode(ISD::OR, DL, N0.getValueType(), Result[Index],
----------------
RKSimon wrote:
> fzhinkin wrote:
> > RKSimon wrote:
> > > Can't you avoid pushing results by just OR'ing the results once?
> > > ```
> > > SDValue Reduction = Result[0];
> > > for (size_t I = 1, E = Result.size(); I < E; ++I)
> > >   Reduction = DAG.getNode(ISD::OR, DL, N0.getValueType(), Reduction, Result[I]);
> > > ```
> > I'm pushing the ORs back onto the results list to generate a balanced tree (and so shorten the critical path). It pays off, at least on ARM:
> > 
> > ```
> > ; llc -O3 -mtriple=armv7 test.ll
> > define i1 @opt_setcc_shl_ne_zero_i128(i128 %a) nounwind {
> >    %shl = shl i128 %a, 17
> >    %cmp = icmp ne i128 %shl, 0
> >    ret i1 %cmp
> > }
> > ```
> > 
> > Code generated using the current implementation:
> > ```
> > opt_setcc_shl_ne_zero_i128:
> > @ %bb.0:
> > 	orr	r2, r2, r3, lsl #17
> > 	orr	r0, r1, r0
> > 	orrs	r0, r0, r2
> > 	movwne	r0, #1
> > 	bx	lr
> > ```
> > 
> > Code generated using the implementation that ORs in place:
> > ```
> > opt_setcc_shl_ne_zero_i128:
> > @ %bb.0:
> > 	orr	r0, r1, r0
> > 	orr	r0, r0, r2
> > 	orr	r0, r0, r3, lsl #17
> > 	cmp	r0, #0
> > 	movwne	r0, #1
> > 	bx	lr
> > ```
> OK - maybe mention that in the comment?
Updated the comment and fixed all the other issues you mentioned earlier.
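
For reference, the difference between the two reduction orders can be sketched outside of SelectionDAG. This is only an illustration, not the code in the patch: `Node`, `makeOr`, `reduceLinear` and `reduceBalanced` are hypothetical names, and `Depth` is a stand-in for the critical path length (the number of dependent ORs on the longest path).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an SDValue: the OR'ed bits plus the number of
// dependent OR operations on the longest path down to a leaf.
struct Node {
  uint32_t Value;
  unsigned Depth;
};

// Models building one OR node from two operands.
static Node makeOr(const Node &A, const Node &B) {
  return {A.Value | B.Value, std::max(A.Depth, B.Depth) + 1};
}

// OR'ing in place: every OR depends on the previous one, so N leaves
// produce a chain of depth N - 1.
static Node reduceLinear(const std::vector<Node> &Parts) {
  assert(!Parts.empty() && "nothing to reduce");
  Node Reduction = Parts[0];
  for (size_t I = 1, E = Parts.size(); I < E; ++I)
    Reduction = makeOr(Reduction, Parts[I]);
  return Reduction;
}

// Pushing each new OR back onto the list: operands are paired up level by
// level, so the last element pushed is the root of a roughly balanced tree.
static Node reduceBalanced(std::vector<Node> Result) {
  assert(!Result.empty() && "nothing to reduce");
  for (size_t Index = 0; Index + 1 < Result.size(); Index += 2)
    Result.push_back(makeOr(Result[Index], Result[Index + 1]));
  return Result.back();
}
```

With four leaves both functions compute the same bits, but the chain has three dependent ORs while the tree has two, which matches the shape of the ARM output quoted above (two independent `orr`s feeding a final `orrs`).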

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111530/new/

https://reviews.llvm.org/D111530