[PATCH] D64090: [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)

Tue Jul 2 16:28:35 PDT 2019

lebedev.ri added inline comments.

================
Comment at: test/CodeGen/ARM/inc-of-add.ll:528
+; THUMB6-NEXT:    sbcs r3, r4
+; THUMB6-NEXT:    pop {r4, r5, r7, pc}
 ;
----------------
efriedma wrote:
> lebedev.ri wrote:
> > efriedma wrote:
> > > lebedev.ri wrote:
> > > > efriedma wrote:
> > > > > lebedev.ri wrote:
> > > > > > efriedma wrote:
> > > > > > > Before we treat vectors differently from scalars, should we check the vector is actually legal?
> > > > > > > 
> > > > > > > In this case, we actually save an instruction (Thumb1 doesn't support adc with immediate), but that seems unintended.
> > > > > > I have tried `VT.isScalarInteger() || !isTypeLegal(VT);` on ARM and PPC, and the diff says it's worse in total.
> > > > > > Could you be a bit more specific as to what you envision?
> > > > > > 
> > > > > On targets where a vector is going to be scalarized, we should give the same result for the vector type and the scalarized type.  For example, we should give `<2 x i64>` and `i64`  the same treatment on ARM without NEON.
> > > > > 
> > > > > If that's making code worse for some vector testcases, maybe we aren't making the correct choice for some scalar types.  For example, it looks like we should prefer not+sub for i64 on Thumb1.
> > > > Ok, i'm feeling dumber than usual at the moment.
> > > > How should that all look? Could you **please** be a bit more specific?
> > > > ```
> > > > bool ARMTargetLowering::preferIncOfAddToSubOfNot(EVT VT,
> > > >                                                  CombineLevel Level) const {
> > > >   return VT.isScalarInteger() || Level >= AfterLegalizeVectorOps;
> > > > }
> > > > ```
> > > > ^ does nothing.
> > > Maybe something along the lines of:
> > > 
> > > ```
> > > if (!Subtarget->hasNEON()) {
> > >   if (Thumb1Only())
> > >     return VT.getScalarSizeInBits() <= 32;
> > >   return true;
> > > }
> > > return VT.isScalarInteger();
> > > ```
> > > 
> > > That might not do the right thing for some edge cases with really weird types, but I think it should be roughly correct.
> > I'm sorry, i believe i'm still missing the point here.
> > In this function, for `THUMB6` check lines, *this* is an improvement currently, no? 15 lines vs 14 lines
> > Do we want to preserve LHS of the diff here?
> There are two separate issues here:
> 
> 1. We want to use sub-of-not for i64 on Thumb1 i.e., use the RHS here, and the LHS from scalar_i64.
> 2. We want to treat scalar types and vector types the same way on targets without neon (i.e. always return true on non-Thumb1 targets that don't have NEON).
> 
> My suggestion addresses both.
> We want to use sub-of-not for i64 on Thumb1 i.e., use the RHS here, and the LHS from scalar_i64.

Ooh, that clears things up a bit :)
Thank you, hopefully this is better.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64090/new/

https://reviews.llvm.org/D64090