[PATCH] D70487: [DAGCombiner] Allow zextended load combines.

Thu Nov 21 04:22:16 PST 2019

courbet marked an inline comment as done.
courbet added inline comments.

================
Comment at: llvm/test/CodeGen/ARM/load-combine-big-endian.ll:637
+; CHECK-NEXT:    lsl r0, r0, #16
+; CHECK-NEXT:    orr r0, r1, r0, lsr #24
 ; CHECK-NEXT:    mov pc, lr
----------------
dmgreen wrote:
> courbet wrote:
> > dmgreen wrote:
> > > courbet wrote:
> > > > RKSimon wrote:
> > > > > regression
> > > > Yes, that's something I meant to highlight in thsi patch, but forgot to send the comment:
> > > > 
> > > > I've seen in other tests for ARM we're OK paying quite a large number of instructions to avoid loads, e.g.
> > > > 
> > > > ```
> > > > define i32 @load_i32_by_i8_bswap(i32* %arg) {
> > > > ; BSWAP is not supported by 32 bit target
> > > > ; CHECK-LABEL: load_i32_by_i8_bswap:
> > > > ; CHECK:       @ %bb.0:
> > > > ; CHECK-NEXT:    ldr r0, [r0]
> > > > ; CHECK-NEXT:    mov r1, #65280
> > > > ; CHECK-NEXT:    mov r2, #16711680
> > > > ; CHECK-NEXT:    and r1, r1, r0, lsr #8
> > > > ; CHECK-NEXT:    and r2, r2, r0, lsl #8
> > > > ; CHECK-NEXT:    orr r1, r1, r0, lsr #24
> > > > ; CHECK-NEXT:    orr r0, r2, r0, lsl #24
> > > > ; CHECK-NEXT:    orr r0, r0, r1
> > > > ; CHECK-NEXT:    mov pc, lr
> > > > ```
> > > > 
> > > > I'm not very familiar with these targets, where do we draw the line ?
> > > This appears to be running under a very old architecture where bswap is not legal.
> > > 
> > > Any idea why it is trying to perform this optimisation when there is no bswap to make it efficient?
> > ```
> > // Before legalize we can introduce illegal bswaps which will be later
> > // converted to an explicit bswap sequence. This way we end up with a single
> > // load and byte shuffling instead of several loads and byte shuffling.
> > if (NeedsBswap && LegalOperations && !TLI.isOperationLegal(ISD::BSWAP, VT))
> >   return SDValue();
> > ```
> > 
> > I can always bail out on zext:
> > 
> > ```
> > if (NeedsBswap && (LegalOperations || NeedsZext) && !TLI.isOperationLegal(ISD::BSWAP, VT))
> >   return SDValue();
> > ```
> > 
> > But it's not clear to me what's worth between a load and 4 arithmetic instructions on ARM.
> > 
> I think:
> - On any remotely modern ARM system we will have a rev instruction, so I'm not very worried about the regressions here. They do show that an architecture without bswap may get worse though, if that is a thing that comes up in other places (I think all the archs here will have a bswap of some sort).
> - I think counting instructions can be as good a measure as any on some of the smaller arm devices. Maybe something like 2:1 for loads to arithmetic.
> - I'm not sure why we would do this at-all if we need a bswap but don't have one. But if the VT is still an illegal type, it will likely always look illegal even if it is eventually expanded into something sane. 
> 
> Can you add these two targets to the two test files:
> -mtriple=thumbv6m-none-eabi
> -mtriple=thumbv7m-none-eabi
> 
> If they are OK (along with the armv6 case) then I, from an ARM perspective, am happy with this.
> I think counting instructions can be as good a measure as any on some of the smaller arm devices. Maybe something like 2:1 for loads to arithmetic.

OK, so it looks like this was a real regression. So I've disabled illegal bswaps when  zext-ending. Thanks for the information !

> Can you add these two targets to the two test files:

Done.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70487/new/

https://reviews.llvm.org/D70487