[PATCH] D70487: [DAGCombiner] Allow zextended load combines.
Clement Courbet via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 21 04:22:16 PST 2019
courbet marked an inline comment as done.
courbet added inline comments.
================
Comment at: llvm/test/CodeGen/ARM/load-combine-big-endian.ll:637
+; CHECK-NEXT: lsl r0, r0, #16
+; CHECK-NEXT: orr r0, r1, r0, lsr #24
; CHECK-NEXT: mov pc, lr
----------------
dmgreen wrote:
> courbet wrote:
> > dmgreen wrote:
> > > courbet wrote:
> > > > RKSimon wrote:
> > > > > regression
> > > > Yes, that's something I meant to highlight in thsi patch, but forgot to send the comment:
> > > >
> > > > I've seen in other tests for ARM we're OK paying quite a large number of instructions to avoid loads, e.g.
> > > >
> > > > ```
> > > > define i32 @load_i32_by_i8_bswap(i32* %arg) {
> > > > ; BSWAP is not supported by 32 bit target
> > > > ; CHECK-LABEL: load_i32_by_i8_bswap:
> > > > ; CHECK: @ %bb.0:
> > > > ; CHECK-NEXT: ldr r0, [r0]
> > > > ; CHECK-NEXT: mov r1, #65280
> > > > ; CHECK-NEXT: mov r2, #16711680
> > > > ; CHECK-NEXT: and r1, r1, r0, lsr #8
> > > > ; CHECK-NEXT: and r2, r2, r0, lsl #8
> > > > ; CHECK-NEXT: orr r1, r1, r0, lsr #24
> > > > ; CHECK-NEXT: orr r0, r2, r0, lsl #24
> > > > ; CHECK-NEXT: orr r0, r0, r1
> > > > ; CHECK-NEXT: mov pc, lr
> > > > ```
> > > >
> > > > I'm not very familiar with these targets, where do we draw the line ?
> > > This appears to be running under a very old architecture where bswap is not legal.
> > >
> > > Any idea why it is trying to perform this optimisation when there is no bswap to make it efficient?
> > ```
> > // Before legalize we can introduce illegal bswaps which will be later
> > // converted to an explicit bswap sequence. This way we end up with a single
> > // load and byte shuffling instead of several loads and byte shuffling.
> > if (NeedsBswap && LegalOperations && !TLI.isOperationLegal(ISD::BSWAP, VT))
> > return SDValue();
> > ```
> >
> > I can always bail out on zext:
> >
> > ```
> > if (NeedsBswap && (LegalOperations || NeedsZext) && !TLI.isOperationLegal(ISD::BSWAP, VT))
> > return SDValue();
> > ```
> >
> > But it's not clear to me what's worth between a load and 4 arithmetic instructions on ARM.
> >
> I think:
> - On any remotely modern ARM system we will have a rev instruction, so I'm not very worried about the regressions here. They do show that an architecture without bswap may get worse though, if that is a thing that comes up in other places (I think all the archs here will have a bswap of some sort).
> - I think counting instructions can be as good a measure as any on some of the smaller arm devices. Maybe something like 2:1 for loads to arithmetic.
> - I'm not sure why we would do this at-all if we need a bswap but don't have one. But if the VT is still an illegal type, it will likely always look illegal even if it is eventually expanded into something sane.
>
> Can you add these two targets to the two test files:
> -mtriple=thumbv6m-none-eabi
> -mtriple=thumbv7m-none-eabi
>
> If they are OK (along with the armv6 case) then I, from an ARM perspective, am happy with this.
> I think counting instructions can be as good a measure as any on some of the smaller arm devices. Maybe something like 2:1 for loads to arithmetic.
OK, so it looks like this was a real regression. So I've disabled illegal bswaps when zext-ending. Thanks for the information !
> Can you add these two targets to the two test files:
Done.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70487/new/
https://reviews.llvm.org/D70487
More information about the llvm-commits
mailing list