[PATCH] D27861: [DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.
Artur Pilipenko via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 9 05:03:31 PST 2017
apilipenko added a comment.
Currently update_llc_test_checks.py supports only the arm-eabi target, so I left the ARM test cases with manually written checks for now.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4481
+ const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+ if (!TLI.isOperationLegal(ISD::LOAD, VT))
+ return SDValue();
----------------
RKSimon wrote:
> What is the effect of changing this to:
> ```
> if (LegalOperations && !TLI.isOperationLegal(ISD::LOAD, VT))
> ```
> Would the legalize do such a bad job of splitting poorly combined loads/bswaps?
This looks like a good idea. It enables combining an i64 pattern into two i32 loads on 32-bit targets (first the byte loads are combined into a single i64 load, and then that load is split into two i32 loads).
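For illustration only, here is a source-level sketch of the i64 case. The helper names are hypothetical and not part of the patch, and the two-load version assumes a little-endian host; it only approximates what a 32-bit target ends up with after the combined i64 load is split.
```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: assembles a little-endian i64 one byte at a time,
// i.e. the kind of "load by bytes" pattern the combine looks for.
uint64_t load_i64_by_i8(const uint8_t *p) {
  assert(p != nullptr);
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v |= static_cast<uint64_t>(p[i]) << (8 * i);
  return v;
}

// Hypothetical helper: the same value built from two 32-bit loads.
// Assumption: little-endian host, so the low word sits at the lower address.
uint64_t load_i64_as_two_i32(const uint8_t *p) {
  assert(p != nullptr);
  uint32_t lo = 0, hi = 0;
  std::memcpy(&lo, p, sizeof(lo));     // bytes 0..3
  std::memcpy(&hi, p + 4, sizeof(hi)); // bytes 4..7
  return (static_cast<uint64_t>(hi) << 32) | lo;
}
```
On a little-endian host both helpers return the same value for any 8-byte buffer, which is the equivalence that combining first and splitting later relies on.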
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4559
+ bool NeedsBswap = DAG.getDataLayout().isBigEndian() != BigEndian;
+ if (NeedsBswap && !TLI.isOperationLegal(ISD::BSWAP, VT))
+ return SDValue();
----------------
filcab wrote:
> RKSimon wrote:
> > Would this work?
> > ```
> > if (NeedsBswap && LegalOperations && !TLI.isOperationLegal(ISD::BSWAP, VT))
> > ```
> I wonder if it's useful to generate a bswap only to change it back later. Do you have an example of something llvm already does? Or would this be a future optimization possibility?
With this change we get a single load followed by an instruction sequence that performs the swap. E.g. for load_i32_by_i8_bswap from test/CodeGen/ARM/load-combine.ll we'll have:
```
ldr r0, [r0]
mov r1, #65280
mov r2, #16711680
and r1, r1, r0, lsr #8
and r2, r2, r0, lsl #8
orr r1, r1, r0, lsr #24
orr r0, r2, r0, lsl #24
orr r0, r0, r1
```
instead of
```
ldrb r2, [r0, #1]
ldrb r1, [r0]
ldrb r3, [r0, #2]
ldrb r0, [r0, #3]
lsl r2, r2, #16
orr r1, r2, r1, lsl #24
orr r1, r1, r3, lsl #8
orr r0, r1, r0
```
Assuming that shuffling bytes in a register is cheaper than loading from memory, it looks like a generally good transformation.
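A rough C++ model of the two sequences above, as a sketch rather than anything taken from the patch. The function names are hypothetical, the single-load version assumes a little-endian host, and swap_bytes_expanded mirrors the and/orr sequence from the first listing.
```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of the original pattern: four byte loads assembled
// in big-endian order (p[0] ends up in the most significant byte).
uint32_t load_i32_by_i8_swapped(const uint8_t *p) {
  assert(p != nullptr);
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}

// Byte swap written out with shifts and masks, following the expanded
// ARM sequence (masks 65280 = 0xFF00 and 16711680 = 0xFF0000).
uint32_t swap_bytes_expanded(uint32_t x) {
  uint32_t r1 = 0x0000FF00u & (x >> 8);
  uint32_t r2 = 0x00FF0000u & (x << 8);
  r1 |= x >> 24;
  return (r2 | (x << 24)) | r1;
}

// Hypothetical model of the combined form: one 32-bit load, then the swap.
// Assumption: little-endian host.
uint32_t load_i32_then_swap(const uint8_t *p) {
  assert(p != nullptr);
  uint32_t v = 0;
  std::memcpy(&v, p, sizeof(v));
  return swap_bytes_expanded(v);
}
```
Both loaders produce the same result on a little-endian host; the combined form trades three memory accesses for a few extra register operations.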
https://reviews.llvm.org/D27861