[llvm-dev] Questions about code-size optimizations in ARM backend
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Tue Nov 7 12:08:37 PST 2017
On 11/7/2017 9:02 AM, Gabor Ballabas wrote:
>
> Hi All,
>
> I started to work on code-size improvements for the ARM target by comparing
> GCC- and LLVM-generated code.
> My first candidate was switch-case lowering.
> I also created a Bugzilla issue for this topic:
> https://bugs.llvm.org/show_bug.cgi?id=34902
> The full example code and the generated assembly for GCC and for LLVM
> is in the Bugzilla issue.
>
> My first idea was to simplify the following instruction pattern
>     lsl r0, r0, #2
>     ldr pc, [r0, r1]
> to this:
>     ldr pc, [r1, r0, lsl #2]
>
> but then I got really confused when I started to look into the
> machine-dependent optimization passes in the backend.
>
> I get a dump from the MachineFunctionPass with the '-print-machineinstrs'
> option, and I can see these instructions at the beginning of the passes:
>
>     %vreg2<def> = MOVsi %vreg1, 18, pred:14, pred:%noreg, opt:%noreg; GPR:%vreg2,%vreg1
>     %vreg3<def> = LEApcrelJT <jt#0>, pred:14, pred:%noreg; GPR:%vreg3
>     BR_JTm %vreg2<kill>, %vreg3<kill>, 0, <jt#0>; mem:LD4[JumpTable] GPR:%vreg2,%vreg3
>
> and these at the end
>
>     %R0<def> = MOVsi %R0<kill>, 18, pred:14, pred:%noreg, opt:%noreg
>     %R1<def> = LEApcrelJT <jt#0>, pred:14, pred:%noreg
>     BR_JTm %R0<kill>, %R1<kill>, 0, <jt#0>; mem:LD4[JumpTable]
>
"lsl r0, r0, #2" is an alias for "mov r0, r0, lsl #2", which is the
MachineInstr "MOVsi".
LEApcrelJT and BR_JTm are pseudo-instructions which correspond to "adr"
and "ldr" respectively. We use a special opcode for the jump-table
address because we have to do some extra work in ARMConstantIslands for
instructions which use constant pools. We use a special opcode for the
load so we can mark it as a branch (which matters for modeling the CFG).
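
As a side note on the dump: the operand "18" on MOVsi is the packed shifter operand, and "pred:14" is the "always" condition code. The following is a minimal sketch of how that immediate decodes. It assumes the packing matches the helpers in ARMAddressingModes.h (shift kind in the low three bits, shift amount in the bits above); the Python names are illustrative only and are not LLVM API.

```python
# Sketch of how the MOVsi shifter-operand immediate is packed.
# Assumption: this mirrors the ARM_AM helpers in ARMAddressingModes.h,
# with the shift kind in bits 0-2 and the shift amount above them.
SHIFT_KINDS = ["no_shift", "asr", "lsl", "lsr", "ror", "rrx"]

def encode_so_reg(kind, amount):
    """Pack a shift kind and an immediate shift amount into one operand."""
    return SHIFT_KINDS.index(kind) | (amount << 3)

def decode_so_reg(imm):
    """Split a packed shifter operand back into (kind, amount)."""
    return SHIFT_KINDS[imm & 7], imm >> 3

print(decode_so_reg(18))  # ('lsl', 2)
```

So "MOVsi %R0, 18" is "mov r0, r0, lsl #2", which is what the assembly printer shows as "lsl r0, r0, #2".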
> So basically I want to catch the pattern with the possible
> simplification using the shifter,
> but I'm not even sure that I am looking into this issue at the right
> optimization level.
> Maybe this idea should be implemented at a higher level, or as a fixup
> in ARMConstantIslands,
> like the Thumb jumptable optimizations mentioned in the Bugzilla issue.
>
> I hope someone more familiar with this part of the backend can give me
> some pointers about how to proceed with this idea
> ( or why it is complete rubbish in the first place :) )
>
If you just want to pull the shift into the load, you can probably get
away with just messing with instruction selection for BR_JTm. There's
actually a FIXME in ARMInstrInfo.td which is relevant ("FIXME: This
shouldn't use the generic addrmode2, but rather be split into i12 and rs
suffixed versions.")
If you want to do the fancy version where "pc" is part of the addressing
mode, you probably need to do something in ARMConstantIslands (since the
transform requires the jump table to be placed directly after the jump.)
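
To illustrate why placement matters for the pc-based form: in ARM (A32) state, an instruction that reads pc sees its own address plus 8. Below is a rough sketch of the address arithmetic for a hypothetical "ldr pc, [pc, r0, lsl #2]"; the addresses are made up for illustration, and the helper name is not from LLVM.

```python
# Address arithmetic for a hypothetical "ldr pc, [pc, r0, lsl #2]" in ARM (A32) state.
PC_READ_OFFSET = 8  # in A32 state, reading pc yields the instruction's address + 8
ENTRY_SIZE = 4      # each jump-table entry is a 32-bit address

def table_entry_address(ldr_address, index):
    """Address the load reads from for the given case index."""
    return ldr_address + PC_READ_OFFSET + index * ENTRY_SIZE

ldr_address = 0x1000  # made-up address of the ldr instruction
print(hex(table_entry_address(ldr_address, 0)))  # 0x1008
print(hex(table_entry_address(ldr_address, 3)))  # 0x1014
```

Entry 0 has to sit exactly 8 bytes past the load, which leaves room for one 4-byte instruction in between (for example, a branch to the default case). Nothing else may be placed between the jump and its table, and that layout constraint is the kind of thing ARMConstantIslands controls.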
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project