[PATCH] [ARM] Teach the ARM Load Store Optimizer to collapse ldr/str's to ldrd/strd's

Ranjeet Singh ranjeet.singh at arm.com
Wed Apr 29 08:32:44 PDT 2015


Hi John and Renato,

Thanks for the review comments.

> I don't know much about the ARM Load/Store Optimizer but from looking at it there's already some machinery for generating LDRD/STRD


The other machinery you're referring to is part of the ARM pre-register-allocation pass, which its source comment describes as a "Pre- register allocation pass that move load / stores from consecutive locations close to make it more likely they will be combined later."
I haven't looked at this pass in much detail, but I do see the method 'CanFormLdStDWord', which I could use.

> Why is this a separate LoadStoreToDoubleOpti function instead of being integrated into LoadStoreMultipleOpti?


I thought the code would be cleaner if this was done in a function separate from LoadStoreMultipleOpti.
The grouping algorithm I use (the one AArch64 uses) for finding ldr/str's to pair together is different from what
LoadStoreMultipleOpti uses. But I guess I can plug my code into the if(TryMerge) { ... } region and,
instead of collapsing all the loads/stores into an ldm/stm for V7M, iterate through the list two at a time.
If you prefer it to be part of LoadStoreMultipleOpti then I can rework the patch to make it so.
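
As a rough illustration of "iterate through the list two at a time", here is a minimal standalone sketch. It is not code from the patch or from LLVM: MemOp and pairTwoAtATime are hypothetical names, and the sketch assumes the queued accesses are word-sized, share one base register, and are already sorted by offset.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one queued ldr/str; not an LLVM type.
struct MemOp {
  unsigned Reg; // register being loaded or stored
  int Offset;   // byte offset from the shared base register
};

// Walk a run of word-sized accesses that is sorted by offset and pair up
// neighbours that are exactly 4 bytes apart. Returns index pairs into Ops;
// anything left unpaired would stay a plain ldr/str.
std::vector<std::pair<std::size_t, std::size_t>>
pairTwoAtATime(const std::vector<MemOp> &Ops) {
  std::vector<std::pair<std::size_t, std::size_t>> Pairs;
  std::size_t I = 0;
  while (I + 1 < Ops.size()) {
    assert(Ops[I].Offset <= Ops[I + 1].Offset && "run must be sorted by offset");
    if (Ops[I + 1].Offset - Ops[I].Offset == 4) {
      Pairs.emplace_back(I, I + 1);
      I += 2; // both ops consumed by one ldrd/strd candidate
    } else {
      I += 1; // leave Ops[I] alone and try the next neighbour
    }
  }
  return Pairs;
}
```

For offsets 0, 4, 8, 16, 20 this pairs the first two and the last two accesses and leaves the one at offset 8 as a single ldr/str. A real implementation would also have to check the register and encoding constraints of ldrd/strd before merging.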

> Also: when optimizing for size we would want to use LDM if it means less bytes worth of instructions.


Ok.

> Can you be more specific about what you adapted?


The AArch64 Load/Store optimizer collapses pairs of ldr/str instructions into ldp/stp instructions.
The algorithm it uses to find pairs of ldr/str instructions, together with its merging step,
is what I needed to get the ARM Load/Store Optimizer to collapse ldr/str instructions into ldrd/strd instructions. So what
I took from the AArch64 backend is the following:

- AArch64LoadStoreOpt::optimizeBlock - renamed to ARMLoadStoreOpt::LoadStoreToDoubleOpti
  - The main loop is almost the same, except that it looks for Thumb2 ldr/str instructions to collapse.
- AArch64LoadStoreOpt::findMatchingInsn - copied as ARMLoadStoreOpt::findMatchingInsn
  - Most of it is the same, except that I've removed the unscaled offset checking. I also added a check for "Cortex-M3 errata 602117".
- AArch64LoadStoreOpt::mergePairedInsns - copied as ARMLoadStoreOpt::mergePairedInsns
  - Almost the same, but with anything to do with sign extension in the AArch64 Load/Store optimizer removed.
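
The forward scan for a partner instruction can be sketched roughly as follows. This is a simplified standalone model and not the patch itself: Insn and findMatch are hypothetical names, the register tracking is far more conservative than what the real findMatchingInsn does, and the erratum check assumes that 602117 concerns an LDRD whose base register is also one of its destination registers. The offset range assumes the Thumb2 ldrd/strd immediate form (a multiple of 4 between -1020 and 1020).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified instruction record; not LLVM's MachineInstr.
struct Insn {
  enum Kind { Load, Store, Other };
  Kind K;
  unsigned Reg;  // Rt for a load/store; the register written for Other
  unsigned Base; // base register (ignored for Other)
  int Offset;    // byte offset from Base (ignored for Other)
};

// Starting at Insts[First], look at up to Limit following instructions for a
// second access of the same kind, on the same base, 4 bytes away.
// Returns the index of the partner, or -1 if no safe partner is found.
int findMatch(const std::vector<Insn> &Insts, std::size_t First,
              std::size_t Limit) {
  assert(First < Insts.size() && "First must index into Insts");
  const Insn &A = Insts[First];
  if (A.K == Insn::Other)
    return -1;
  for (std::size_t J = First + 1; J < Insts.size() && J <= First + Limit; ++J) {
    const Insn &B = Insts[J];
    if (B.K == Insn::Other) {
      // Conservative: give up if the base or the first Rt is overwritten.
      if (B.Reg == A.Base || B.Reg == A.Reg)
        return -1;
      continue;
    }
    bool Adjacent = B.K == A.K && B.Base == A.Base &&
                    std::abs(B.Offset - A.Offset) == 4;
    if (!Adjacent)
      return -1; // unrelated memory access: it may alias, so stop scanning
    int Low = A.Offset < B.Offset ? A.Offset : B.Offset;
    if (Low % 4 != 0 || Low < -1020 || Low > 1020)
      return -1; // assumed Thumb2 ldrd/strd immediate range
    // Loads: the two destinations must differ, and (assumed erratum 602117
    // condition) the base must not be one of the destinations.
    if (A.K == Insn::Load &&
        (A.Reg == B.Reg || A.Reg == A.Base || B.Reg == A.Base))
      return -1;
    return static_cast<int>(J);
  }
  return -1;
}
```

In this model, two loads from [r4, #0] and [r4, #4] into r0 and r1 are accepted as a pair, while the same pair with r4 as one of the destinations is rejected.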

Ideally I wouldn't have had to copy code over from the AArch64 backend, but there isn't a shared code directory for the AArch64 and ARM backends.

> Keeping the same comment in the same places is not always the correct thing to do


I've kept the same comments in places where I couldn't have described the behaviour of the code any better myself.


http://reviews.llvm.org/D9298
