[PATCH] [ARM] Teach the ARM Load Store Optimizer to collapse ldr/str's to ldrd/strd's
Ranjeet Singh
ranjeet.singh at arm.com
Wed Apr 29 08:32:44 PDT 2015
Hi John and Renato,
Thanks for the review comments.
> I don't know much about the ARM Load/Store Optimizer but from looking at it there's already some machinery for generating LDRD/STRD
The machinery you're referring to is part of the ARM pre-register-allocation pass, which its own comment describes as: "Pre- register allocation pass that move load / stores from consecutive locations close to make it more likely they will be combined later."
I haven't looked at this pass in detail, but it does have a method, 'CanFormLdStDWord', which I could reuse.
> Why is this a separate LoadStoreToDoubleOpti function instead of being integrated into LoadStoreMultipleOpti?
I thought the code would be cleaner if this was done in a function separate from LoadStoreMultipleOpti.
The grouping algorithm I use to find ldr/str's to pair together (the one AArch64 uses) is different from the one
LoadStoreMultipleOpti uses. But I guess I could plug my code into the if(TryMerge) { ... } region and,
instead of collapsing all the loads/stores into an ldm/stm for V7M, iterate through the list two at a time.
If you prefer it to be part of LoadStoreMultipleOpti then I can rework the patch to make it so.
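To make the "two at a time" idea concrete, here is a minimal standalone sketch. It is not code from the patch or from LLVM: MemOp, Grouping and pairUp are hypothetical names, the queue is assumed to be already sorted by offset with a common base register, and register, alignment and offset-range constraints are ignored.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one queued load/store (not an LLVM type).
struct MemOp {
  unsigned Reg; // transfer register
  int Offset;   // byte offset from the shared base register
};

struct Grouping {
  std::vector<std::pair<MemOp, MemOp>> Pairs; // candidates for ldrd/strd
  std::vector<MemOp> Singles;                 // left as plain ldr/str
};

// Walk a queue sorted by ascending offset, taking two entries at a time.
// Two entries form a pair only if they are exactly one word (4 bytes) apart.
Grouping pairUp(const std::vector<MemOp> &Ops) {
  Grouping G;
  std::size_t I = 0;
  while (I < Ops.size()) {
    if (I + 1 < Ops.size() && Ops[I + 1].Offset == Ops[I].Offset + 4) {
      G.Pairs.push_back({Ops[I], Ops[I + 1]});
      I += 2; // both entries consumed by the pair
    } else {
      G.Singles.push_back(Ops[I]);
      I += 1; // no partner; keep as a single access
    }
  }
  return G;
}
```

With five consecutive word accesses this yields two pairs and one leftover single, which is where it differs from collapsing the whole run into one ldm/stm.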
> Also: when optimizing for size we would want to use LDM if it means less bytes worth of instructions.
Ok.
> Can you be more specific about what you adapted?
The AArch64 Load/Store optimizer collapses pairs of ldr/str instructions to ldp/stp instructions.
Its algorithm for finding pairs of ldr/str instructions and its merging step
are what I needed to get the ARM Load/Store Optimizer to collapse ldr/str instructions into ldrd/strd instructions. So what
I took from the AArch64 backend is the following:
- AArch64LoadStoreOpt::optimizeBlock - renamed to ARMLoadStoreOpt::LoadStoreToDoubleOpti
  - The main loop is almost the same, except that it looks for Thumb2 ldr/str instructions to collapse.
- AArch64LoadStoreOpt::findMatchingInsn - now ARMLoadStoreOpt::findMatchingInsn
  - Most of it is the same, except that I've removed the unscaled offset checking. I also added a check for "Cortex-M3 errata 602117".
- AArch64LoadStoreOpt::mergePairedInsns - now ARMLoadStoreOpt::mergePairedInsns
  - Almost the same, but I removed everything to do with sign extension that was in the AArch64 Load/Store optimizer.
Ideally I wouldn't have had to copy code over from the AArch64 backend, but there isn't a shared code directory for the AArch64 and ARM backends.
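For readers who haven't seen the AArch64 code, the forward scan for a pairing candidate can be sketched roughly as below. This is a simplified standalone model, not the code in the patch: Inst and findMatchingLoad are hypothetical, only register writes are tracked, only loads are handled, and the errata check and offset-range checks are not modelled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of an instruction (not an LLVM MachineInstr).
struct Inst {
  bool IsLoad;   // true for "ldr Rt, [Base, #Offset]"
  unsigned Rt;   // register written by the instruction
  unsigned Base; // base register (only meaningful for loads)
  int Offset;    // byte offset (only meaningful for loads)
};

// Scan forward from Insts[First], looking at most Limit instructions ahead,
// for a load from the same base at an adjacent word offset.
// Returns the index of the match, or -1 if there is none.
int findMatchingLoad(const std::vector<Inst> &Insts, std::size_t First,
                     std::size_t Limit) {
  const Inst &A = Insts[First];
  for (std::size_t J = First + 1; J < Insts.size() && J - First <= Limit; ++J) {
    const Inst &B = Insts[J];
    bool Adjacent = B.Offset == A.Offset + 4 || B.Offset == A.Offset - 4;
    // A candidate must use the same base, be one word away, and must not
    // write the first load's register or the base register itself.
    if (B.IsLoad && B.Base == A.Base && Adjacent && B.Rt != A.Rt &&
        B.Rt != A.Base)
      return static_cast<int>(J);
    // Any other instruction that writes the base or the first load's
    // register makes it unsafe to keep searching.
    if (B.Rt == A.Base || B.Rt == A.Rt)
      return -1;
  }
  return -1;
}
```

The search limit keeps the scan cheap, and giving up at the first clobber of the base register is the conservative choice in this model.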
> Keeping the same comment in the same places is not always the correct thing to do
I've kept the same comments in places where I couldn't have described the behaviour of the code any better myself.
http://reviews.llvm.org/D9298