[PATCH] D51780: ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 12 10:41:28 PDT 2018
dmgreen accepted this revision.
dmgreen added a comment.
This revision is now accepted and ready to land.
> The benchmarks came back as about a 0.2% difference in cycle count, and (crucially) there's no way when deciding function alignment to check for OptSize so we'd inevitably pessimize some cases.
I think this is what getPrefAlignment is for, as opposed to MinFunctionAlignment? I agree that that's a different issue though, and it doesn't need to be done as part of this patch.
Although I personally think the definition of Os is a bit odd, and this does increase code size, everyone else I've talked to agreed with you that this is fine. As you said, in (almost) all cases we get a performance win for the bytes we spend.
So LGTM!
Repository:
rL LLVM
https://reviews.llvm.org/D51780