[PATCH] D19885: [AArch64] Decouple zero store promotion from narrow ld merge. NFC.

Tue May 3 13:14:02 PDT 2016

junbuml added a comment.

In our internal tests, we found performance regressions with the narrow load merge in some cases. Initially, this optimization was driven by a +3% performance gain in spec2006/h264ref, which has a load-intensive hot loop. However, the gain I was targeting in h264ref is now completely covered by the SLP vectorizer.

As this optimization converts two loads into one load plus two shift instructions, it trades one memory operation for two extra ALU operations, so it could hurt performance if a loop is arithmetic-intensive.
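
To make the trade-off concrete, here is a rough source-level sketch of the transformation. It is only an illustration, not the pass itself: the real merge happens on machine instructions in the AArch64 load/store optimizer, and the function names below (load_separate, load_merged, check_equivalence) are hypothetical. The sketch assumes a little-endian target and two adjacent 16-bit loads.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Before the merge: two narrow (16-bit) loads from adjacent addresses.
std::pair<uint16_t, uint16_t> load_separate(const unsigned char *p) {
    uint16_t lo, hi;
    std::memcpy(&lo, p, sizeof lo);      // first halfword load
    std::memcpy(&hi, p + 2, sizeof hi);  // second halfword load
    return {lo, hi};
}

// After the merge: one 32-bit load, then two extra ALU operations to
// split the word back into the two halfwords.
// Assumption: little-endian, so the lower address maps to the low bits.
std::pair<uint16_t, uint16_t> load_merged(const unsigned char *p) {
    uint32_t w;
    std::memcpy(&w, p, sizeof w);                      // single wide load
    uint16_t lo = static_cast<uint16_t>(w & 0xFFFFu);  // extract low half
    uint16_t hi = static_cast<uint16_t>(w >> 16);      // extract high half
    return {lo, hi};
}

// Hypothetical helper: both forms must yield the same pair of values
// (on a little-endian host).
void check_equivalence(const unsigned char *p) {
    assert(load_separate(p) == load_merged(p));
}
```

The merged form saves a load but adds two dependent ALU operations, which is the cost that can show up in loops already limited by arithmetic throughput.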

With this change, I want to let other people run performance tests with and without the narrow load merge. If there is no objection, I would like to disable the narrow load merge by default in a separate patch.

http://reviews.llvm.org/D19885