[PATCH] D38196: [AArch64] Avoid interleaved SIMD store instructions for Exynos

Kristof Beyls via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 24 07:49:36 PDT 2017


kristof.beyls added a comment.

In https://reviews.llvm.org/D38196#896905, @az wrote:

> > But my main question at this point is: did you benchmark this? What are the results of that? On which cores/benchmarks?
>
> As llvm is now getting better at generating interleaved memory access instructions, we are seeing a major degradation in a proprietary benchmark for Exynos because latest llvm generates st2 instructions in a hot loop. I can refer you to the llvm file lib/Target/AArch64/AArch64SchedM1.td for Exynos latencies (example: st2 takes 8 cycles, while zip1, zip2, and stp each take 1 cycle).


Thanks. I assume you confirmed there are no performance regressions at all in the test-suite?

> Note that other targets that we tested did not in general have this problem. However, I noticed through unit testing that ARM A57 had this problem with st4 only (and not with st2) even though A57 latencies in AArch64SchedA57.td do not indicate any problem and hence this pass does not affect A57 (somebody from ARM may need to double check on those latency).

I had a very brief glance and the latencies specified for ST4 in AArch64SchedA57.td. They didn't seem wrong at a glance. There's a chance you saw some other effects? I'll try to find some time to check those instruction latencies for Cortex-A57 more closely.
More importantly this potentially being an issue on cores for ST4 only and not ST2 highlights the need for AArch64SIMDInstrOpt::shouldExitEarly to not bail out early if it finds the optimization isn't useful for ST2, but instead it should also check the other optimizations (e.g. ST4) this pass can do.

Other than the above I don't think I've got other major concerns left-over.
Thanks very much for your prompt replies and actions!



================
Comment at: llvm/lib/Target/AArch64/AArch64VectorByElementOpt.cpp:1
 //=- AArch64VectorByElementOpt.cpp - AArch64 vector by element inst opt pass =//
 //
----------------
az wrote:
> kristof.beyls wrote:
> > I feel like the name of this pass/file should be changed, as the name no longer covers what it does.
> > It seems the pass now is becoming about replacing certain SIMD patterns with longer but more efficient SIMD patterns.
> > AArch64SIMDOptimizer might fit also with some of the file names of other optimizer passes already in the AArch64 target?
> I was planning to clean the naming afterward in a separate patch but I just did it now after seeing this Comment. I Changed classes names, pass name, variable name, etc. (as a result, more files with minor changes are included in the review related to changing the pass name). Note that I did not change the file name in here but will do when I commit the code. I chose AArch64SIMDInstrOpt instead of  AArch64SIMDOptimizer based on teh assumption that SIMDOptimizer may be confuse people into thinking that we may be doing some vectorization related work. SIMDInstr may convey the idea that we are doing optimization on the generated SIMD instructions. Let me know if you have other ideas about this.
Thanks, I think AArch64SIMDInstrOpt is a reasonable name for this.


================
Comment at: llvm/lib/Target/AArch64/AArch64VectorByElementOpt.cpp:74
+    Interleave,
+    unsupportedSub
+  } Subpass;
----------------
az wrote:
> kristof.beyls wrote:
> > s/unsupportedSub/UnsupportedSub/.
> > Do you need the UnsupportedSub value?
> I used to have an assert statement to look for enum value error at the top of the function (see original review). Then, I moved that error checking into the default case of the switch statement (with llvm_unreachable()) based on my understanding of one of your suggestions. That resulted in a compiler warning during the build related to the no need for the default case as all values are covered. So, I added the Unsupported in enum. In this new review, I changed the names of the extra enum value just in case it looks less confusing. While I prefer that, I am perfectly fine with removing them along with the default case and any enum related error checking.
My feel is that if the "NA/UnsupportedSub" enum value is only used to assert, it doesn't serve a practical purpose and therefore it should be removed?

Rereading my previous review comment that made you add a default case: I agree there shouldn't be a default case
(Furthermore, https://llvm.org/docs/CodingStandards.html#don-t-use-default-labels-in-fully-covered-switches-over-enumerations suggests not using "default" in switch statements when not necessary.)

Wouldn't the compiler warn when trying to set the Subpass enum to a value different from VectorElem/Interleave, and hence there isn't a need for asserts on this enum value?




https://reviews.llvm.org/D38196





More information about the llvm-commits mailing list