[PATCH] D38196: [AArch64] Avoid interleaved SIMD store instructions for Exynos

Wed Oct 25 08:15:32 PDT 2017

kristof.beyls added a comment.

Just a quick remark/question before I go off-line for the rest of today:

In https://reviews.llvm.org/D38196#906627, @az wrote:

> > More importantly this potentially being an issue on cores for ST4 only and not ST2 highlights the need for AArch64SIMDInstrOpt::shouldExitEarly to not bail out early if it finds the optimization isn't useful for ST2, but instead it should also check the other optimizations (e.g. ST4) this pass can do.
>
> Agree but when the original pass is written, there is a concern from another target without the problem that we should a have a minimal check before really entering the pass. So, I am doing the smallest check that shows the problem. I have no problem doing both checks or doing the more expensive ST4 check only so that it can trigger the pass for ARM and Exynos. In case you are getting back to us soon about the ARM latency, then let's wait until then and we can then decide what to have in early exit code.

I thought the "shouldExitEarly" method aims to check if any of the transformations the pass can do can be profitable for the target at all.
My understanding is that this can be checked by just examining the instruction latencies, not having to look at the actual code being compiled at all. I assume that the saving of compile time comes from the fact that "shouldExitEarly" needs to do an amount of work proportional to the number of rewriting rules implemented, not proportional to the amount of code being compiled.
So, for the pass to work as expected, i.e. always run when for the target at least one of the transformations is profitable, I think shouldExitEarly should check all rewriting rules implemented.

Does that make sense, or am I misunderstanding something here?

https://reviews.llvm.org/D38196