[PATCH] D38196: [AArch64] Avoid interleaved SIMD store instructions for Exynos

Abderrazek Zaafrani via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 30 17:26:36 PDT 2017


az added a comment.

> Hmmmm.... this seems to be a tough one to find the right tradeoffs....
>  My personal opinions/observations:
> 
> 1. Making ShouldExitEarly stop the whole pass even if some of the optimizations would trigger is a hack that makes this pass operate in a non-obvious way and that will bite us in the future.
> 2. I guess there's a chance that even more peephole-y patterns will be added to this pass in the future, which would make ShouldExitEarly even more expensive if it has to check all the implemented rewriting rules. But also, if that function took shortcuts and heuristics, the pass would operate in more surprising and non-obvious ways.
> 
>   The above made me wonder if, to keep compile time under control, it wouldn't be better to integrate this into one of the existing frameworks for peepholes rather than keep it in a pass of its own. Then I looked at https://reviews.llvm.org/D21571, where this pass was introduced, and saw that such approaches were probably tried and rejected in an early version of that review. Right now, my best guess at the easiest way forward is to make ShouldExitEarly compute the profitability of all implemented rewrite rules and cache the results of that analysis for each subtarget it encounters. That way, the cost of this computation should be amortized over all the functions the pass handles, so it should not have too much impact on compile time. Having said that, I don't know of another pass that has a caching mechanism like this, but I also haven't looked for such an example, and I'm not sure whether there's going to be a gotcha in trying to cache those results. What do you think?
> 
>   Thanks,
> 
>   Kristof

I uploaded a new version of the code that checks all rewriting rules in shouldExitEarly(). I kept the old version, renamed it ShouldExitEarlyShort(), and added one rule to it representing ST4, just in case we go with that approach in the end (one of the two will be removed during the final review). You are right that more patterns will be added to this pass in the future (I have not handled ST3 or the post-increment stores yet), so the list of rules will at least double.
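Just to make sure we mean the same thing by "all rewriting rules", here is a rough sketch of the shape of the check (not the actual pass code; the Subtarget and RewriteRule types, the rule names, and the profitability lambdas are simplified stand-ins, not real LLVM APIs): the pass skips the function scan only when none of the implemented rules is profitable on the current subtarget.

  // Hypothetical, simplified model of the "exit early" profitability check.
  // The types and names below are illustrative stand-ins, not LLVM classes.
  #include <functional>
  #include <vector>

  struct Subtarget { /* a handle to the scheduling model would live here */ };

  // Each rewrite rule compares the cost of the original interleaved store
  // against the cost of its replacement sequence on the given subtarget.
  struct RewriteRule {
    const char *Name;
    std::function<bool(const Subtarget &)> IsProfitable;
  };

  // Every rule implemented by the pass (ST2/ST4 today; ST3 and the
  // post-increment forms would be appended here later).
  static const std::vector<RewriteRule> &allRules() {
    static const std::vector<RewriteRule> Rules = {
        {"ST2 rewrite", [](const Subtarget &) { /* compare latencies */ return false; }},
        {"ST4 rewrite", [](const Subtarget &) { /* compare latencies */ return false; }},
    };
    return Rules;
  }

  // Exit early only if no rule pays off on this subtarget, so a profitable
  // rewrite is never skipped by accident.
  static bool shouldExitEarly(const Subtarget &ST) {
    for (const RewriteRule &R : allRules())
      if (R.IsProfitable(ST))
        return false;
    return true;
  }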

I have not looked into the caching mechanism you are suggesting yet. Is it acceptable, by LLVM coding standards, to have such a global data structure shared across function units? It would help tremendously, although I suspect that people whose subtarget has no need for this pass will still mind paying for all these rewriting rules, even if they are only evaluated once.
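For the caching part, something along these lines is what I understand you are proposing (again only a sketch; keying the cache on a CPU/feature string and the function names here are made up for illustration, not existing code):

  // Hypothetical per-subtarget memoization of the early-exit decision, so the
  // full rule scan is paid once per subtarget rather than once per function.
  #include <string>
  #include <unordered_map>

  bool computeShouldExitEarly(const std::string &SubtargetKey) {
    // Placeholder body: the real check would walk every rewrite rule for the
    // subtarget identified by SubtargetKey, as in the sketch above.
    (void)SubtargetKey;
    return false;
  }

  bool shouldExitEarlyCached(const std::string &SubtargetKey) {
    // This map persists across all functions the pass visits; it is the kind
    // of pass-global data structure my question above is about.
    static std::unordered_map<std::string, bool> Cache;
    auto It = Cache.find(SubtargetKey);
    if (It != Cache.end())
      return It->second;
    bool ExitEarly = computeShouldExitEarly(SubtargetKey);
    Cache.emplace(SubtargetKey, ExitEarly);
    return ExitEarly;
  }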


https://reviews.llvm.org/D38196




