[llvm-dev] Loop Distribution pass
Gerolf Hoflehner via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 10 23:26:45 PST 2018
> On Sep 20, 2018, at 9:11 AM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi Adam,
> On 2018-09-19 19:26, Adam Nemet wrote:
>>> On Sep 13, 2018, at 1:21 AM, Jonas Paulsson <paulsson at linux.vnet.ibm.com> wrote:
>>> With the help of the optimization remarks I found a loop that could not be vectorized as written, but could be once loop distribution was enabled. Enabling it did in fact vectorize the loop, with a very significant benchmark improvement (~25%).
>>> I tried (on SystemZ) to enable this pass, and found that it only affected a handful of files on SPEC. This means I could enable this without worrying about any regressions on SystemZ at least currently.
>>> I wonder if there is something more to know about this. It seems that no other target has enabled this because of generally mixed results; is that right? Is this triggering much more on other targets, and if so, why?
>> The main thing that is missing from the pass right now is a serious analysis of profitability as it affects instruction- and memory-level parallelism (ILP and MLP). The easiest way to see this is that LD is the reverse transformation of loop fusion (LF), so where LF helps, LD may regress. MLP is the big one in my opinion; losing it could totally reverse any gains from vectorization.
>> We would probably have to do something similar to the SW prefetch insertion pass in order to analyze accesses that are likely to be skipped by the HW prefetcher. Needless to say, this is a very micro-architecture-specific analysis/cost model. If we can establish that ILP/MLP is unaffected, even in the simplest cases, and vectorization is enabled, we could enable the transformation by default (in addition to the pragma-driven approach we have now).
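For readers following along, here is a minimal sketch in C of the kind of loop being discussed. The function and array names are made up for illustration, and the arrays are assumed not to overlap (a compiler would have to prove that or guard it with a runtime check). In the first version the recurrence on a[] keeps the whole body from being vectorized. In the second the loop is split by hand so that the independent multiply sits in its own loop, which is a vectorization candidate.

```c
#include <stddef.h>

/* Hypothetical example, not taken from any benchmark mentioned in this
 * thread. Assumes a, b, c, d and e do not overlap. */

/* One loop: the a[i + 1] = a[i] + ... recurrence is a loop-carried
 * dependence, so the body as a whole cannot be vectorized. */
void loop_fused(int *a, const int *b, int *c,
                const int *d, const int *e, size_t n)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        a[i + 1] = a[i] + b[i];  /* depends on the previous iteration */
        c[i] = d[i] * e[i];      /* independent across iterations */
    }
}

/* Same computation after distributing the loop by hand. */
void loop_distributed(int *a, const int *b, int *c,
                      const int *d, const int *e, size_t n)
{
    /* Sequential part: still carries the recurrence. */
    for (size_t i = 0; i + 1 < n; ++i)
        a[i + 1] = a[i] + b[i];

    /* Independent part: no cross-iteration dependence remains. */
    for (size_t i = 0; i + 1 < n; ++i)
        c[i] = d[i] * e[i];
}
```

The cost side is visible here too: the single loop walks the b, d and e streams in one pass, while the split version walks them in two separate passes, so fewer independent memory accesses are in flight at any one time. Whether that matters depends on the micro-architecture.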
> Thanks for the reply.
> Since this is today extremely conservative and nearly never triggers, at least on SystemZ, while still being very beneficial when it does happen, it seems that this could be used as-is now on SystemZ with a new TTI hook to enable it selectively per target.
> The question now is whether this is a wise idea. Do you expect the Loop Distribution pass to change significantly so that it triggers much more often, which might then cause regressions on SystemZ? If that is the case, perhaps the idea now is that nobody enables it by default until some reasonable initial cost modeling has been done?
It seems this hasn’t been answered; apologies for jumping in late if it has. It is true that the major reason LD is not on by default is that it lacks the cost modeling needed to prevent regressions. This would hold for any hardware platform, I think.
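Until such a cost model exists, the pragma-driven approach mentioned above is the way to opt in for a single loop. A minimal sketch, assuming clang's `#pragma clang loop distribute(enable)` loop hint; the kernel and its parameter names are hypothetical. The guard only keeps other compilers from warning about an unknown pragma.

```c
#include <stddef.h>

/* Hypothetical kernel, for illustration only. Assumes the arrays do not
 * overlap. The pragma asks clang to consider distributing this one loop;
 * it is a request, and the pass may still decline to transform it. */
void prefix_and_scale(int *acc, const int *step,
                      int *out, const int *in, size_t n)
{
#if defined(__clang__)
#pragma clang loop distribute(enable)
#endif
    for (size_t i = 0; i + 1 < n; ++i) {
        acc[i + 1] = acc[i] + step[i];  /* loop-carried recurrence */
        out[i] = in[i] * 2;             /* independent of other iterations */
    }
}
```

For whole-program experiments of the kind Jonas describes, the hidden option `-mllvm -enable-loop-distribute` should turn the pass on globally, as far as I know, and the optimization remarks will show which loops were actually distributed.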