[llvm-dev] [Proposal][RFC] Epilog loop vectorization
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 14 09:49:37 PDT 2017
On 03/14/2017 11:21 AM, Adam Nemet wrote:
>
>> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com
>> <mailto:Ashutosh.Nema at amd.com>> wrote:
>>
>> Summarizing the discussion on the implementation approaches.
>> Discussed about two approaches, first running ‘InnerLoopVectorizer’
>> again on the epilog loop immediately after vectorizing the original
>> loop within the same vectorization pass, the second approach where
>> re-running vectorization pass and limiting vectorization factor of
>> epilog loop by metadata.
>> <Approach-2>
>> Challenges with re-running the vectorizer pass:
>> 1)Reusing alias check result:
>> When vectorizer pass runs again it finds the epilog loop as a new
>> loop and it may generates alias check, this new alias check may
>> overkill the gains of epilog vectorization.
>> We should use the already computed alias check result instead of re
>> computing again.
>> 2)Rerun the vectorizer and hoist the new alias check:
>> It’s not possible to hoist alias checks as its not fully redundant
>> (not dominated by other checks), it’s not getting execute in all paths.
>
>
> I don’t understand. Looks like you have the same alias checks for the
> epilog loop too. How is this CFG different from the re-vectorization
> of the scalar loop?
You're looking at the wrong thing. This *is* the image from
re-vectorization. The other image (linked below in step (3)) shows the
other option.
> Would be good to have both CFGs here and highlighting the difference.
>
> I thought that the whole point was that *if* you reached the epilog
> vector loop via the first vector loop, you want to bypass the alias
> checks before the epilog vector.
Yes, but, that's not quite true now. You can also reach the epilogue
loop if you fail the min-trip-count check, and so you don't know
anything about the aliasing checks.
-Hal
>
> I still don’t understand why that’s not possible with some
> sophisticated predicate propagation independent from the vectorizer.
> I am not saying it’s already possible but it should be.
>
> Adam
>
>
>> NOTE: We cannot prepone alias check as its expensive compared to
>> other checks.
>> <Approach-1>
>> 1)Current patch depends on the existing functionality of
>> LoopVectorizer, it uses ‘InnerLoopVectorizer’ again to vectorize the
>> epilog loop, as it happens in the same vectorization pass we have
>> flexibility to reuse already computed alias result check & limit
>> vectorization factor for the epilog loop.
>> 2)It does not generate the blocks for new block layout explicitly,
>> rather it depends on ‘InnerLoopVectorizer::createEmptyLoop’ to
>> generate new block layout. The new block layout get automatically
>> generated by calling the ‘InnerLoopVectorizer:: vectorize’ again.
>> 3)Block layout description with epilog loop vectorization is available at
>> https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png
>> Approach-1 looks feasible, please comment if any objections.
>> Regards,
>> Ashutosh
>> *From:*Nema, Ashutosh
>> *Sent:*Wednesday, March 1, 2017 10:42 AM
>> *To:*'Daniel Berlin' <dberlin at dberlin.org <mailto:dberlin at dberlin.org>>
>> *Cc:*anemet at apple.com <mailto:anemet at apple.com>; Hal Finkel
>> <hfinkel at anl.gov <mailto:hfinkel at anl.gov>>; Zaks, Ayal
>> <ayal.zaks at intel.com <mailto:ayal.zaks at intel.com>>; Renato Golin
>> <renato.golin at linaro.org
>> <mailto:renato.golin at linaro.org>>;mkuper at google.com
>> <mailto:mkuper at google.com>; Mehdi Amini <mehdi.amini at apple.com
>> <mailto:mehdi.amini at apple.com>>; llvm-dev <llvm-dev at lists.llvm.org
>> <mailto:llvm-dev at lists.llvm.org>>
>> *Subject:*RE: [llvm-dev] [Proposal][RFC] Epilog loop vectorization
>> Sorry I misunderstood, gvn/newgvn/gvnhoist cannot help here as these
>> checks are not dominated by all paths.
>> Regards,
>> Ashutosh
>> *From:*Daniel Berlin [mailto:dberlin at dberlin.org]
>> *Sent:*Tuesday, February 28, 2017 6:58 PM
>> *To:*Nema, Ashutosh <Ashutosh.Nema at amd.com
>> <mailto:Ashutosh.Nema at amd.com>>
>> *Cc:*anemet at apple.com <mailto:anemet at apple.com>; Hal Finkel
>> <hfinkel at anl.gov <mailto:hfinkel at anl.gov>>; Zaks, Ayal
>> <ayal.zaks at intel.com <mailto:ayal.zaks at intel.com>>; Renato Golin
>> <renato.golin at linaro.org
>> <mailto:renato.golin at linaro.org>>;mkuper at google.com
>> <mailto:mkuper at google.com>; Mehdi Amini <mehdi.amini at apple.com
>> <mailto:mehdi.amini at apple.com>>; llvm-dev <llvm-dev at lists.llvm.org
>> <mailto:llvm-dev at lists.llvm.org>>
>> *Subject:*Re: [llvm-dev] [Proposal][RFC] Epilog loop vectorization
>> Hoisting or removing?
>> Neither pass does hoisting, you'd need gvnhoist for that.
>> Even then:
>> Staring at your example, none of the checks are fully redundant (IE
>> they are not dominated by other checks) or execute on all paths.
>> Thus, hoisting them would be purely speculative code motion, which
>> none of our passes do.
>> If you would like these sets of checks to be removed, you would need
>> to place them in a place that they execute unconditionally.
>> Otherwise, this is not a standard code hoisting/removal transform.
>> The only redundancy i can see here at all is the repeated
>> getelementptr computation.
>> If you move it to the preheader, it will be eliminated.
>> Otherwise, none of the checks are redundant.
>>
>> What would you hope to happen in this case?
>> On Tue, Feb 28, 2017 at 5:09 AM, Nema, Ashutosh
>> <Ashutosh.Nema at amd.com <mailto:Ashutosh.Nema at amd.com>> wrote:
>>
>> I have tried running both gvn and newgvn but it did not helped in
>> hoisting the alias checks:
>> Please check, maybe I have missed something.
>> <TestCase>
>> void foo (char *A, char *B, char *C, int len) {
>> int i = 0;
>> for (i=0 ; i< len; i++)
>> A[i] = B[i] + C[i];
>> }
>> <Command>
>> $ opt –O3 –gvn test.ll –o test.opt.ll
>> $ opt –O3 –newgvn test.ll –o test.opt.ll
>> “test.ll” is attached, it got already vectorized by the approach
>> running vectorizer twice by annotate the remainder loop with
>> metadata to limit the vectorization factor for epilog vector loop.
>> Regards,
>> Ashutosh
>> *From:*anemet at apple.com
>> <mailto:anemet at apple.com>[mailto:anemet at apple.com
>> <mailto:anemet at apple.com>]
>> *Sent:*Tuesday, February 28, 2017 1:33 AM
>> *To:*Hal Finkel <hfinkel at anl.gov <mailto:hfinkel at anl.gov>>
>> *Cc:*Daniel Berlin <dberlin at dberlin.org
>> <mailto:dberlin at dberlin.org>>; Nema, Ashutosh
>> <Ashutosh.Nema at amd.com <mailto:Ashutosh.Nema at amd.com>>; Zaks,
>> Ayal <ayal.zaks at intel.com <mailto:ayal.zaks at intel.com>>; Renato
>> Golin <renato.golin at linaro.org
>> <mailto:renato.golin at linaro.org>>;mkuper at google.com
>> <mailto:mkuper at google.com>; Mehdi Amini <mehdi.amini at apple.com
>> <mailto:mehdi.amini at apple.com>>; llvm-dev
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
>> *Subject:*Re: [llvm-dev] [Proposal][RFC] Epilog loop vectorization
>>
>> On Feb 27, 2017, at 12:01 PM, Hal Finkel <hfinkel at anl.gov
>> <mailto:hfinkel at anl.gov>> wrote:
>> On 02/27/2017 01:47 PM, Daniel Berlin wrote:
>>
>> On Mon, Feb 27, 2017 at 11:29 AM, Adam Nemet
>> <anemet at apple.com <mailto:anemet at apple.com>> wrote:
>>
>> On Feb 27, 2017, at 10:11 AM, Hal Finkel
>> <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote:
>> On 02/27/2017 11:47 AM, Adam Nemet wrote:
>>
>> On Feb 27, 2017, at 9:39 AM, Daniel
>> Berlin <dberlin at dberlin.org
>> <mailto:dberlin at dberlin.org>> wrote:
>> On Mon, Feb 27, 2017 at 9:29 AM, Adam
>> Nemet <anemet at apple.com
>> <mailto:anemet at apple.com>> wrote:
>>
>> On Feb 27, 2017, at 7:27 AM, Hal
>> Finkel <hfinkel at anl.gov
>> <mailto:hfinkel at anl.gov>> wrote:
>>
>> On 02/27/2017 06:29 AM, Nema,
>> Ashutosh wrote:
>>
>> Thanks for looking into this.
>> 1) Issues with re running
>> vectorizer:
>> Vectorizer might generate
>> redundant alias checks while
>> vectorizing epilog loop.
>> Redundant alias checks are
>> expensive, we like to reuse
>> the results of already
>> computed alias checks.
>> With metadata we can limit
>> the width of epilog loop, but
>> not sure about reusing alias
>> check result.
>> Any thoughts on rerunning
>> vectorizer with reusing the
>> alias check result ?
>>
>>
>> One way of looking at this is:
>> Reusing the alias-check result is
>> really just a conditional
>> propagation problem; if we don't
>> already have an optimization that
>> can combine these after the fact,
>> then we should.
>>
>> +Danny
>> Isn’t Extended SSA supposed to help
>> with this?
>>
>> Yes, it will solve this with no issue
>> already. GVN probably does already too.
>> even if if you have
>> if (a == b)
>> if (a == c)
>> if (a == d)
>> if (a == e)
>> if (a == g)
>> and we can prove a ... g equivalent,
>> newgvn will eliminate them all and set
>> all the branches true.
>> If you need a simpler clean up pass, we
>> could run it on sub-graphs.
>>
>> Yes we probably don’t want to run a full GVN
>> after the “loop-scheduling” passes.
>>
>>
>> FWIW, we could, just without the
>> memory-dependence analysis enabled (i.e. set the
>> NoLoads constructor parameter to true). GVN is
>> pretty fast in that mode.
>>
>> OK. Another data point is that I’ve seen cases in the
>> past where the alias checks required for the loop
>> passes could enable GVN to remove redundant
>> loads/stores. Currently we can only pick these up
>> with LTO when GVN is rerun.
>>
>> This is just GVN brokenness, newgvn should not have this
>> problem.
>> If it does, i'd love to see it.
>>
>>
>> I thought that the problem is that we just don't run GVN
>> after that point in the pipeline.
>>
>> Yeah, that is the problem but I think Danny misunderstood what I
>> was trying to say.
>> This was a datapoint to possibly rerun GVN with memory-awareness.
>>
>>
>> -Hal
>>
>> (I'm working on the last few parts of turning it on by
>> default, but it requires a new getModRefInfo interface to
>> be able to get the last few testcases)
>>
>> --
>>
>> Hal Finkel
>>
>> Lead, Compiler Technology and Programming Languages
>>
>> Leadership Computing Facility
>>
>> Argonne National Laboratory
>>
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170314/b6c4e0a2/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 18444 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170314/b6c4e0a2/attachment-0001.jpe>
More information about the llvm-dev
mailing list