XCore target: disable vectorization
Arnold Schwaighofer
aschwaighofer at apple.com
Fri Sep 13 09:51:23 PDT 2013
Yep. LGTM.
Thanks.
On Sep 13, 2013, at 11:47 AM, Robert Lytton <robert at xmos.com> wrote:
> Is this patch good to go?
> Robert
> ________________________________________
> From: llvm-commits-bounces at cs.uiuc.edu [llvm-commits-bounces at cs.uiuc.edu] on behalf of Robert Lytton [robert at xmos.com]
> Sent: 10 September 2013 18:06
> To: Arnold Schwaighofer
> Cc: llvm-commits at cs.uiuc.edu
> Subject: RE: XCore target: disable vectorization
>
> Thank you.
> I missed out the target datalayout earlier in my efforts... and did not retry all - opps!
> Now added changes to SLPVectorizer too.
> No changes needed for BBVectorizer - as long as bb-vectorize does not ignore target-info. (test added).
>
> Patch attached for review:
> Prevent LoopVectorizer and SLPVectorizer running if the target has no vector registers.
> XCore target: Add XCoreTargetTransformInfo
> This is where getNumberOfRegisters() resides, which in turn returns the
> number of vector registers (=0).
>
>
>
>> We have test/Analysis/CostModel to test costs of instructions, but they make only sense for you if you would be vectorizing.
> That is something that I need to look at.
>
> Robert
>
> ________________________________________
> From: Arnold Schwaighofer [aschwaighofer at apple.com]
> Sent: 10 September 2013 16:22
> To: Robert Lytton
> Cc: llvm-commits at cs.uiuc.edu; Nadav Rotem; Richard Osborne
> Subject: Re: XCore target: disable vectorization
>
> On Sep 10, 2013, at 9:25 AM, Robert Lytton <robert at xmos.com> wrote:
>
>> (subject moved to llvm-commits from cfe-commits http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130909/088308.html)
>>
>> Hi Arnold,
>>
>> I have looked into getting the xcore target to incorrectly run the slp vectorizer so that I can then 'fix it' but have had no success. Any thoughts?
>
> Did you try one of the examples in e.g. test/Transforms/SLPVectorizer/X86/ like simplebb.ll. You would substitute the triple and data layout. If it still doesn’t slp-vectorize try -mllvm -debug-only=SLP. The following works for me:
>
> $ cat > simplebb.ll
> ; RUN: opt < %s -basicaa -slp-vectorizer -dce -S -mtriple=xcore | FileCheck %s
>
> target datalayout =
> "e-p:32:32:32-a0:0:32-n32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f16:16:32-f32:32:32-f64:32:32"
> target triple = "xcore"
>
> ; Simple 3-pair chain with loads and stores
> ; CHECK: test1
> ; CHECK: store <2 x double>
> ; CHECK: ret
> define void @test1(double* %a, double* %b, double* %c) {
> entry:
> %i0 = load double* %a, align 8
> %i1 = load double* %b, align 8
> %mul = fmul double %i0, %i1
> %arrayidx3 = getelementptr inbounds double* %a, i64 1
> %i3 = load double* %arrayidx3, align 8
> %arrayidx4 = getelementptr inbounds double* %b, i64 1
> %i4 = load double* %arrayidx4, align 8
> %mul5 = fmul double %i3, %i4
> store double %mul, double* %c, align 8
> %arrayidx5 = getelementptr inbounds double* %c, i64 1
> store double %mul5, double* %arrayidx5, align 8
> ret void
> }
>
> $ opt -basicaa -slp-vectorizer -dce -S -mtriple=xcore < simplebb.ll -debug-only=SLP
> …
> SLP: Decided to vectorize cost=-6
> SLP: Extracting 0 values .
> SLP: Erasing scalar: store double %mul, double* %c, align 8.
> SLP: Erasing scalar: store double %mul5, double* %arrayidx5, align 8.
> SLP: Erasing scalar: %mul = fmul double %i0, %i1.
> SLP: Erasing scalar: %mul5 = fmul double %i3, %i4.
> SLP: Erasing scalar: %i0 = load double* %a, align 8.
> SLP: Erasing scalar: %i3 = load double* %arrayidx3, align 8.
> SLP: Erasing scalar: %i1 = load double* %b, align 8.
> SLP: Erasing scalar: %i4 = load double* %arrayidx4, align 8.
> SLP: Optimizing 0 gather sequences instructions.
> SLP: vectorized "test1"
> ; ModuleID = '<stdin>'
> target datalayout = "e-p:32:32:32-a0:0:32-n32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f16:16:32-f32:32:32-f64:32:32"
> target triple = "xcore"
>
> define void @test1(double* %a, double* %b, double* %c) {
> entry:
> %0 = bitcast double* %a to <2 x double>*
> %1 = load <2 x double>* %0, align 8
> %2 = bitcast double* %b to <2 x double>*
> %3 = load <2 x double>* %2, align 8
> %4 = fmul <2 x double> %1, %3
> %5 = bitcast double* %c to <2 x double>*
> store <2 x double> %4, <2 x double>* %5, align 8
> ret void
> }
>>
>> Attached is the suggested patch to lib/Transforms/Vectorize/LoopVectorize.cpp ( & test/Transforms/LoopVectorize/xcore/no-vector-registers.ll)
>>
>> However, I also need to commit additions to the xcore target viz: lib/Target/XCore/XCoreTargetTransformInfo.cpp et al.
>> 1) should I make a xcore target commit first & separately?
>
> I think your loop vectorizer commit is fine since you are adding XCoreTargetTransformInfo specifically for this.
>
>> 2) what tests should I be considering for the XCoreTargetTransformInfo tests?
>
> We have test/Analysis/CostModel to test costs of instructions, but they make only sense for you if you would be vectorizing.
>
>
>> (I will further enharnce XCoreTargetTransformInfo in later commits)
>>
>> Sorry for all the basic questions - still finding my way.
>
> np.
>
>> Thank you
>>
>> Robert
>> ________________________________________
>> From: Arnold Schwaighofer [aschwaighofer at apple.com]
>> Sent: 09 September 2013 19:19
>> To: Robert Lytton
>> Cc: Rafael Espíndola; Nadav Rotem; cfe-commits at cs.uiuc.edu
>> Subject: Re: XCore target: disable vectorization
>>
>> On Sep 9, 2013, at 12:49 PM, Robert Lytton <robert at xmos.com> wrote:
>>
>>> Hi Arnold,
>>>
>>> In my mind there seems to be two changes needed.
>>
>> No, with the TTI change you only need an llvm change - no clang change. clang will add the slp/loop-vectorizer but once they execute and ask the target whether they should vectorize (“TTI->getNumberOfRegisters(true)”) these passes immediately give up.
>>
>>>
>>> 1) llvm to check the number of vector registers - as per the direction of this thread.
>>> Will I also need something similar in BBVectorize? (see below).
>>> I need to create some tests for the changes - not so straight forward!
>>
>>
>> You would create a test in "test/Transforms/LoopVectorize/XCore” and “test/Transforms/SLPVectorize/XCore” with “opt -mtriple=XCore- ….” and make sure that you don’t vectorize code you normally would.
>>
>>> When done, I'll post to llvm-commit and cc folk for comment.
>>>
>>> 2) clang not to set the '-vectorize-loops' & '-vectorize-slp' flags
>>> Thus the original patch (with corrected function name) would seem reasonable.
>>> If others agree, I will resubmit the changes.
>>>
>> You would not need this change.
>>
>>> Robert
>>>
>>> --- a/lib/Transforms/Vectorize/BBVectorize.cpp
>>> +++ b/lib/Transforms/Vectorize/BBVectorize.cpp
>>> @@ -397,6 +397,12 @@ namespace {
>>> DEBUG(if (TTI) dbgs() << "BBV: using target information\n");
>>>
>>> bool changed = false;
>>> +
>>> + // If the target claims to have no vector registers don't attempt
>>> + // vectorization.
>>> + if (TTI && !TTI->getNumberOfRegisters(true))
>>> + return false;
>>> +
>>> // Iterate a sufficient number of times to merge types of size 1 bit,
>>
>> <PatchVectorRegisters>
>
More information about the llvm-commits
mailing list