XCore target: disable vectorization
Arnold Schwaighofer
aschwaighofer at apple.com
Tue Sep 10 08:22:44 PDT 2013
On Sep 10, 2013, at 9:25 AM, Robert Lytton <robert at xmos.com> wrote:
> (subject moved to llvm-commits from cfe-commits http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130909/088308.html)
>
> Hi Arnold,
>
> I have looked into getting the xcore target to incorrectly run the slp vectorizer so that I can then 'fix it' but have had no success. Any thoughts?
Did you try one of the examples in e.g. test/Transforms/SLPVectorizer/X86/ like simplebb.ll. You would substitute the triple and data layout. If it still doesn’t slp-vectorize try -mllvm -debug-only=SLP. The following works for me:
$ cat > simplebb.ll
; RUN: opt < %s -basicaa -slp-vectorizer -dce -S -mtriple=xcore | FileCheck %s
target datalayout =
"e-p:32:32:32-a0:0:32-n32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f16:16:32-f32:32:32-f64:32:32"
target triple = "xcore"
; Simple 3-pair chain with loads and stores
; CHECK: test1
; CHECK: store <2 x double>
; CHECK: ret
define void @test1(double* %a, double* %b, double* %c) {
entry:
%i0 = load double* %a, align 8
%i1 = load double* %b, align 8
%mul = fmul double %i0, %i1
%arrayidx3 = getelementptr inbounds double* %a, i64 1
%i3 = load double* %arrayidx3, align 8
%arrayidx4 = getelementptr inbounds double* %b, i64 1
%i4 = load double* %arrayidx4, align 8
%mul5 = fmul double %i3, %i4
store double %mul, double* %c, align 8
%arrayidx5 = getelementptr inbounds double* %c, i64 1
store double %mul5, double* %arrayidx5, align 8
ret void
}
$ opt -basicaa -slp-vectorizer -dce -S -mtriple=xcore < simplebb.ll -debug-only=SLP
…
SLP: Decided to vectorize cost=-6
SLP: Extracting 0 values .
SLP: Erasing scalar: store double %mul, double* %c, align 8.
SLP: Erasing scalar: store double %mul5, double* %arrayidx5, align 8.
SLP: Erasing scalar: %mul = fmul double %i0, %i1.
SLP: Erasing scalar: %mul5 = fmul double %i3, %i4.
SLP: Erasing scalar: %i0 = load double* %a, align 8.
SLP: Erasing scalar: %i3 = load double* %arrayidx3, align 8.
SLP: Erasing scalar: %i1 = load double* %b, align 8.
SLP: Erasing scalar: %i4 = load double* %arrayidx4, align 8.
SLP: Optimizing 0 gather sequences instructions.
SLP: vectorized "test1"
; ModuleID = '<stdin>'
target datalayout = "e-p:32:32:32-a0:0:32-n32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f16:16:32-f32:32:32-f64:32:32"
target triple = "xcore"
define void @test1(double* %a, double* %b, double* %c) {
entry:
%0 = bitcast double* %a to <2 x double>*
%1 = load <2 x double>* %0, align 8
%2 = bitcast double* %b to <2 x double>*
%3 = load <2 x double>* %2, align 8
%4 = fmul <2 x double> %1, %3
%5 = bitcast double* %c to <2 x double>*
store <2 x double> %4, <2 x double>* %5, align 8
ret void
}
>
> Attached is the suggested patch to lib/Transforms/Vectorize/LoopVectorize.cpp ( & test/Transforms/LoopVectorize/xcore/no-vector-registers.ll)
>
> However, I also need to commit additions to the xcore target viz: lib/Target/XCore/XCoreTargetTransformInfo.cpp et al.
> 1) should I make a xcore target commit first & separately?
I think your loop vectorizer commit is fine since you are adding XCoreTargetTransformInfo specifically for this.
> 2) what tests should I be considering for the XCoreTargetTransformInfo tests?
We have test/Analysis/CostModel to test costs of instructions, but they make only sense for you if you would be vectorizing.
> (I will further enharnce XCoreTargetTransformInfo in later commits)
>
> Sorry for all the basic questions - still finding my way.
np.
> Thank you
>
> Robert
> ________________________________________
> From: Arnold Schwaighofer [aschwaighofer at apple.com]
> Sent: 09 September 2013 19:19
> To: Robert Lytton
> Cc: Rafael Espíndola; Nadav Rotem; cfe-commits at cs.uiuc.edu
> Subject: Re: XCore target: disable vectorization
>
> On Sep 9, 2013, at 12:49 PM, Robert Lytton <robert at xmos.com> wrote:
>
>> Hi Arnold,
>>
>> In my mind there seems to be two changes needed.
>
> No, with the TTI change you only need an llvm change - no clang change. clang will add the slp/loop-vectorizer but once they execute and ask the target whether they should vectorize (“TTI->getNumberOfRegisters(true)”) these passes immediately give up.
>
>>
>> 1) llvm to check the number of vector registers - as per the direction of this thread.
>> Will I also need something similar in BBVectorize? (see below).
>> I need to create some tests for the changes - not so straight forward!
>
>
> You would create a test in "test/Transforms/LoopVectorize/XCore” and “test/Transforms/SLPVectorize/XCore” with “opt -mtriple=XCore- ….” and make sure that you don’t vectorize code you normally would.
>
>> When done, I'll post to llvm-commit and cc folk for comment.
>>
>> 2) clang not to set the '-vectorize-loops' & '-vectorize-slp' flags
>> Thus the original patch (with corrected function name) would seem reasonable.
>> If others agree, I will resubmit the changes.
>>
> You would not need this change.
>
>> Robert
>>
>> --- a/lib/Transforms/Vectorize/BBVectorize.cpp
>> +++ b/lib/Transforms/Vectorize/BBVectorize.cpp
>> @@ -397,6 +397,12 @@ namespace {
>> DEBUG(if (TTI) dbgs() << "BBV: using target information\n");
>>
>> bool changed = false;
>> +
>> + // If the target claims to have no vector registers don't attempt
>> + // vectorization.
>> + if (TTI && !TTI->getNumberOfRegisters(true))
>> + return false;
>> +
>> // Iterate a sufficient number of times to merge types of size 1 bit,
>
> <PatchVectorRegisters>
More information about the llvm-commits
mailing list