delinearization
Tobias Grosser
tobias at grosser.es
Wed Apr 9 23:46:44 PDT 2014
On 04/09/2014 09:33 PM, Sebastian Pop wrote:
> Tobias Grosser wrote:
>> On 04/08/2014 11:38 PM, Sebastian Pop wrote:
>>> Tobias Grosser wrote:
>>>> On 04/04/2014 11:06 PM, Sebastian Pop wrote:
>>>>> Hi,
>>>>>
>>>>> here is the updated version of the patch that passes all the make check-polly
>>>>> tests: yay! (I don't have cloog around, so I haven't checked those tests.)
>>>>
>>>> The cloog tests need to be updated. There are three solutions
>>>>
>>>> 1) We write a clang-cmp tool that does this semantically
>>>> 2) You install it briefly and update the tests
>>>
>>> I installed cloog and checked with it before I committed the patch.
>>> All tests are passing.
>>>
>>> However, when I enable delinearization by default, 2 cloog tests are failing:
>>> Tobi, could you please have a look at these two?
>>>
>>> test/Cloog/CodeGen/matmul_vec.ll
>>> test/Cloog/CodeGen/MemAccess/codegen_simple_md.ll
>>>
>>> These are failing with a JScop error complaining about inconsistent array access
>>> functions (or something else...)
>>
>> Yes, I looked into them. We can fix them by updating the .jscop files.
>>
>> When looking into these test cases, I have seen that you also
>> delinearize access functions to fixed-size, non-parametric arrays. This
>> is very tempting, but for test/Cloog/CodeGen/matmul_vec.ll it
>> currently does not work correctly:
>>
>> MemRef_A[i0 + 1024i2]
>>
>> is translated to
>>
>> MemRef_A[i0, 1024i2]
>>
>> What is the reason to even try to delinearize non-parametric
>> accesses? Doing so is not necessary to increase the precision of our
>> analysis, and if we get it wrong it actually has negative effects.
>> Also, I believe delinearizing such accesses is inherently hard. I
>> personally have the feeling that limiting ourselves to delinearizing
>> arrays with symbolic sizes is a safer bet.
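>>
>> For illustration (the array shapes below are made-up examples, not
>> taken from matmul_vec.ll): with fixed sizes the array bounds fold into
>> the constant coefficients, so the same flat index is compatible with
>> several different shapes:
>>
>>   double A[N][1024];    /* A[i2][i0]         -> 1024 * i2 + i0 */
>>   double B[N][512][2];  /* B[i2][i0/2][i0%2] -> 1024 * i2 + i0 */
>>
>> Nothing in the constant coefficients tells us which shape was
>> originally intended.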
>>
>>>> 3) You commit and I fix later.
>>>>
>>>> I would prefer 1), but understand that 2) is easier, and if necessary
>>>> we can also go for 3).
>>>>
>>>>> The only changes w.r.t. the previous patch are:
>>>>> - add a flag -polly-delinearize (similar to -da-delinearize)
>>>>> - normalize the last subscript with respect to the size of elements in the array.
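>>>>>
>>>>>   (That is, with element size S and the remainder R returned by the
>>>>>   delinearization, the last subscript becomes last * S + R.)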
>>>>
>>>> Nice that this fixed the remaining issues.
>>>>
>>>>> Ok to commit?
>>>>
>>>> Yes, despite the CLooG test cases this looks good.
>>>
>>> No cloog tests were failing on my side, so our build bots should be happy with
>>> my last commit.
>>>
>>> I have started triaging the bugs in the nightly test-suite when enabling
>>> -polly-delinearize. I have fixed a couple of bugs already; that just shows
>>> that all this code was never really tested... I will continue fixing
>>> everything that fails in SCEV->delinearize and on the -polly-delinearize side.
>>
>> The delinearization bug I opened is still open.
>>
>> I looked at the way we do delinearization and especially at examples
>> of non-static-size arrays. I attached the following example:
>>
>> ; void foo(long n, long m, double A[n][32]) {
>> ;   for (long i = 0; i < n; i++)
>> ;     for (long j = 0; j < m; j++)
>> ;       A[i][j] = A[i][j+i];
>> ; }
>>
>> In the inner loop, we get the following delinearization:
>>
>> Inst: %val = load double* %arrayidx.load
>> In Loop with Header: for.j
>> AddRec: {{%A,+,(33 * sizeof(double))}<%for.i>,+,sizeof(double)}<%for.j>
>> Base offset: %A
>> ArrayDecl[UnknownSize][33] with elements of sizeof(double) bytes.
>> ArrayRef[{0,+,1}<nuw><nsw><%for.i>][{0,+,1}<nuw><nsw><%for.j>]
>>
>> Inst: store double %val, double* %arrayidx.store
>> In Loop with Header: for.j
>> AddRec: {{%A,+,(32 * sizeof(double))}<%for.i>,+,sizeof(double)}<%for.j>
>> Base offset: %A
>> ArrayDecl[UnknownSize][32] with elements of sizeof(double) bytes.
>> ArrayRef[{0,+,1}<nuw><nsw><%for.i>][{0,+,1}<nuw><nsw><%for.j>]
>>
>> For the load we derive a size of 33 elements in the inner dimension,
>> for the store a size of 32 elements. This is inconsistent, and it is
>> unclear to me how to derive a common delinearization. For symbolic
>> sizes the problem should be a lot simpler.
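>>
>> To spell out where the two sizes come from (just restating the math of
>> the example above):
>>
>>   load  A[i][j+i] : flat index = 32*i + (j+i) = 33*i + j  (in elements)
>>   store A[i][j]   : flat index = 32*i + j
>>
>> so the load's outer stride is 33 * sizeof(double) and the store's is
>> 32 * sizeof(double), which is where the [33] vs. [32] in the dumps
>> above comes from.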
>>
> Like this?
Yes, this makes sense.
However, I think we may also need to adapt the delinearization itself,
though that should not hold up this patch.
Tobias
>
> Thanks,
> Sebastian
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
>
>
> 0001-only-delinearize-when-the-access-function-is-not-aff.patch
>
>
> From 1928bf48ec4649ff73e6510f0aebd8c92b10b12a Mon Sep 17 00:00:00 2001
> From: Sebastian Pop<spop at codeaurora.org>
> Date: Wed, 9 Apr 2014 14:21:18 -0500
> Subject: [PATCH] only delinearize when the access function is not affine
>
> ---
> lib/Analysis/ScopDetection.cpp | 17 ++++++----
> lib/Analysis/TempScopInfo.cpp | 30 ++++++++++---------
> .../OpenMP/nested_loop_both_parallel_parametric.ll | 21 ++++++++-----
> test/ScopInfo/constant_start_integer.ll | 4 +-
> 4 files changed, 41 insertions(+), 31 deletions(-)
>
> diff --git a/lib/Analysis/ScopDetection.cpp b/lib/Analysis/ScopDetection.cpp
> index 5f03e7a..9ce0fc7 100644
> --- a/lib/Analysis/ScopDetection.cpp
> +++ b/lib/Analysis/ScopDetection.cpp
> @@ -364,12 +364,19 @@ bool ScopDetection::isValidMemoryAccess(Instruction &Inst,
> return invalid<ReportVariantBasePtr>(Context, /*Assert=*/false, BaseValue);
>
> AccessFunction = SE->getMinusSCEV(AccessFunction, BasePointer);
> - const SCEVAddRecExpr *AF = dyn_cast<SCEVAddRecExpr>(AccessFunction);
>
> if (AllowNonAffine) {
> // Do not check whether AccessFunction is affine.
> - } else if (PollyDelinearize && AF) {
> - // Try to delinearize AccessFunction.
> + } else if (!isAffineExpr(&Context.CurRegion, AccessFunction, *SE,
> + BaseValue)) {
> + const SCEVAddRecExpr *AF = dyn_cast<SCEVAddRecExpr>(AccessFunction);
> + if (!PollyDelinearize || !AF)
> + return invalid<ReportNonAffineAccess>(Context, /*Assert=*/true,
> + AccessFunction);
> +
> + // Try to delinearize AccessFunction only when the expression is known to
> + // not be affine: as all affine functions can be represented without
> + // problems in Polly, we do not have to delinearize them.
> SmallVector<const SCEV *, 4> Subscripts, Sizes;
> AF->delinearize(*SE, Subscripts, Sizes);
> int size = Subscripts.size();
> @@ -378,10 +385,6 @@ bool ScopDetection::isValidMemoryAccess(Instruction &Inst,
> if (!isAffineExpr(&Context.CurRegion, Subscripts[i], *SE, BaseValue))
> return invalid<ReportNonAffineAccess>(Context, /*Assert=*/true,
> AccessFunction);
> - } else if (!isAffineExpr(&Context.CurRegion, AccessFunction, *SE,
> - BaseValue)) {
> - return invalid<ReportNonAffineAccess>(Context, /*Assert=*/true,
> - AccessFunction);
> }
>
> // FIXME: Alias Analysis thinks IntToPtrInst aliases with alloca instructions
> diff --git a/lib/Analysis/TempScopInfo.cpp b/lib/Analysis/TempScopInfo.cpp
> index 3b2a451..5579626 100644
> --- a/lib/Analysis/TempScopInfo.cpp
> +++ b/lib/Analysis/TempScopInfo.cpp
> @@ -169,30 +169,32 @@ IRAccess TempScopInfo::buildIRAccess(Instruction *Inst, Loop *L, Region *R) {
> AccessFunction = SE->getMinusSCEV(AccessFunction, BasePointer);
> SmallVector<const SCEV *, 4> Subscripts, Sizes;
>
> - bool IsAffine = true;
> + bool IsAffine = isAffineExpr(R, AccessFunction, *SE, BasePointer->getValue());
> const SCEVAddRecExpr *AF = dyn_cast<SCEVAddRecExpr>(AccessFunction);
>
> - if (PollyDelinearize && AF) {
> + if (!IsAffine && PollyDelinearize && AF) {
> const SCEV *Remainder = AF->delinearize(*SE, Subscripts, Sizes);
> int NSubs = Subscripts.size();
>
> - // Normalize the last dimension: integrate the size of the "scalar
> - // dimension" and the remainder of the delinearization.
> - Subscripts[NSubs - 1] =
> - SE->getMulExpr(Subscripts[NSubs - 1], Sizes[NSubs - 1]);
> - Subscripts[NSubs - 1] = SE->getAddExpr(Subscripts[NSubs - 1], Remainder);
> -
> - for (int i = 0; i < NSubs; ++i)
> - if (!isAffineExpr(R, Subscripts[i], *SE, BasePointer->getValue())) {
> - IsAffine = false;
> - break;
> - }
> + if (NSubs > 0) {
> + // Normalize the last dimension: integrate the size of the "scalar
> + // dimension" and the remainder of the delinearization.
> + Subscripts[NSubs - 1] =
> + SE->getMulExpr(Subscripts[NSubs - 1], Sizes[NSubs - 1]);
> + Subscripts[NSubs - 1] = SE->getAddExpr(Subscripts[NSubs - 1], Remainder);
> +
> + IsAffine = true;
> + for (int i = 0; i < NSubs; ++i)
> + if (!isAffineExpr(R, Subscripts[i], *SE, BasePointer->getValue())) {
> + IsAffine = false;
> + break;
> + }
> + }
> }
>
> if (Subscripts.size() == 0) {
> Subscripts.push_back(AccessFunction);
> Sizes.push_back(SE->getConstant(ZeroOffset->getType(), Size));
> - IsAffine = isAffineExpr(R, AccessFunction, *SE, BasePointer->getValue());
> }
>
> return IRAccess(Type, BasePointer->getValue(), AccessFunction, Size, IsAffine,
> diff --git a/test/Isl/Ast/OpenMP/nested_loop_both_parallel_parametric.ll b/test/Isl/Ast/OpenMP/nested_loop_both_parallel_parametric.ll
> index 2c11fc5..39b9108 100644