[polly] r256199 - Add option to assume single-loop scops with sufficient compute are profitable
Johannes Doerfert via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 22 03:09:48 PST 2015
In this spirit I would like to change the default heuristic to:
#loops + #blocks > 2
as that is the condition under which we can perform "by default enabled"
optimizations.
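
A minimal sketch of that condition, assuming #loops and #blocks are simply
the loop and basic-block counts of the candidate region (exactly which loops
and blocks are counted is not fixed here). The helper name is hypothetical
and not part of Polly:

```cpp
// Hypothetical helper, not part of Polly: the proposed default heuristic.
// NumLoops and NumBlocks are assumed to be the number of loops and basic
// blocks in the candidate region.
constexpr bool isProfitableByStructure(int NumLoops, int NumBlocks) {
  return NumLoops + NumBlocks > 2;
}
```

Under this sketch a single loop consisting of a single block is rejected,
while a single loop with two blocks, or two loops, is accepted.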
On 12/21, Tobias Grosser via llvm-commits wrote:
> Author: grosser
> Date: Mon Dec 21 15:00:43 2015
> New Revision: 256199
>
> URL: http://llvm.org/viewvc/llvm-project?rev=256199&view=rev
> Log:
> Add option to assume single-loop scops with sufficient compute are profitable
>
> If a loop has a sufficiently large number of compute instructions in its loop
> body, it is unlikely that our rewrite of the loop iterators introduces large
> performance changes. As Polly can also apply beneficial optimizations (such
> as parallelization) to such loop nests, we mark them as profitable.
>
> This option is currently "disabled" by default, but can be used to run
> experiments. If enabled by setting it e.g. to 40 instructions, we currently
> see some compile-time increases on LNT without any significant run-time
> changes.
>
> Added:
> polly/trunk/test/ScopDetect/profitability-large-basic-blocks.ll
> Modified:
> polly/trunk/include/polly/ScopDetection.h
> polly/trunk/lib/Analysis/ScopDetection.cpp
>
> Modified: polly/trunk/include/polly/ScopDetection.h
> URL: http://llvm.org/viewvc/llvm-project/polly/trunk/include/polly/ScopDetection.h?rev=256199&r1=256198&r2=256199&view=diff
> ==============================================================================
> --- polly/trunk/include/polly/ScopDetection.h (original)
> +++ polly/trunk/include/polly/ScopDetection.h Mon Dec 21 15:00:43 2015
> @@ -287,6 +287,20 @@ private:
> /// @return True if all blocks in R are valid, false otherwise.
> bool allBlocksValid(DetectionContext &Context) const;
>
> + /// @brief Check if a region has sufficient compute instructions
> + ///
> + /// This function checks if a region has a non-trivial number of instructions
> + /// in each loop. This can be used as an indicator if a loop is worth
> + /// optimising.
> + ///
> + /// @param Context The context of scop detection.
> + /// @param NumLoops The number of loops in the region.
> + ///
> + /// @return True if the region has sufficient compute instructions,
> + /// false otherwise.
> + bool hasSufficientCompute(DetectionContext &Context,
> + int NumLoops) const;
> +
> /// @brief Check if a region is profitable to optimize.
> ///
> /// Regions that are unlikely to expose interesting optimization opportunities
>
> Modified: polly/trunk/lib/Analysis/ScopDetection.cpp
> URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/ScopDetection.cpp?rev=256199&r1=256198&r2=256199&view=diff
> ==============================================================================
> --- polly/trunk/lib/Analysis/ScopDetection.cpp (original)
> +++ polly/trunk/lib/Analysis/ScopDetection.cpp Mon Dec 21 15:00:43 2015
> @@ -71,6 +71,16 @@ using namespace polly;
>
> #define DEBUG_TYPE "polly-detect"
>
> +// This option is set to a very high value, as analyzing such loops increases
> +// compile time in several cases. For experiments that enable this option,
> +// a value of around 40 has been working to avoid run-time regressions with
> +// Polly while still exposing interesting optimization opportunities.
> +static cl::opt<int> ProfitabilityMinPerLoopInstructions(
> + "polly-detect-profitability-min-per-loop-insts",
> + cl::desc("The minimal number of per-loop instructions before a single loop "
> + "region is considered profitable"),
> + cl::Hidden, cl::ValueRequired, cl::init(100000000), cl::cat(PollyCategory));
> +
> bool polly::PollyProcessUnprofitable;
> static cl::opt<bool, true> XPollyProcessUnprofitable(
> "polly-process-unprofitable",
> @@ -1134,6 +1144,19 @@ bool ScopDetection::allBlocksValid(Detec
> return true;
> }
>
> +bool ScopDetection::hasSufficientCompute(DetectionContext &Context,
> + int NumLoops) const {
> + int InstCount = 0;
> +
> + for (auto *BB : Context.CurRegion.blocks())
> + if (Context.CurRegion.contains(LI->getLoopFor(BB)))
> + InstCount += std::distance(BB->begin(), BB->end());
> +
> + InstCount = InstCount / NumLoops;
> +
> + return InstCount >= ProfitabilityMinPerLoopInstructions;
> +}
> +
> bool ScopDetection::isProfitableRegion(DetectionContext &Context) const {
> Region &CurRegion = Context.CurRegion;
>
> @@ -1145,13 +1168,24 @@ bool ScopDetection::isProfitableRegion(D
> if (!Context.hasStores || !Context.hasLoads)
> return invalid<ReportUnprofitable>(Context, /*Assert=*/true, &CurRegion);
>
> - // Check if there are sufficent non-overapproximated loops.
> int NumLoops = countBeneficialLoops(&CurRegion);
> int NumAffineLoops = NumLoops - Context.BoxedLoopsSet.size();
> - if (NumAffineLoops < 2)
> - return invalid<ReportUnprofitable>(Context, /*Assert=*/true, &CurRegion);
>
> - return true;
> + // Scops with at least two loops may allow either loop fusion or tiling and
> + // are consequently interesting to look at.
> + if (NumAffineLoops >= 2)
> + return true;
> +
> + // Scops that contain a loop with a non-trivial amount of computation per
> + // loop-iteration are interesting as we may be able to parallelize such
> + // loops. Individual loops that have only a small amount of computation
> + // per-iteration are performance-wise very fragile as any change to the
> + // loop induction variables may affect performance. To not cause spurious
> + // performance regressions, we do not consider such loops.
> + if (NumAffineLoops == 1 && hasSufficientCompute(Context, NumLoops))
> + return true;
> +
> + return invalid<ReportUnprofitable>(Context, /*Assert=*/true, &CurRegion);
> }
>
> bool ScopDetection::isValidRegion(DetectionContext &Context) const {
>
> Added: polly/trunk/test/ScopDetect/profitability-large-basic-blocks.ll
> URL: http://llvm.org/viewvc/llvm-project/polly/trunk/test/ScopDetect/profitability-large-basic-blocks.ll?rev=256199&view=auto
> ==============================================================================
> --- polly/trunk/test/ScopDetect/profitability-large-basic-blocks.ll (added)
> +++ polly/trunk/test/ScopDetect/profitability-large-basic-blocks.ll Mon Dec 21 15:00:43 2015
> @@ -0,0 +1,83 @@
> +; RUN: opt %loadPolly -polly-process-unprofitable=false \
> +; RUN: -polly-detect-profitability-min-per-loop-insts=40 \
> +; RUN: -polly-detect -analyze < %s | FileCheck %s -check-prefix=PROFITABLE
> +
> +; RUN: opt %loadPolly -polly-process-unprofitable=true \
> +; RUN: -polly-detect -analyze < %s | FileCheck %s -check-prefix=PROFITABLE
> +
> +; RUN: opt %loadPolly -polly-process-unprofitable=false \
> +; RUN: \
> +; RUN: -polly-detect -analyze < %s | FileCheck %s -check-prefix=UNPROFITABLE
> +
> +; UNPROFITABLE-NOT: Valid Region for Scop:
> +; PROFITABLE: Valid Region for Scop:
> +
> +; void foo(float *A, float *B, long N) {
> +; for (long i = 0; i < 100; i++)
> +; A[i] += .... /* a lot of compute */
> +; }
> +;
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> +
> +define void @foo(float* %A, float* %B, i64 %N) {
> +entry:
> + br label %header
> +
> +header:
> + %i.0 = phi i64 [ 0, %entry ], [ %tmp10, %header ]
> + %tmp5 = sitofp i64 %i.0 to float
> + %tmp6 = getelementptr inbounds float, float* %A, i64 %i.0
> + %tmp7 = load float, float* %tmp6, align 4
> + %tmp8 = fadd float %tmp7, %tmp5
> + %val0 = fadd float %tmp7, 1.0
> + %val1 = fadd float %val0, 1.0
> + %val2 = fadd float %val1, 1.0
> + %val3 = fadd float %val2, 1.0
> + %val4 = fadd float %val3, 1.0
> + %val5 = fadd float %val4, 1.0
> + %val6 = fadd float %val5, 1.0
> + %val7 = fadd float %val6, 1.0
> + %val8 = fadd float %val7, 1.0
> + %val9 = fadd float %val8, 1.0
> + %val10 = fadd float %val9, 1.0
> + %val11 = fadd float %val10, 1.0
> + %val12 = fadd float %val11, 1.0
> + %val13 = fadd float %val12, 1.0
> + %val14 = fadd float %val13, 1.0
> + %val15 = fadd float %val14, 1.0
> + %val16 = fadd float %val15, 1.0
> + %val17 = fadd float %val16, 1.0
> + %val18 = fadd float %val17, 1.0
> + %val19 = fadd float %val18, 1.0
> + %val20 = fadd float %val19, 1.0
> + %val21 = fadd float %val20, 1.0
> + %val22 = fadd float %val21, 1.0
> + %val23 = fadd float %val22, 1.0
> + %val24 = fadd float %val23, 1.0
> + %val25 = fadd float %val24, 1.0
> + %val26 = fadd float %val25, 1.0
> + %val27 = fadd float %val26, 1.0
> + %val28 = fadd float %val27, 1.0
> + %val29 = fadd float %val28, 1.0
> + %val30 = fadd float %val29, 1.0
> + %val31 = fadd float %val30, 1.0
> + %val32 = fadd float %val31, 1.0
> + %val33 = fadd float %val32, 1.0
> + %val34 = fadd float %val33, 1.0
> + %val35 = fadd float %val34, 1.0
> + %val36 = fadd float %val35, 1.0
> + %val37 = fadd float %val36, 1.0
> + %val38 = fadd float %val37, 1.0
> + %val39 = fadd float %val38, 1.0
> + %val40 = fadd float %val39, 1.0
> + %val41 = fadd float %val40, 1.0
> + %val42 = fadd float %val41, 1.0
> + %val43 = fadd float %val42, 1.0
> + store float %val34, float* %tmp6, align 4
> + %exitcond = icmp ne i64 %i.0, 100
> + %tmp10 = add nsw i64 %i.0, 1
> + br i1 %exitcond, label %header, label %exit
> +
> +exit:
> + ret void
> +}
>
>
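
To make the per-loop averaging in hasSufficientCompute concrete, here is a
standalone sketch that mirrors its arithmetic without any LLVM types.
BlockInfo and hasSufficientComputeSketch are hypothetical names introduced
only for illustration, and the guard against NumLoops <= 0 is an addition of
the sketch that is not in the patch:

```cpp
#include <cstddef>

// Hypothetical stand-in for a basic block: its instruction count and whether
// it lies in a loop that is contained in the region.
struct BlockInfo {
  int NumInsts;
  bool InLoopOfRegion;
};

// Sum the instructions of all blocks that sit inside a loop of the region,
// divide by the number of loops (integer division, as in the patch), and
// compare the result against the threshold.
constexpr bool hasSufficientComputeSketch(const BlockInfo *Blocks,
                                          std::size_t NumBlocks, int NumLoops,
                                          int MinPerLoopInsts) {
  if (NumLoops <= 0) // Guard added by this sketch only.
    return false;

  int InstCount = 0;
  for (std::size_t I = 0; I < NumBlocks; ++I)
    if (Blocks[I].InLoopOfRegion)
      InstCount += Blocks[I].NumInsts;

  return InstCount / NumLoops >= MinPerLoopInsts;
}
```

The header block of the test case above contains 53 instructions (PHI and
terminator included), so with one loop a threshold of 40 is met, whereas the
default of 100000000 is not.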
--
Johannes Doerfert
Researcher / PhD Student
Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31
Tel. +49 (0)681 302-57521 : doerfert at cs.uni-saarland.de
Fax. +49 (0)681 302-3065 : http://www.cdl.uni-saarland.de/people/doerfert