[PATCH] [RFC] Move to schedule trees to store our schedule

Tobias Grosser tobias at grosser.es
Thu Apr 23 08:15:22 PDT 2015


Hi sebpop, simbuerg, jdoerfert, zinob,

Instead of modeling the schedule (execution order) of a piece of code as a flat
multi-dimensional mapping, we now retain the inherent tree structure of the
schedule. This makes the schedule a little bit more verbose, but the exposed
structure makes it both easier to work with and also faster to process.

Specifically, we can now easily walk the schedule tree to decide where to tile
loops, apply register tiling at the inner level and also perform full/partial
tile separation a lot easier.

This patch does not yet exploit these capabilities, but is mostly a 1-to-1
translation from flat schedules to schedule trees.

For more information see:

impact.gforge.inria.fr/impact2014/papers/impact2014-verdoolaege.pdf

XXX: This patch is not yet finished, but just a proof of concept.

Open Issues
-----------

  - The dependence analysis for reductions does not yet work. Specifically,
    this patch does not yet support the computation of reduction dependences.
    The current approach uses flat schedules extensively. It might make sense
    to first change the way we model reductions (as may-writes as suggested by
    Sven) and only then add this patch.

Performance measurements
------------------------
                                 0           1         2                 3
                               Before      Patch   Patch+64bit-int    NoPolly
gemm:                          121ms        92ms      74ms             46ms
3mm:                           600ms       340ms     203ms             60ms
NAS Parallel Benchmark (bt):      6m+      1.2s      627ms             80ms

The performance compares

  0) Polly trunk
  1) Polly + this patch
  2) Polly + this patch + 64bit integers
  3) clang -O3 without Polly

The NAS benchmark seems to still have a performance problem. Adding/deriving
run-time assumptions seems to take a significant amount of time, making ScopInfo
with about 75% the most costly pass, whereas it commonly does not show up high
on the profile.

Example
=======

Source Code
-----------

     for (long i = 0; i < 100; i++) {
S1:    A[i] += 1;
       for (long j = 0; j < 100; j++)
S2:      A[i+j] += 1;
S3:    A[i] += 1;
    }

Flat Schedule
-------------

{
  Stmt_S1[i0]     -> [i0, 0, 0]: i0 >= 0 and i0 <= 99;
  Stmt_S2[i0, i1] -> [i0, 1, i1]: i0 >= 0 and i0 <= 99 and i1 >= 0 and i1 <= 99
  Stmt_S3[i0]     -> [i0, 2, 0]: i0 >= 0 and i0 <= 99
}

Schedule Tree
-------------

domain: "{ Stmt_S1[i0] : i0 >= 0 and i0 <= 99;
           Stmt_S2[i0, i1] : i0 >= 0 and i0 <= 99 and i1 >= 0 and i1 <= 99;
           Stmt_S3[i0] : i0 >= 0 and i0 <= 99 }"
child:
  schedule: "[{ Stmt_S1[i0] -> [(i0)];
                Stmt_S2[i0, i1] -> [(i0)];
                Stmt_S3[i0] -> [(i0)];
              }]"
  child:
    sequence:
    - filter: "{ Stmt_S1[i0] }"
    - filter: "{ Stmt_S2[i0, i1] }"
      child:
        schedule: "[{ Stmt_S2[i0, i1] -> [(i1)] }]"
    - filter: "{ Stmt_S3[i0] }"

http://reviews.llvm.org/D9226

Files:
  include/polly/ScopInfo.h
  lib/Analysis/DependenceInfo.cpp
  lib/Analysis/ScopInfo.cpp
  lib/CodeGen/IslAst.cpp
  lib/CodeGen/IslCodeGeneration.cpp
  lib/Exchange/JSONExporter.cpp
  lib/Transform/ScheduleOptimizer.cpp
  test/DeadCodeElimination/chained_iterations.ll
  test/DeadCodeElimination/chained_iterations_2.ll
  test/DeadCodeElimination/computeout.ll
  test/DeadCodeElimination/dead_iteration_elimination.ll
  test/DeadCodeElimination/non-affine-affine-mix.ll
  test/DeadCodeElimination/null_schedule.ll
  test/Isl/Ast/reduction_modulo_schedule_multiple_dimensions_4.ll
  test/Isl/Ast/simple-run-time-condition.ll
  test/Isl/CodeGen/loop_with_condition.ll
  test/Isl/CodeGen/loop_with_condition_2.ll
  test/Isl/CodeGen/loop_with_condition_ineq.ll
  test/Isl/CodeGen/loop_with_condition_nested.ll
  test/Isl/CodeGen/sequential_loops.ll
  test/ScheduleOptimizer/computeout.ll
  test/ScheduleOptimizer/line-tiling.ll
  test/ScheduleOptimizer/one-dimensional-band.ll
  test/ScheduleOptimizer/prevectorization.ll
  test/ScheduleOptimizer/rectangular-tiling.ll
  test/ScopInfo/NonAffine/non-affine-loop-condition-dependent-access_2.ll
  test/ScopInfo/NonAffine/non_affine_loop_used_later.ll
  test/ScopInfo/pointer-type-expressions.ll

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D9226.24306.patch
Type: text/x-patch
Size: 48869 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150423/0a106e26/attachment.bin>


More information about the llvm-commits mailing list