[PATCH] D125202: [Polly] Disable matmul pattern-match + -polly-parallel

Mon May 9 19:11:02 PDT 2022

Meinersbur requested changes to this revision.
Meinersbur added a comment.
This revision now requires changes to proceed.

This fixes the problem:

  diff --git a/polly/lib/Transform/ScheduleOptimizer.cpp b/polly/lib/Transform/ScheduleOptimizer.cpp
  index fad62d9b20830..f2cf06af2f7c8 100644
  --- a/polly/lib/Transform/ScheduleOptimizer.cpp
  +++ b/polly/lib/Transform/ScheduleOptimizer.cpp
  @@ -978,7 +978,7 @@ runIslScheduleOptimizerUsingNPM(Scop &S, ScopAnalysisManager &SAM,
     OptimizationRemarkEmitter ORE(&S.getFunction());
     TargetTransformInfo *TTI = &SAR.TTI;
     isl::schedule LastSchedule;
  -  bool Modified = runIslScheduleOptimizer(S, GetDeps, TTI, &ORE, LastSchedule);
  +  runIslScheduleOptimizer(S, GetDeps, TTI, &ORE, LastSchedule);
     if (OS) {
       *OS << "Printing analysis 'Polly - Optimize schedule of SCoP' for region: '"
           << S.getName() << "' in function '" << S.getFunction().getName()
  @@ -986,13 +986,11 @@ runIslScheduleOptimizerUsingNPM(Scop &S, ScopAnalysisManager &SAM,
       runScheduleOptimizerPrinter(*OS, LastSchedule);
     }

  -  if (!Modified)
  -    return PreservedAnalyses::all();
  -
     PreservedAnalyses PA;
     PA.preserveSet<AllAnalysesOn<Module>>();
     PA.preserveSet<AllAnalysesOn<Function>>();
     PA.preserveSet<AllAnalysesOn<Loop>>();
  +  PA.abandon<DependenceAnalysis>();
     return PA;
   }

The optimization adds new statements without updating (or invalidating) DependenceInfo. Unfortunately, this means that DependenceAnalysis will now run a second time for AstInfo. Also consider adding a comment explaining why DependenceAnalysis is abandoned, both here and in the legacy pass manager. Maybe you can find a way to invalidate it only when the matmul optimization has been applied.
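
As a rough illustration of invalidating only when the matmul optimization fired, here is a minimal, self-contained sketch. It does not use the real LLVM or Polly classes: `ToyPreservedAnalyses`, `ToyOptimizerResult`, `MatMulApplied` and `computePreserved` are hypothetical names made up for this example, and how the optimizer would actually report that the matmul pattern matched is left open.

```cpp
#include <set>
#include <string>

// Toy stand-in for llvm::PreservedAnalyses (hypothetical, not the LLVM class).
// Every analysis counts as preserved unless it was explicitly abandoned.
struct ToyPreservedAnalyses {
  std::set<std::string> Abandoned;

  void abandon(const std::string &Name) { Abandoned.insert(Name); }

  bool isPreserved(const std::string &Name) const {
    return Abandoned.count(Name) == 0;
  }
};

// Hypothetical summary of what the schedule optimizer did. The real
// runIslScheduleOptimizer only returns a single bool in the patch above.
struct ToyOptimizerResult {
  bool ScheduleChanged; // any schedule transformation was applied
  bool MatMulApplied;   // the matmul pattern matched and added statements
};

// Abandon the dependence analysis only when new statements were added,
// so a plain reschedule keeps the cached dependences.
ToyPreservedAnalyses computePreserved(const ToyOptimizerResult &R) {
  ToyPreservedAnalyses PA;
  if (R.MatMulApplied)
    PA.abandon("DependenceAnalysis");
  return PA;
}
```

With something along these lines, DependenceAnalysis would only be recomputed for AstInfo in the matmul case; whether a plain reschedule can safely keep it is the same assumption the removed `if (!Modified)` early return already made.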

We should also discuss:

  diff --git a/polly/lib/Transform/MatmulOptimizer.cpp b/polly/lib/Transform/MatmulOptimizer.cpp
  index 60dd9eda3c2c0..c46025522bc23 100644
  --- a/polly/lib/Transform/MatmulOptimizer.cpp
  +++ b/polly/lib/Transform/MatmulOptimizer.cpp
  @@ -491,9 +491,6 @@ createMacroKernel(isl::schedule_node Node,
     Node = permuteBandNodeDimensions(Node, DimOutNum - 2, DimOutNum - 1);
     Node = permuteBandNodeDimensions(Node, DimOutNum - 3, DimOutNum - 1);

  -  // Mark the outermost loop as parallelizable.
  -  Node = Node.as<isl::schedule_node_band>().member_set_coincident(0, true);
  -
     return Node.child(0).child(0);
   }

I added this in aa8a976174c7ac08676bbc7bb647f6bc0efd2e72. I think it does not actually make anything parallel, and I am not sure it is even allowed, since `Packed_A` is shared between all threads.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125202