[PATCH] D94973: [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.

Joel E. Denny via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 25 17:38:25 PST 2021


jdenny accepted this revision.
jdenny added a comment.
This revision is now accepted and ready to land.

Other than the additional comments I've requested, LGTM.  Thanks for the explanations and the changes.



================
Comment at: clang/lib/Sema/TreeTransform.h:8321
+TreeTransform<Derived>::TransformOMPCanonicalLoop(OMPCanonicalLoop *L) {
+  // The OMPCanonicalLoop will be recreated when transforming the loop-associted
+  // directive.
----------------
Meinersbur wrote:
> jdenny wrote:
> > Meinersbur wrote:
> > > jdenny wrote:
> > > > Meinersbur wrote:
> > > > > jdenny wrote:
> > > > > > I'm used to seeing `TransformX` call `RebuildX` call `ActOnX`.  Why not do that for `X=OMPCanonicalLoop`?  Does `TransformOMPExecutableDirective` really need a special case for `OMPCanonicalLoop`?
> > > > > The intended behaviour is: 
> > > > > 
> > > > > 1. Transform the child loop
> > > > > 2. Return the child loop as representing the OMPCanonicalLoop (i.e. OMPCanonicalLoop wrapper is removed).
> > > > > 3. Parent loop-associated directive (e.g. workshare) is processed.
> > > > > 4. `ActOnOpenMPCanonicalLoop` adds new OMPCanonicalLoop wrapper for loop nest.
> > > > > 
> > > > > This guarantees maximum flexibility on what of the loop can be changed, such as
> > > > > * Change lower bound, upper bound, step
> > > > > * Convert between CXXForRangeStmt and ForStmt
> > > > > * Change the associated depth (e.g. different value for `collapse` clause)
> > > > > * Remove the directive and no OMPCanonicalLoop remain
> > > > > 
> > > > > This also saves adding many lines of code handling transforming each member of OMPCanonicalLoop separately.
> > > > > The intended behaviour is: 
> > > > > 
> > > > > 1. Transform the child loop
> > > > 
> > > > For  my suggestion, that call would remain within `TransformOMPCanonicalLoop` where it is now.
> > > > 
> > > > > 2. Return the child loop as representing the OMPCanonicalLoop (i.e. OMPCanonicalLoop wrapper is removed).
> > > > 
> > > > For my suggestion, this would not happen.  I think it's normal for a `TransformX` to return the transformed `X` not the transformed child of `X`.  If a caller wants to transform the child, then it should transform the child directly instead.
> > > > 
> > > > > 3. Parent loop-associated directive (e.g. workshare) is processed.
> > > > 
> > > > It looks to me like step 3 is currently within `TransformOMPExecutableDirective` and starts before the call to `TranformOMPCanonicalLoop` and thus before step 1.  It completes after step 4.  Or am I misunderstanding what you're describing as step 3?
> > > > 
> > > > > 4. `ActOnOpenMPCanonicalLoop` adds new OMPCanonicalLoop wrapper for loop nest.
> > > > 
> > > > For my suggestion, this would still happen.  However, instead of step 2: within `TransformOMPCanonicalLoop`, you would call `RebuildOMPCanonicalLoop`, which would call `ActOnOpenMPCanonicalLoop` as step 4.  The effect is you moved `ActOnOpenMPCanonicalLoop` from the caller (`TransformOMPExecutableDirective`) to the callee's callee, but the behavior should remain the same.
> > > > 
> > > > > This guarantees maximum flexibility on what of the loop can be changed, such as
> > > > > * Change lower bound, upper bound, step
> > > > > * Convert between CXXForRangeStmt and ForStmt
> > > > > * Change the associated depth (e.g. different value for `collapse` clause)
> > > > > * Remove the directive and no OMPCanonicalLoop remain
> > > > 
> > > > Flexibility for whom?
> > > > 
> > > > A class extending `TreeTransform`?  With my suggestion, it can override `TransformOMPCanonicalLoop` or `RebuildOMPCanonicalLoop`, depending on how much it wants to alter the transformation.
> > > > 
> > > > Or a caller of `TransformOMPCanonicalLoop` within `TreeTransform`?  I only see one right now, `TransformOMPExecutableDirective`, and I don't see how it needs the flexibility.  Are there other callers I missed?
> > > > 
> > > > Are you trying to create flexibility without requiring deriving from `TreeTransform`?  But, as far as I can tell, you're doing so at the expense of normal `TreeTransform` semantics.  Doesn't seem worth it.
> > > > 
> > > > If you see a precedent for your approach elsewhere in `TreeTransform`, please point it out.
> > > > 
> > > > > This also saves adding many lines of code handling transforming each member of OMPCanonicalLoop separately.
> > > > 
> > > > Why would you need to?  In the other `TransformX` functions I looked at, the arguments to the `RebuildX` function are transformed, and those are typically just the arguments to the `ActOnX` function.  In other words, you would just transform the loop within your `OMPCanonicalLoop` as you're doing now.
> > > I could not find where you outlined your solution, I tried to infer it from your comments. 
> > > 
> > > > I'm used to seeing TransformX call RebuildX call ActOnX. 
> > > 
> > > Introduced new `RebuildOMPCanonicalLoop` that does nothing else then calling `ActOnOMPCanonicalLoop` from the Sema object, like e.g.  `RebuildOMPExecutableDirective` does. This allows overriding `RebuildOMPCanonicalLoop` in a derived transform.
> > > 
> > > 
> > > > Does TransformOMPExecutableDirective really need a special case for OMPCanonicalLoop?
> > > 
> > > As it does for atomic, critical, section and master.
> > > 
> > > 
> > > > Flexibility for whom? A class extending TreeTransform?
> > > 
> > > Yes
> > > 
> > > 
> > > > If you see a precedent for your approach elsewhere in TreeTransform, please point it out.
> > > 
> > > The precedence is `TransformOMPExecutableDirective` it unwraps the CapturedStmt to get its body code. The following `TransformStmt` of the unwrapped Stmt never calls `TransfomCapturedStmt` since it was already unwrapped. The re-wrapping is done by the surrounding `ActOnOpenMPRegionStart`/`ActOnOpenMPRegionEnd`.
> > > 
> > > To reproduce this, I changed `TransformOMPExecutableDirective` to also unwrap the `OMPCanonicalLoop` instead of doing so implicitly when `TransformStmt` calls `TransformOMPCanonicalLoop`. As a result, `TransformOMPCanonicalLoop` (like `TransfomCapturedStmt` for OpenMP directives; maybe `TransfomCapturedStmt` is called for other purposes) should never be called and I put an assertion there instead.
> > > 
> > > Note that it is not required for a `TransformXYZ` method to exactly return the type in its name, for instance `TransformUnresolvedLookupExpr` may return whatever the template instantiation resolves to.
> > > I could not find where you outlined your solution, I tried to infer it from your comments. 
> > 
> > Sorry for the confusion.
> > 
> > I've added suggested edits to show how to tweak the patch into what I'm suggesting.  I applied the result to my checkout, and check-clang-openmp still passes.
> > 
> > I think my suggestion is more consistent with existing TreeTransform design.  If it won't work for your use cases, please help me to understand why.
> > 
> > > > I'm used to seeing TransformX call RebuildX call ActOnX. 
> > > 
> > > Introduced new `RebuildOMPCanonicalLoop` that does nothing else then calling `ActOnOMPCanonicalLoop` from the Sema object, like e.g.  `RebuildOMPExecutableDirective` does. This allows overriding `RebuildOMPCanonicalLoop` in a derived transform.
> > 
> > Thanks.  That part looks good.
> > 
> > > > Does TransformOMPExecutableDirective really need a special case for OMPCanonicalLoop?
> > > 
> > > As it does for atomic, critical, section and master.
> > 
> > With my suggestion, the special case for OMPCanonicalLoop doesn't seem necessary.
> > 
> > > > Flexibility for whom? A class extending TreeTransform?
> > > 
> > > Yes
> > 
> > After applying my suggestion, a class extending TreeTransform can override TransformOMPCanonicalLoop or RebuildOMPCanonicalLoop.  If that's not sufficient, please help me to understand the use cases that require more flexibility.
> > 
> > > > If you see a precedent for your approach elsewhere in TreeTransform, please point it out.
> > > 
> > > The precedence is `TransformOMPExecutableDirective` it unwraps the CapturedStmt to get its body code. The following `TransformStmt` of the unwrapped Stmt never calls `TransfomCapturedStmt` since it was already unwrapped. The re-wrapping is done by the surrounding `ActOnOpenMPRegionStart`/`ActOnOpenMPRegionEnd`.
> > > 
> > > To reproduce this, I changed `TransformOMPExecutableDirective` to also unwrap the `OMPCanonicalLoop` instead of doing so implicitly when `TransformStmt` calls `TransformOMPCanonicalLoop`. As a result, `TransformOMPCanonicalLoop` (like `TransfomCapturedStmt` for OpenMP directives; maybe `TransfomCapturedStmt` is called for other purposes) should never be called and I put an assertion there instead.
> > 
> > Yes, I'm assuming TransformCapturedStmt is needed for other purposes and its implementation doesn't satisfy TransformOMPExecutableDirective's needs.  In contrast, I think the TransformOMPCanonicalLoop I've suggested seems fine for TransformOMPExecutableDirective's needs, and it follows the usual behavior of any TransformXYZ.
> > 
> > > Note that it is not required for a `TransformXYZ` method to exactly return the type in its name, for instance `TransformUnresolvedLookupExpr` may return whatever the template instantiation resolves to.
> > 
> > Understood, but I'm not aware of a case where TransformXYZ returns the transformation of the child and leaves it up to the caller to finish the transformation of XYZ itself.  The type that results from that transformation is a different issue.
> Thanks for you proposed edits.
> 
> I still think think we should we should handle this like TransformOMPExecutableDirective handles CapturedStmt, because (1) OMPExecutableDirective is in control of how its children is structured, (2) TransformOMPExecutableDirective  already does this this way (3) derived TreeTransform can change more aspects of a loop/loop-associated directive without having to override TransformOMPCanonicalLoop/TransformOMPExecutableDirective (4) fewer risk of derived TreeTransform to leave the AST in an inconsistent state and (5) is will not work when the loop depth is dependent on a template.
> 
> To elaborate on (5):
> ```
> template<int n>
> void func() {
>    #pragma omp for collapse(n)
>    for (...
>        for (...
> }
> func<1>();
> func<2>();
> ```
> 
> When parsing the function body, the depth of the loop nest is unknown, hence we cannot OMPCanonicalLoops. They have to be inserted when Sema's `TemplateInstantiator` knows the value of `n`, but with you proposal, there are no OMPCanonicalLoops in the templated AST that `TransformOMPCanonicalLoops` would be called on.
> 
> I will am still willing to go with your proposal if you think this case should not be supported or that TemplateInstantiator` should override `TransformOMPExecutableDirective` to handle this case.
> 
> 
> > Understood, but I'm not aware of a case where TransformXYZ returns the transformation of the child and leaves it up to the caller to finish the transformation of XYZ itself.
> 
> This refers to an outdated version of this patch.
> I still think think we should we should handle this like TransformOMPExecutableDirective handles CapturedStmt, because (1) OMPExecutableDirective is in control of how its children is structured, (2) TransformOMPExecutableDirective  already does this this way

I agree that CapturedStmt is a precedent for the current patch's approach.  My struggle is to understand why that approach is relevant for OMPCanonicalLoop, as discussed below.

> (3) derived TreeTransform can change more aspects of a loop/loop-associated directive without having to override TransformOMPCanonicalLoop/TransformOMPExecutableDirective

If not those functions, what would the derived TreeTransform override to produce such changes?

> (4) fewer risk of derived TreeTransform to leave the AST in an inconsistent state and

I'm trying to imagine how this would happen.  I suppose there could be a transformation that removes an OMPExecutableDirective from its associated statement, on which it then calls TransformStmt, failing to notice that an OMPCanonicalLoop then needs to be removed too.  With my suggestion, TransformStmt would call TransformOMPCanonicalLoop and accidentally rebuild the OMPCanonicalLoop.  Without my suggestion, it would fail an assertion, indicating the transformation is broken.  Either way, it needs a special case for OMPCanonicalLoop.

OK, that seems like a benefit of the current patch over my suggestion.  I'm not sure whether it's worth it (TransformCapturedStmt doesn't have that benefit), but I'm fine with it.  If you prefer the current patch, could you please add a comment to TransformOMPCanonicalLoop to document why it's special?  The llvm_unreachable text alone doesn't make it clear to me.  Here's a long-winded suggestion:

"An OMPCanonicalLoop must appear only as a child of an OMPExecutableDirective that represents a loop directive.  To help maintain this invariant, TransformOMPCanonicalLoop always fails an assertion in order to force any transformation to handle OMPCanonicalLoop as a special case.  For example, the default TransformOMPExecutableDirective checks for an OMPCanonicalLoop and rebuilds it within the OMPExecutableDirective.  On the other hand, a transformation that removes an OMPExecutableDirective from its associated statement should also check for any OMPCanonicalLoop and remove it as well."

> (5) is will not work when the loop depth is dependent on a template.
> 
> To elaborate on (5):
> ```
> template<int n>
> void func() {
>    #pragma omp for collapse(n)
>    for (...
>        for (...
> }
> func<1>();
> func<2>();
> ```
> 
> When parsing the function body, the depth of the loop nest is unknown, hence we cannot OMPCanonicalLoops. They have to be inserted when Sema's `TemplateInstantiator` knows the value of `n`, but with you proposal, there are no OMPCanonicalLoops in the templated AST that `TransformOMPCanonicalLoops` would be called on.

In the current patch, OMPCanonicalLoop is created and TransformOMPExecutableDirective assumes it's present regardless of template parameters.  I guess you're saying that will change once you handle depth>1.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94973/new/

https://reviews.llvm.org/D94973



More information about the llvm-commits mailing list