[llvm-branch-commits] [llvm] [Frontend][OpenMP] Refactor getLeafConstructs, add getCompoundConstruct (PR #87247)

Tue Apr 2 05:34:10 PDT 2024

https://github.com/skatrak commented:

I think this looks reasonable if we want to both split a construct into leaf constructs and merge multiple constructs into a single compound one, if it exists. I'm just wondering whether we need that much flexibility (and the complexity associated with it).

It looks to me like there are mainly two things that we want to be able to do:
- Split any compound directive into a list of leaves, so that clauses can be appropriately assigned to each.
- Split combined directives while preserving composite constructs as a unit, useful during lowering.

This approach works well for the first case. The second case can be made to work by first handling composite and standalone constructs, and then splitting combined directives into the first leaf and the rest, which we would combine back into a compound construct. That "rest of the construct" would then be processed the same way, splitting iteratively until everything has been handled. The disadvantage is that this is costly in comparison, since combining leaf constructs involves a binary search within the leaf table.
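To make the cost concrete, here is a rough standalone sketch of that iterative scheme. It is not the code from this PR: `Entry`, `leafTable`, `combineLeaves` and `splitByRecombining` are hypothetical names, directives are modelled as plain strings, and the table only holds the three compound directives used in the tablegen example further down.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Leaves = std::vector<std::string>;

// Hypothetical stand-in for one row of the leaf table.
struct Entry {
  Leaves leaves;     // leaf constructs, outermost first
  std::string name;  // the compound directive they form
  bool isComposite;  // true for composite, false for combined
};

// Toy leaf table, kept sorted by leaf list so it can be binary-searched.
static const std::vector<Entry> &leafTable() {
  static const std::vector<Entry> table = [] {
    std::vector<Entry> t = {
        {{"do", "simd"}, "do simd", true},
        {{"parallel", "do", "simd"}, "parallel do simd", false},
        {{"target", "parallel", "do", "simd"}, "target parallel do simd",
         false},
    };
    std::sort(t.begin(), t.end(), [](const Entry &a, const Entry &b) {
      return a.leaves < b.leaves;
    });
    return t;
  }();
  return table;
}

// Re-combine a leaf list into a compound directive: one binary search.
static const Entry *combineLeaves(const Leaves &leaves) {
  const std::vector<Entry> &t = leafTable();
  auto it = std::lower_bound(
      t.begin(), t.end(), leaves,
      [](const Entry &e, const Leaves &key) { return e.leaves < key; });
  return (it != t.end() && it->leaves == leaves) ? &*it : nullptr;
}

// Peel off outer leaves of combined directives, keeping composites whole.
std::vector<std::string> splitByRecombining(Leaves leaves) {
  std::vector<std::string> units;
  while (!leaves.empty()) {
    if (leaves.size() == 1) {  // standalone construct
      units.push_back(leaves.front());
      break;
    }
    const Entry *e = combineLeaves(leaves);  // a lookup on every iteration
    assert(e && "remaining leaves must form a known compound directive");
    if (e->isComposite) {  // composite: keep as a unit
      units.push_back(e->name);
      break;
    }
    units.push_back(leaves.front());  // combined: emit the first leaf
    leaves.erase(leaves.begin());     // and continue with the rest
  }
  return units;
}
```

In this sketch, splitting the leaves of `target parallel do simd` performs three table lookups, one per remaining compound, before it reaches the composite `do simd`.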

I'm wondering if there's a better abstraction we could use in tablegen that would work for both. Specifically, I'm wondering whether the following would make sense:
- Non-compound constructs remain as they are.
- Rename the `leafConstructs` list of records to `composite`, and only use it for composite constructs.
- Add an optional `combined` pair of records, used exclusively by combined constructs, that defines what the outer/parent construct is and what the "rest" is.
- A directive cannot define both `composite` and `combined` simultaneously.

An example of how that would look in tablegen could be:
```tablegen
def OMP_DoSimd : Directive<"do simd"> {
  let allowedClauses = [...];
  // 2+ elements: both leaf constructs.
  let composite = [OMP_Do, OMP_Simd];
}
def OMP_ParallelDoSimd : Directive<"parallel do simd"> {
  let allowedClauses = [...];
  // Always 2 elements: the first is a leaf and the second can be compound.
  let combined = [OMP_Parallel, OMP_DoSimd];
}
def OMP_TargetParallelDoSimd : Directive<"target parallel do simd"> {
  let allowedClauses = [...];
  // Always 2 elements: the first is a leaf and the second can be compound.
  let combined = [OMP_Target, OMP_ParallelDoSimd];
}
```

Based on that alternative representation, I think it should be easier to implement both types of splitting without having to allow re-combining leaf constructs back into compound ones. Another advantage is that the representation is explicit in the separation between combined and composite directives. However, it doesn't make it any less cumbersome to add the proposed list of new combined constructs.
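As a rough illustration of why I think both kinds of splitting get simpler, here is a standalone sketch over a toy model of that representation. Everything in it is hypothetical: `DirectiveInfo`, `directiveTable` and the two split functions are made-up names, directives are plain strings, and the table mirrors only the three definitions above. It is not what tablegen would actually generate.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mirror of the proposed fields; at most one is non-empty.
struct DirectiveInfo {
  std::vector<std::string> composite;  // 2+ leaf constructs, or empty
  std::vector<std::string> combined;   // {outer leaf, rest}, or empty
};

static DirectiveInfo makeLeaf() { return DirectiveInfo{}; }

static DirectiveInfo makeComposite(std::vector<std::string> leaves) {
  assert(leaves.size() >= 2 && "composite needs at least two leaves");
  DirectiveInfo d;
  d.composite = std::move(leaves);
  return d;
}

static DirectiveInfo makeCombined(std::string outer, std::string rest) {
  DirectiveInfo d;
  d.combined = {std::move(outer), std::move(rest)};
  return d;
}

// Toy table mirroring the tablegen example above.
static const std::map<std::string, DirectiveInfo> &directiveTable() {
  static const std::map<std::string, DirectiveInfo> table = {
      {"do", makeLeaf()},
      {"simd", makeLeaf()},
      {"parallel", makeLeaf()},
      {"target", makeLeaf()},
      {"do simd", makeComposite({"do", "simd"})},
      {"parallel do simd", makeCombined("parallel", "do simd")},
      {"target parallel do simd",
       makeCombined("target", "parallel do simd")},
  };
  return table;
}

// Case 1: split any compound directive into its full list of leaves.
std::vector<std::string> splitIntoLeaves(const std::string &name) {
  const DirectiveInfo &d = directiveTable().at(name);
  // A directive cannot define both `composite` and `combined`.
  assert(d.composite.empty() || d.combined.empty());
  if (!d.composite.empty())
    return d.composite;
  if (d.combined.empty())
    return {name};  // already a leaf
  std::vector<std::string> leaves = {d.combined[0]};
  std::vector<std::string> rest = splitIntoLeaves(d.combined[1]);
  leaves.insert(leaves.end(), rest.begin(), rest.end());
  return leaves;
}

// Case 2: split combined directives, keeping composite constructs whole.
std::vector<std::string> splitPreservingComposite(const std::string &name) {
  std::vector<std::string> units;
  std::string current = name;
  while (true) {
    const DirectiveInfo &d = directiveTable().at(current);
    if (d.combined.empty()) {  // leaf or composite: emit as one unit
      units.push_back(current);
      break;
    }
    units.push_back(d.combined[0]);  // outer leaf
    current = d.combined[1];         // continue with the "rest"
  }
  return units;
}
```

Neither function needs to map a list of leaves back to a compound directive: the "rest" is already named in the `combined` pair, so each step is a direct lookup of a known directive.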

This is just an idea, and it may have some big disadvantages I haven't thought about. I won't block the current proposal, but I'd prefer that others more involved in the definition of the OMP.td format take a look at this before merging as well.

https://github.com/llvm/llvm-project/pull/87247