[PATCH] D96854: [CodeExtractor] Enable partial aggregate arguments

Giorgis Georgakoudis via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 4 00:22:54 PST 2021


ggeorgakoudis added a comment.

In D96854#2594840 <https://reviews.llvm.org/D96854#2594840>, @vsk wrote:

> Sorry it's taken me so long to get to this.
>
>> partially aggregate inputs/outputs in their argument list
>
> Could you explain what this means, and what the pros/cons might be compared to any alternatives? It'd also help to see a test case.

Hi @vsk,

No problem! Let me make the OpenMP IR builder use case concrete; I'll make some simplifications that do not affect the point. Currently, the OMPIRBuilder uses CodeExtractor to outline an OpenMP callback as:

  void omp.outlined(int global_tid, int bound_tid, int* arg0, int* arg1, ..., int* argn)

where `global_tid` and `bound_tid` are values filled in by the OpenMP runtime and passed to the outlined function, and `arg0`, `arg1`, ..., `argn` are the inputs/outputs found by CodeExtractor. To implement parallel execution of the outlined function, OMPIRBuilder emits a call to the OpenMP runtime fork_join function, which uses an ellipsis to pass a variable number of arguments to the OpenMP runtime:

  __kmpc_fork_call(int argc, omp.outlined, ...)

so the ellipsis contains the arguments to the outlined function (`arg0`, `arg1`, ..., `argn`). When calling `omp.outlined`, the OpenMP runtime library fills in the values of the preceding arguments `global_tid` and `bound_tid`. It forwards the rest of the arguments through a cumbersome dispatch function that unwraps the variadic arguments and uses a switch-case to call the callback through its function pointer, as in:

  switch(argc) {
    case 1: fp_to_omp.outlined(global_tid, bound_tid, vararg[0]); return;
    case 2: fp_to_omp.outlined(global_tid, bound_tid, vararg[0], vararg[1]); return;
    ...
  }
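To make the dispatch pattern above concrete, here is a minimal, self-contained sketch in C. All names (`sketch_fork_call`, `generic_fn`, `outlined_one`, `outlined_two`, `SKETCH_MAX_ARGS`) are hypothetical and only mimic the shape of the interface; this is not the actual OpenMP runtime code, and the thread ids are simply set to 0.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the real OpenMP runtime API. */
typedef void (*generic_fn)(void);
typedef void (*outlined1_fn)(int, int, int *);
typedef void (*outlined2_fn)(int, int, int *, int *);

/* The switch below only knows how to forward this many arguments. */
enum { SKETCH_MAX_ARGS = 2 };

/* Unwraps the variadic arguments and dispatches on argc.
 * Returns 0 on success, -1 if argc is outside what the switch supports. */
static int sketch_fork_call(int argc, generic_fn fn, ...) {
  int *args[SKETCH_MAX_ARGS];
  va_list ap;
  int global_tid = 0, bound_tid = 0; /* stand-ins for runtime-filled values */
  int i;

  assert(fn != NULL);
  if (argc < 1 || argc > SKETCH_MAX_ARGS)
    return -1; /* the hardcoded limit imposed by the switch */

  va_start(ap, fn);
  for (i = 0; i < argc; ++i)
    args[i] = va_arg(ap, int *);
  va_end(ap);

  /* One case per supported arity; each casts back to the callee's real type. */
  switch (argc) {
  case 1: ((outlined1_fn)fn)(global_tid, bound_tid, args[0]); break;
  case 2: ((outlined2_fn)fn)(global_tid, bound_tid, args[0], args[1]); break;
  }
  return 0;
}

/* Example outlined bodies with one and two captured variables. */
static void outlined_one(int global_tid, int bound_tid, int *arg0) {
  (void)global_tid; (void)bound_tid;
  *arg0 += 1;
}

static void outlined_two(int global_tid, int bound_tid, int *arg0, int *arg1) {
  (void)global_tid; (void)bound_tid;
  *arg1 += *arg0;
}
```

In this sketch, a call such as `sketch_fork_call(3, ...)` is rejected because no `case 3` exists, which is the kind of hardcoded limit the switch-case unwrapping imposes.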

We would like to remove this ellipsis interface because it creates various problems: the switch-case style unwrapping imposes a hardcoded limit on the number of arguments the runtime can forward, it has been a source of ABI bugs, and it makes OpenMP code hard to analyze and optimize in LLVM. To this end, we would like to aggregate the input/output arguments to the outlined function but leave the runtime-filled arguments unaggregated:

  void omp.outlined(int global_tid, int bound_tid, struct structArg)

This patch makes it possible to //exclude// arguments from the aggregate by extending `extractCodeRegion` in CodeExtractor with a parameter that specifies which arguments to exclude (assuming AggregateArgs has been set when creating the CodeExtractor). In this specific use case for OpenMP, the excluded arguments are `global_tid` and `bound_tid`.
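As a rough illustration of the partially aggregated signature, here is a small C sketch. The names (`sketch_struct_arg`, `aggregated_outlined_fn`, `sketch_fork_call_aggregated`, `outlined_aggregated`) are hypothetical, and passing the aggregate by pointer is an assumption of this sketch; it shows the intended calling shape only, not what CodeExtractor or the OpenMP runtime actually emit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate of the inputs/outputs found for the region. */
struct sketch_struct_arg {
  int *arg0;
  int *arg1;
};

/* The runtime-filled ids stay as separate parameters; everything else is
 * reached through a single pointer to the aggregate (an assumption here). */
typedef void (*aggregated_outlined_fn)(int global_tid, int bound_tid,
                                       void *struct_arg);

/* Fixed-arity forwarding: no ellipsis and no switch over argc. */
static void sketch_fork_call_aggregated(aggregated_outlined_fn fn,
                                        void *struct_arg) {
  int global_tid = 0, bound_tid = 0; /* stand-ins for runtime-filled values */
  assert(fn != NULL);
  fn(global_tid, bound_tid, struct_arg);
}

/* Example outlined body that unpacks the aggregate before using it. */
static void outlined_aggregated(int global_tid, int bound_tid, void *opaque) {
  struct sketch_struct_arg *args = opaque;
  (void)global_tid; (void)bound_tid;
  *args->arg1 += *args->arg0;
}
```

Because the forwarding function has a single fixed signature, the number of captured variables no longer affects the runtime side; only the layout of the aggregate changes.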

I have added a unit test that exercises this functionality in that context. Once we complete the change in OMPIRBuilder to use partial aggregation from CodeExtractor, we will also add IR tests for it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96854/new/

https://reviews.llvm.org/D96854


