[clang] [OpenMP] Allow GPUs to be targeted directly via `-fopenmp`. (PR #122149)

Thu Jan 9 09:33:26 PST 2025

shiltian wrote:

> the target regions are just outlined, so it shouldn't affect anything on a codegen level.

No, they are not. The standard defines the execution behavior, and codegen has to conform to it. The current GPU CodeGen in this discussion assumes it is generating code for constructs _inside_ a `target` region (because almost all the other OpenMP constructs are skipped), and that assumption is the essential question here. I have no problem with treating a GPU target as a _host_, but then it has to conform to _host_ execution, i.e. running outside of a `target` region. If the standard says all constructs have to behave the same when they are wrapped in `declare target` (or one of its friends), I'm totally fine with it.

```
// host
#pragma omp some_construct
{ /* some code */ }

// "offload"
#pragma omp declare target
#pragma omp same_construct_as_above
{ /* same code as above */ }
#pragma omp end declare target
```

This patch basically wraps the entire TU in a giant implicit `declare target` when it compiles (or, more specifically, generates code) for a GPU, as in the second ("offload") snippet of the code block above. I guess this might be a question for the language committee.
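To make the two cases concrete, here is a minimal sketch with a real construct filled in for the placeholder. The function names `sum_host` and `sum_declare_target` are hypothetical and not taken from the patch, and `parallel for` is just one construct picked for illustration. Both functions are written so they also build as plain host code; the sketch only shows the shape of the comparison and does not settle whether the standard requires the two to behave identically.

```c
/* Case 1: host. The construct is encountered outside any target region
 * and outside any declare target scope. */
long sum_host(const int *a, int n) {
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* Case 2: "offload". The same function body, but wrapped in declare target.
 * Per the discussion above, this is roughly what the whole TU would look
 * like if it were implicitly wrapped when generating code for a GPU
 * (an assumption based on this thread, not on the patch itself). */
#pragma omp declare target
long sum_declare_target(const int *a, int n) {
    long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
#pragma omp end declare target
```

On the host both functions return the same value for the same input. The open question is whether the device version of the second one is allowed to take the "inside a `target` region" codegen path when nothing actually launched it from a `target` construct.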

https://github.com/llvm/llvm-project/pull/122149