[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

Tue Aug 24 18:34:45 PDT 2021

sugak added subscribers: weiwang, sugak.
sugak added a comment.
Herald added a subscriber: sstefan1.

Hi @yaxunl! I'm working on upgrading a large codebase from LLVM-9 to LLVM-12. I noticed a compilation speed regression of about 10% on average that seems to be caused by this change. We use Clang modules and have historically passed the `-fopenmp` compiler flag by default. The problem seems to be that compiling and importing modules is now slower, with the size of the generated modules increased by 2X. The llvm-bcanalyzer tool shows that the size is dominated by `DECLS_TO_CHECK_FOR_DEFERRED_DIAGS`. If I understand it right, your change is only relevant when target offloading is used. I inspected all of our `#pragma omp` directives and can confirm that we don't use target offloading.

I see that most of this code is gated by the OpenMP flag. I wonder if there is a finer-grained way to enable OpenMP parallel code generation without target offloading? Would it make sense to extend this code to check whether `-fopenmp-targets` is set before recording `DECLS_TO_CHECK_FOR_DEFERRED_DIAGS`?
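
To make the question concrete, here is a minimal, self-contained sketch of the kind of gating I have in mind. It does not use the real Clang classes: `MockLangOpts`, its fields, and `shouldRecordDeferredDiagDecls` are hypothetical names invented for illustration, and the sketch ignores device-side compilation entirely. Whether such a check would be semantically safe is exactly what I'm asking.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the handful of language options that matter here.
// This is NOT clang::LangOptions; the field names are illustrative only.
struct MockLangOpts {
  bool OpenMP = false;                      // -fopenmp was passed
  bool CUDA = false;                        // CUDA/HIP compilation
  std::vector<std::string> OffloadTargets;  // values given to -fopenmp-targets
};

// Hypothetical predicate: record declarations for the deferred-diagnostics
// pass only when offloading is possible. Assumption: host-only OpenMP
// (no offload targets) never needs DECLS_TO_CHECK_FOR_DEFERRED_DIAGS.
bool shouldRecordDeferredDiagDecls(const MockLangOpts &Opts) {
  if (Opts.CUDA)
    return true;  // CUDA/HIP always has a host/device split
  return Opts.OpenMP && !Opts.OffloadTargets.empty();
}

// Small usage example covering the two OpenMP configurations discussed above.
void exampleUsage() {
  MockLangOpts HostOnly;
  HostOnly.OpenMP = true;  // our setup: -fopenmp, no offload targets
  assert(!shouldRecordDeferredDiagDecls(HostOnly));

  MockLangOpts Offloading = HostOnly;
  Offloading.OffloadTargets.push_back("nvptx64-nvidia-cuda");
  assert(shouldRecordDeferredDiagDecls(Offloading));
}
```

With a check like this, a build that passes `-fopenmp` without any offload targets would skip the record, which is the case where we see the module size increase.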

Note, this was measured with @weiwang's https://reviews.llvm.org/D101793.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172