[Lldb-commits] [lldb] [libc] [flang] [openmp] [libcxx] [compiler-rt] [llvm] [clang-tools-extra] [clang] [lld] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
    Johannes Doerfert via lldb-commits 
    lldb-commits at lists.llvm.org
       
    Fri Jan  5 10:45:06 PST 2024

jdoerfert wrote:
> > ongoing effort to extends PGO instrumentation to GPU device code
> 
> Is there a high level description for this effort and its goal? Traditional compiler PGO is mostly for profiling control-flow, but we don't usually have a lot of control flow for GPU kernels.
I am unsure where this assumption comes from, but it is not true in general.
HPC kernels most certainly have lots of control flow. They also have lots of calls, indirect calls, loops of every size, and basically everything else you find in CPU code.
Thus, at a high level, we want PGO for offloaded code for the same reasons we want PGO for non-offloaded code.
https://github.com/llvm/llvm-project/pull/76587
    
    