[all-commits] [llvm/llvm-project] 9a1013: [Offload] Allow to record kernel launch stack trac...

Johannes Doerfert via All-commits all-commits at lists.llvm.org
Wed Jul 31 11:50:11 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9a1013220b668d846e63f241203b80515dee0a03
      https://github.com/llvm/llvm-project/commit/9a1013220b668d846e63f241203b80515dee0a03
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2024-07-31 (Wed, 31 Jul 2024)

  Changed paths:
    M offload/include/Shared/EnvironmentVar.h
    M offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.h
    M offload/plugins-nextgen/amdgpu/src/rtl.cpp
    M offload/plugins-nextgen/common/include/ErrorReporting.h
    M offload/plugins-nextgen/common/include/PluginInterface.h
    M offload/plugins-nextgen/common/src/PluginInterface.cpp
    A offload/test/sanitizer/kernel_crash.c
    A offload/test/sanitizer/kernel_crash_async.c
    A offload/test/sanitizer/kernel_crash_many.c
    A offload/test/sanitizer/kernel_crash_single.c
    A offload/test/sanitizer/kernel_trap.c
    A offload/test/sanitizer/kernel_trap_async.c
    A offload/test/sanitizer/kernel_trap_many.c
    M openmp/docs/design/Runtimes.rst

  Log Message:
  -----------
  [Offload] Allow to record kernel launch stack traces (#100472)

Similar to (de)allocation traces, we can record kernel launch stack
traces and display them in case of an error. However, the AMD GPU plugin
signal handler, which is invoked on memory faults, cannot pinpoint the
offending kernel. Instead, we print `<NUM>` traces, where `<NUM>` is set
via `OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`. The recording uses a
ring buffer of fixed size (for now 8).
For `trap` errors, we print the actual kernel name, and the trace if one
was recorded.
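
The ring-buffer recording described above could be sketched roughly as
follows. This is a minimal illustration, not the code from the commit:
the names `LaunchRecord`, `LaunchTraceRing`, `record`, and `latest` are
hypothetical, the stack trace is modeled as a plain string, and the
newest-first ordering of the returned records is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type; a real plugin would store a captured stack
// trace rather than a plain string.
struct LaunchRecord {
  std::string KernelName;
  std::string StackTrace;
};

// Fixed-capacity ring buffer: once full, each new record overwrites the
// oldest one. The default capacity of 8 mirrors the size mentioned in
// the log message.
template <std::size_t Capacity = 8> class LaunchTraceRing {
  std::array<LaunchRecord, Capacity> Slots;
  std::size_t Next = 0;  // Slot the next record is written to.
  std::size_t Count = 0; // Number of valid records, at most Capacity.

public:
  void record(std::string Kernel, std::string Trace) {
    Slots[Next] = LaunchRecord{std::move(Kernel), std::move(Trace)};
    Next = (Next + 1) % Capacity;
    if (Count < Capacity)
      ++Count;
  }

  // Return up to Num records, newest first (ordering is an assumption).
  std::vector<LaunchRecord> latest(std::size_t Num) const {
    std::vector<LaunchRecord> Out;
    std::size_t N = Num < Count ? Num : Count;
    for (std::size_t I = 0; I < N; ++I)
      Out.push_back(Slots[(Next + Capacity - 1 - I) % Capacity]);
    return Out;
  }
};
```

In this sketch, a memory-fault handler that cannot identify the
offending kernel would call `latest(NUM)` and print every returned
record, while a `trap` handler that knows the kernel could print just
the matching one.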


