[all-commits] [llvm/llvm-project] 6cc7ca: [Clang] Fix cross-lane scan when given divergent l...

Joseph Huber via All-commits all-commits at lists.llvm.org
Wed Feb 19 14:47:22 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 6cc7ca084a5bbb7ccf606cab12065604453dde59
      https://github.com/llvm/llvm-project/commit/6cc7ca084a5bbb7ccf606cab12065604453dde59
  Author: Joseph Huber <huberjn at outlook.com>
  Date:   2025-02-19 (Wed, 19 Feb 2025)

  Changed paths:
    M clang/lib/Headers/gpuintrin.h
    M clang/lib/Headers/nvptxintrin.h
    M libc/test/integration/src/__support/GPU/scan_reduce.cpp

  Log Message:
  -----------
  [Clang] Fix cross-lane scan when given divergent lanes (#127703)

Summary:
The scan operation implemented here only works if there are contiguous
ones in the execution mask that can be used to propagate the result.
There are two solutions to this: enter 'whole-wave-mode' and forcibly
turn the inactive lanes back on, or perform the scan serially. This
implementation does the latter because it is more portable, but it first
checks whether the parallel fast path is applicable.
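
As a rough illustration of the idea described above, here is a host-side
C++ sketch that simulates a 32-lane wavefront on the CPU. It is not the
code from gpuintrin.h: the names kLanes, Lanes, is_contiguous,
scan_parallel, scan_serial and scan_sum are all hypothetical, and the
model assumes that an inactive lane contributes nothing and cannot carry
a partial sum for other lanes.

```cpp
#include <array>
#include <cstdint>

// Hypothetical wavefront width for this sketch; real targets use 32 or 64.
constexpr int kLanes = 32;
using Lanes = std::array<uint32_t, kLanes>;

// True when the active lanes form a single contiguous run of set bits.
bool is_contiguous(uint32_t mask) {
  if (mask == 0)
    return false;
  while ((mask & 1u) == 0) // drop trailing zeros
    mask >>= 1;
  return (mask & (mask + 1u)) == 0; // remaining bits look like 0...01...1
}

// Simulated parallel (Hillis-Steele) inclusive scan. A lane can only read
// from a source lane that is active, which is the assumption that makes
// this path unreliable when the mask has gaps.
Lanes scan_parallel(uint32_t mask, const Lanes &in) {
  Lanes cur{};
  for (int lane = 0; lane < kLanes; ++lane)
    if ((mask >> lane) & 1u)
      cur[lane] = in[lane];
  for (int step = 1; step < kLanes; step *= 2) {
    Lanes next = cur;
    for (int lane = step; lane < kLanes; ++lane) {
      bool self_active = (mask >> lane) & 1u;
      bool src_active = (mask >> (lane - step)) & 1u;
      if (self_active && src_active)
        next[lane] = cur[lane] + cur[lane - step];
    }
    cur = next;
  }
  return cur;
}

// Serial inclusive scan: walk the active lanes from lowest to highest and
// hand each one the running total. Correct for any mask.
Lanes scan_serial(uint32_t mask, const Lanes &in) {
  Lanes out{};
  uint32_t running = 0;
  for (int lane = 0; lane < kLanes; ++lane) {
    if ((mask >> lane) & 1u) {
      running += in[lane];
      out[lane] = running;
    }
  }
  return out;
}

// Use the parallel fast path only when the mask allows it.
Lanes scan_sum(uint32_t mask, const Lanes &in) {
  return is_contiguous(mask) ? scan_parallel(mask, in)
                             : scan_serial(mask, in);
}
```

In this model, a mask of 0x9 (lanes 0 and 3 active) with every lane
holding 1 makes scan_parallel leave lane 3 at 1, because the partial sum
would have had to pass through an inactive lane. scan_sum detects the gap
and falls back to the serial path, which gives lane 3 the expected value
of 2.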

Needs to be backported for correct behavior and because it fixes a
failing libc test.


