[all-commits] [llvm/llvm-project] 189609: [mlir][ROCM] Add Wave/Warp shuffle lowering and op...

Thu Aug 24 17:37:04 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 1896096002b75b50d46ee0043c20e90c7e27604a
      https://github.com/llvm/llvm-project/commit/1896096002b75b50d46ee0043c20e90c7e27604a
  Author: Stanley Winata <stanley at nod-labs.com>
  Date:   2023-08-24 (Thu, 24 Aug 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
    M mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
    M mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl.mlir
    R mlir/test/Conversion/GPUToROCDL/invalid.mlir
    M mlir/test/Target/LLVMIR/rocdl.mlir

  Log Message:
  -----------
  [mlir][ROCM] Add Wave/Warp shuffle lowering and op for ROCM.

Reductions are heavily used in many DL workloads, especially those with
softmax/attention layers. Wave/warp shuffles are known to be an
efficient way to implement these reductions.

In this patch we introduce AMD shuffle intrinsic ops to ROCDL, along with their corresponding lowering from gpu.shuffle. This should speed up many DL workloads on the ROCM backend. Currently, we support the xor and idx modes, which are the more common ones. In the future, we plan to add support for down and up, as well as to use ds_swizzle to further improve performance when the width and offsets are constant.
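
As a rough illustration of why the xor and idx modes matter for reductions, the sketch below simulates a wave as a Python list, one element per lane. It is not the code from this patch: the function names (shuffle_xor, shuffle_idx, wave_reduce_sum) are hypothetical, and the model assumes a power-of-two wave size and ignores the width operand and the validity result of gpu.shuffle.

```python
def shuffle_xor(values, offset):
    """Simplified xor shuffle: lane i reads the value held by lane i ^ offset."""
    n = len(values)
    # Fall back to the lane's own value if the partner lane does not exist.
    return [values[i ^ offset] if (i ^ offset) < n else values[i]
            for i in range(n)]


def shuffle_idx(values, src_lane):
    """Simplified idx shuffle: every lane reads the value held by src_lane."""
    return [values[src_lane] for _ in values]


def wave_reduce_sum(values):
    """Butterfly reduction built from xor shuffles.

    After log2(n) steps every lane holds the sum of all lanes.
    Assumes the wave size n is a power of two (an assumption of this sketch).
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "wave size must be a power of two"
    acc = list(values)
    offset = n // 2
    while offset > 0:
        partner = shuffle_xor(acc, offset)          # exchange with partner lane
        acc = [a + b for a, b in zip(acc, partner)]  # combine in every lane
        offset //= 2
    return acc


if __name__ == "__main__":
    wave = list(range(8))             # lane i starts with value i
    print(wave_reduce_sum(wave))      # every lane ends with the total
    print(shuffle_idx(wave, 3))       # broadcast lane 3 to all lanes
```

With the xor mode each lane exchanges data with one partner per step, so a full reduction over n lanes takes log2(n) shuffles and needs no shared memory. The idx mode covers the broadcast case, where all lanes read a single lane's value.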

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D158684