[all-commits] [llvm/llvm-project] 43fd4c: [mlir][GPU] Improve handling of GPU bounds (#95166)

Mon Jun 17 21:47:59 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 43fd4c49bd8d54b9058620f0a885c7a5672fd602
      https://github.com/llvm/llvm-project/commit/43fd4c49bd8d54b9058620f0a885c7a5672fd602
  Author: Krzysztof Drewniak <Krzysztof.Drewniak at amd.com>
  Date:   2024-06-17 (Mon, 17 Jun 2024)

  Changed paths:
    M mlir/include/mlir/Dialect/GPU/IR/GPUBase.td
    M mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
    M mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
    M mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h
    M mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
    M mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
    M mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
    M mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
    M mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
    M mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
    M mlir/test/Conversion/GPUToNVVM/gpu-to-nvvm.mlir
    M mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl.mlir
    M mlir/test/Dialect/GPU/int-range-interface.mlir
    M mlir/test/Dialect/GPU/invalid.mlir
    M mlir/test/Dialect/GPU/outlining.mlir

  Log Message:
  -----------
  [mlir][GPU] Improve handling of GPU bounds (#95166)

This change reworks how range information for GPU dispatch IDs (block
IDs, thread IDs, and so on) is handled.

1. `known_block_size` and `known_grid_size` become inherent attributes
of GPU functions. This makes them less clunky to work with. As a
consequence, the `gpu.func` lowering patterns now only look at the
inherent attributes when setting target-specific attributes on the
`llvm.func` that they lower to.
2. At the same time, `gpu.known_block_size` and `gpu.known_grid_size`
are made official dialect-level discardable attributes which can be
placed on arbitrary functions. This allows for progressive lowerings
(without this, a lowering for `gpu.thread_id` couldn't know about the
bounds if it had already been moved from a `gpu.func` to an `llvm.func`)
and allows for range information to be provided even when
`gpu.*_{id,dim}` are being used outside of a `gpu.func` context.
3. All of these index operations have gained an optional `upper_bound`
attribute, allowing for an alternate mode of operation where the bounds
are specified locally and not inherited from the operation's context.
These also allow handling of cases where the precise launch sizes aren't
known, but can be bounded more precisely than the maximum of what any
platform's API allows. (I'd like to thank @benvanik for pointing out
that this could be useful.)

When inferring bounds (either for range inference or for setting `range`
during lowering) these sources of information are consulted in order of
specificity (`upper_bound` > inherent attribute > discardable attribute,
except that dimension sizes check for `known_*_bounds` to see if they
can be constant-folded before checking their `upper_bound`).

This patch also updates the documentation about the bounds and inference
behavior to clarify what these attributes do when set and the
consequences of setting them up incorrectly.

---------

Co-authored-by: Mehdi Amini <joker.eph at gmail.com>

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications