[PATCH] D154858: [WIP] [AMDGPU] Add llvm.amdgcn.wave.reduce.umin/umax Intrinsic.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 12 07:51:04 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:4089
+        ScanStratgyImm == 0 ? ScanOptions::DPP : ScanOptions::Iterative;
+    if (ScanStrategy == ScanOptions::Iterative) {
+      // To reduce the VGPR, we need to iterative over all the active lanes.
----------------
I was envisioning this as just a hint: if the requested strategy is unimplemented (or the target doesn't support it), lowering would just fall back to one that works.

Should also add some intrinsic documentation to AMDGPUUsage, including the accepted values for this operand.
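
A minimal sketch of the hint-with-fallback behaviour described above, not the actual SIISelLowering code. The mapping of immediate 0 to DPP and anything else to Iterative is taken from the quoted patch line. Everything else is an assumption made for illustration: `TargetCaps`, its two fields, `decodeHint`, `canUse`, and `selectStrategy` are hypothetical names, and the sketch assumes the iterative lowering is always available.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the patch's ScanOptions enum (assumed to have these two values).
enum class ScanOptions { DPP, Iterative };

// Hypothetical summary of what the current subtarget/lowering can handle.
struct TargetCaps {
  bool HasDPP;                 // subtarget provides DPP instructions
  bool DPPLoweringImplemented; // the DPP reduction lowering exists
};

// Decode the strategy immediate as the quoted patch does:
// 0 requests DPP, any other value requests Iterative.
constexpr ScanOptions decodeHint(uint32_t StrategyImm) {
  return StrategyImm == 0 ? ScanOptions::DPP : ScanOptions::Iterative;
}

// Assumption: the iterative lowering works everywhere; DPP needs both
// hardware support and an implemented lowering.
constexpr bool canUse(ScanOptions S, const TargetCaps &Caps) {
  return S == ScanOptions::Iterative ||
         (Caps.HasDPP && Caps.DPPLoweringImplemented);
}

// Treat the immediate as a hint: honor it when usable, otherwise fall back
// to the strategy that is assumed to always work.
constexpr ScanOptions selectStrategy(uint32_t StrategyImm,
                                     const TargetCaps &Caps) {
  return canUse(decodeHint(StrategyImm), Caps) ? decodeHint(StrategyImm)
                                               : ScanOptions::Iterative;
}

// A DPP request on a target without DPP silently becomes Iterative.
static_assert(selectStrategy(0, TargetCaps{false, false}) ==
                  ScanOptions::Iterative,
              "unsupported hint should fall back");
```

With this shape an unsupported or unimplemented request never becomes a selection failure; the hint only changes which working lowering is chosen.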


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:267
+
+  def WAVE_REDUCE_UMAX_PSEUDO : VPseudoInstSI <(outs SGPR_32:$sdst),
+    (ins VSrc_b32: $src, VSrc_b32:$strategy),
----------------
These need _U32/_B32 suffixes


================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umax.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
+; RUN: llc -march=amdgcn -mcpu=gfx1100 -global-isel=0 -mattr=+wavefrontsize32,-wavefrontsize64 < %s | FileCheck %s
+
----------------
Should test with both wave sizes and for every generation, with both -global-isel=0 and -global-isel=1.


================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umin.ll:4
+
+declare i32 @llvm.amdgcn.wave.reduce.umin(i32, i32)
+declare i32 @llvm.amdgcn.workitem.id.x()
----------------
Put the immarg attribute on the declarations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154858/new/

https://reviews.llvm.org/D154858
