[PATCH] D148142: [AMDGPU] Update Subtarget isWave32 method to ignore the wave32 feature pre-gfx9

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 6 14:15:06 PDT 2023


arsenm requested changes to this revision.
arsenm added a comment.
This revision now requires changes to proceed.

You can achieve this in the subtarget constructor by setting the wavefrontsize member. This would also be a strategy change relative to AMDGPURemoveIncompatibleFunctions, and it doesn't belong in a patch to fix the saveexec combine.
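
To illustrate the constructor suggestion, here is a minimal standalone sketch. It does not use the real subtarget class: `SubtargetSketch`, `RequestedWave32`, and `HardwareHasWave32` are hypothetical names, and how the hardware capability is determined is left to the caller as an assumption. The idea is only that the wave size is normalized once at construction, so every later query sees a consistent value.

```cpp
#include <cassert>

// Hypothetical stand-in for a subtarget; NOT the real AMDGPU subtarget API.
struct SubtargetSketch {
  unsigned WavefrontSize; // always 32 or 64 after construction

  // RequestedWave32: the wave32 feature was requested (e.g. via -mattr).
  // HardwareHasWave32: assumption supplied by the caller, describing whether
  // the selected CPU can actually execute in wave32 mode.
  SubtargetSketch(bool RequestedWave32, bool HardwareHasWave32)
      : WavefrontSize(RequestedWave32 && HardwareHasWave32 ? 32u : 64u) {
    // The request is dropped here, once, if the hardware cannot honor it.
    assert(WavefrontSize == 32 || WavefrontSize == 64);
  }

  // Queries read the normalized member instead of re-checking the feature.
  bool isWave32() const { return WavefrontSize == 32; }
};
```

With this shape, `isWave32()` needs no generation check of its own, because an unsupported request never reaches it.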



================
Comment at: llvm/test/CodeGen/AMDGPU/saveexec-xor-optimize.mir:2
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+# RUN: llc -march=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32 -run-pass=si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck --check-prefixes=GFX10 %s
+# RUN: llc -march=amdgcn -mcpu=gfx90a -mattr=+wavefrontsize64 -run-pass=si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck --check-prefixes=GFX9 %s
----------------
This does not solve the exec masking issue. The pass itself needs to check whether the 32-bit saveexec operations are legal (i.e. gfx10). It is not broken directly because of wave32.
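
A rough standalone sketch of the kind of check meant here, under stated assumptions: `TargetSketch`, `Has32BitSaveExec`, and `canFormSaveExec` are hypothetical names and not part of the pass or the subtarget API, and the 64-bit forms are assumed to be available on every modelled target. The point is that the decision keys off instruction legality rather than off the wave size setting.

```cpp
#include <cassert>

enum class SaveExecWidth { B32, B64 };

// Hypothetical target description; NOT the real subtarget interface.
struct TargetSketch {
  bool Has32BitSaveExec; // assumption: true only where the B32 forms exist
  bool IsWave32;         // deliberately not consulted by the check below
};

// Returns true if the combine may form a saveexec of the given width.
bool canFormSaveExec(const TargetSketch &T, SaveExecWidth W) {
  if (W == SaveExecWidth::B32)
    return T.Has32BitSaveExec; // legality of the instruction, not wave size
  return true; // assumption: 64-bit forms exist on all modelled targets
}
```

A target that claims wave32 but lacks the 32-bit instructions is still rejected, which is the case the original combine mishandled.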


================
Comment at: llvm/test/CodeGen/AMDGPU/saveexec-xor-optimize.mir:21
+    ; GFX9-NEXT: $exec_lo = S_XOR_B32 $exec_lo, renamable $sgpr0, implicit-def $scc
+    renamable $sgpr0 = S_OR_SAVEEXEC_B32 killed renamable $sgpr0, implicit-def $exec, implicit-def $scc, implicit $exec
+    $exec_lo = S_XOR_B32 $exec_lo, renamable $sgpr0, implicit-def $scc
----------------
The saveexec isn't supposed to exist in the input.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148142/new/

https://reviews.llvm.org/D148142


