[PATCH] D155050: [AMDGPU] Wave32 CodeGen for amdgcn.ballot.i64

Wed Jul 12 12:59:46 PDT 2023

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:1004
+      (i64 (AMDGPUsetcc vt:$src0, vt:$src1, cond)),
+      (i64 (REG_SEQUENCE SReg_64, (inst $src0, $src1), sub0, (i32 (IMPLICIT_DEF)), sub1))
+    >;
----------------
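The high bits should be 0, not undef. In wave32 the compare only produces a 32-bit lane mask, so the i64 result needs an explicit zero in sub1 rather than IMPLICIT_DEF.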
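
Roughly something like the following. This is an untested sketch that reuses the quoted fragment as-is and only swaps the IMPLICIT_DEF operand for a zero. Using S_MOV_B32 to materialize the zero is an assumption, borrowed from how the existing i32-to-i64 zero-extension patterns build their high half; any other way of getting a 0 into sub1 would do.

```tablegen
      // Sketch only: same source pattern as in the patch; sub1 is now an
      // explicit 0 instead of IMPLICIT_DEF, so the upper 32 bits are defined.
      (i64 (AMDGPUsetcc vt:$src0, vt:$src1, cond)),
      (i64 (REG_SEQUENCE SReg_64, (inst $src0, $src1), sub0, (S_MOV_B32 (i32 0)), sub1))
    >;
```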

================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:1079
+                                   DSTCLAMP.NONE), sub0,
+                                  (i32 (IMPLICIT_DEF)), sub1))
+    >;
----------------
Same here: the high bits should be 0, not undef.

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ballot.i64.wave32.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -march=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 < %s | FileCheck %s
+; RUN: llc -march=amdgcn -mcpu=gfx1100 -amdgpu-enable-delay-alu=0 -mattr=+wavefrontsize32,-wavefrontsize64 < %s | FileCheck %s
----------------
Test this with and without global-isel. Also, there is no reason to spell out both features; just rely on wave32 being the default. Also add a run line with wave32 on a wave64 target, if it doesn't fail too horribly.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155050/new/

https://reviews.llvm.org/D155050