[PATCH] D134463: [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

Thu Sep 22 12:21:16 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2609-2610
 def : GCNPat <
   (v2f16 (DivergentBinFrag<build_vector> (f16 SReg_32:$src0), (f16 SReg_32:$src1))),
   (v2f16 (V_LSHL_OR_B32_e64 SReg_32:$src1, (i32 16), (i32 (V_AND_B32_e64 (i32 (V_MOV_B32_e32 (i32 0xffff))), SReg_32:$src0))))
 >;
----------------
Should we just replace this pattern?

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2615
+def : GCNPat <
+  (i32 (bitconvert (v2f16 (DivergentBinFrag<build_vector> (f16 (bitconvert (i16 (trunc VGPR_32:$a)))), (f16 (bitconvert (i16 (trunc VGPR_32:$b)))))))),
+  (V_PERM_B32_e64 VGPR_32:$b, VGPR_32:$a, (S_MOV_B32 (i32 0x05040100)))
----------------
Should this also cover the integer cases?
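
For readers checking the selector constant, here is a minimal sketch of what the V_PERM_B32 pattern computes, compared with the existing V_LSHL_OR_B32 / V_AND_B32 sequence. It is a simplified model, not the ISA definition: it assumes selector bytes 0..3 pick bytes from src1 and 4..7 pick bytes from src0, and it does not model the other selector values. The helper names `v_perm_b32` and `lshl_or_pack` are hypothetical and exist only for this illustration.

```python
def v_perm_b32(src0: int, src1: int, sel: int) -> int:
    """Simplified model of V_PERM_B32 (selector bytes 0..7 only)."""
    # Assumption: the bytes are indexed over the 64-bit value {src0, src1},
    # with src1 in the low dword and src0 in the high dword.
    combined = ((src0 & 0xFFFFFFFF) << 32) | (src1 & 0xFFFFFFFF)
    result = 0
    for i in range(4):
        s = (sel >> (8 * i)) & 0xFF
        if s > 7:
            raise ValueError("selector byte outside 0..7 is not modeled here")
        byte = (combined >> (8 * s)) & 0xFF
        result |= byte << (8 * i)
    return result


def lshl_or_pack(src0: int, src1: int) -> int:
    """Model of (src1 << 16) | (src0 & 0xffff), as in the existing pattern."""
    return ((src1 << 16) & 0xFFFFFFFF) | (src0 & 0xFFFF)


# The upper halves hold arbitrary bits, i.e. the inputs are not canonicalized.
a = 0xDEAD1234
b = 0xBEEF5678

# Operand order follows the new pattern: V_PERM_B32 $b, $a, 0x05040100
packed = v_perm_b32(b, a, 0x05040100)
print(hex(packed))  # 0x56781234
assert packed == lshl_or_pack(a, b)
```

Under this model, selector 0x05040100 places the low 16 bits of $a in the low half and the low 16 bits of $b in the high half, ignoring the upper halves of both inputs. That is the same value the mask-and-shift sequence produces.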

================
Comment at: llvm/test/CodeGen/AMDGPU/pack.v2f16.ll:1
-; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx900 -mattr=-flat-for-global -denormal-fp-math=preserve-sign -verify-machineinstrs < %s | FileCheck --check-prefixes=GCN,GFX9 %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx900 -mattr=-flat-for-global -denormal-fp-math=preserve-sign -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9 %s
----------------
Switching to generated checks should be done in a separate pre-commit.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D134463