[all-commits] [llvm/llvm-project] f4e614: [AMDGPU] Use V_PERM to match buildvectors when inp...

Mon Oct 3 12:59:04 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f4e6149d8217176f71591b277e9cd08be5f732c1
      https://github.com/llvm/llvm-project/commit/f4e6149d8217176f71591b277e9cd08be5f732c1
  Author: jeff <Jeffrey.Byrnes at amd.com>
  Date:   2022-10-03 (Mon, 03 Oct 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
    M llvm/lib/Target/AMDGPU/SIInstructions.td
    M llvm/lib/Target/AMDGPU/SOPInstructions.td
    M llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.2darraymsaa.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.cd.g16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.g16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll
    M llvm/test/CodeGen/AMDGPU/add.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/build-vector-packed-partial-undef.ll
    M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
    M llvm/test/CodeGen/AMDGPU/combine-vload-extract.ll
    M llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll
    M llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll
    M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll
    M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
    M llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
    M llvm/test/CodeGen/AMDGPU/fmax_legacy.f16.ll
    M llvm/test/CodeGen/AMDGPU/fmin_legacy.f16.ll
    M llvm/test/CodeGen/AMDGPU/fshr.ll
    M llvm/test/CodeGen/AMDGPU/idot4s.ll
    M llvm/test/CodeGen/AMDGPU/idot4u.ll
    M llvm/test/CodeGen/AMDGPU/idot8s.ll
    M llvm/test/CodeGen/AMDGPU/idot8u.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.encode.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.ll
    M llvm/test/CodeGen/AMDGPU/load-hi16.ll
    M llvm/test/CodeGen/AMDGPU/load-lo16.ll
    M llvm/test/CodeGen/AMDGPU/pack.v2f16.ll
    M llvm/test/CodeGen/AMDGPU/pack.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/partial-shift-shrink.ll
    M llvm/test/CodeGen/AMDGPU/strict_fadd.f16.ll
    M llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll
    M llvm/test/CodeGen/AMDGPU/strict_fmul.f16.ll
    M llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll
    M llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll

  Log Message:
  -----------
  [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

If we can not prove that f16 operands of a buildvector are canonicalized, then we can not lower into a V_PACK. In this scenario, we would previously lower into some combination of and(sdwa), shr, or. This patch allows for matching into V_PERM instead.

Change-Id: Ifa4a74fdb81ef44f22ba490c7fdf81ec8aebc945