[all-commits] [llvm/llvm-project] f4e614: [AMDGPU] Use V_PERM to match buildvectors when inp...
Jeffrey Byrnes via All-commits
all-commits at lists.llvm.org
Mon Oct 3 12:59:04 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: f4e6149d8217176f71591b277e9cd08be5f732c1
https://github.com/llvm/llvm-project/commit/f4e6149d8217176f71591b277e9cd08be5f732c1
Author: jeff <Jeffrey.Byrnes at amd.com>
Date: 2022-10-03 (Mon, 03 Oct 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
M llvm/lib/Target/AMDGPU/SIInstructions.td
M llvm/lib/Target/AMDGPU/SOPInstructions.td
M llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.2darraymsaa.a16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.cd.g16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.g16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll
M llvm/test/CodeGen/AMDGPU/add.v2i16.ll
M llvm/test/CodeGen/AMDGPU/build-vector-packed-partial-undef.ll
M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
M llvm/test/CodeGen/AMDGPU/combine-vload-extract.ll
M llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll
M llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll
M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll
M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
M llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
M llvm/test/CodeGen/AMDGPU/fmax_legacy.f16.ll
M llvm/test/CodeGen/AMDGPU/fmin_legacy.f16.ll
M llvm/test/CodeGen/AMDGPU/fshr.ll
M llvm/test/CodeGen/AMDGPU/idot4s.ll
M llvm/test/CodeGen/AMDGPU/idot4u.ll
M llvm/test/CodeGen/AMDGPU/idot8s.ll
M llvm/test/CodeGen/AMDGPU/idot8u.ll
M llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.a16.dim.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.encode.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.ll
M llvm/test/CodeGen/AMDGPU/load-hi16.ll
M llvm/test/CodeGen/AMDGPU/load-lo16.ll
M llvm/test/CodeGen/AMDGPU/pack.v2f16.ll
M llvm/test/CodeGen/AMDGPU/pack.v2i16.ll
M llvm/test/CodeGen/AMDGPU/partial-shift-shrink.ll
M llvm/test/CodeGen/AMDGPU/strict_fadd.f16.ll
M llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll
M llvm/test/CodeGen/AMDGPU/strict_fmul.f16.ll
M llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll
M llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
M llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll
Log Message:
-----------
[AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)
If we can not prove that f16 operands of a buildvector are canonicalized, then we can not lower into a V_PACK. In this scenario, we would previously lower into some combination of and(sdwa), shr, or. This patch allows for matching into V_PERM instead.
Change-Id: Ifa4a74fdb81ef44f22ba490c7fdf81ec8aebc945
More information about the All-commits
mailing list