[all-commits] [llvm/llvm-project] ea43a3: [AMDGPU] Vectorize more 16 bit shuffles (#90648)

Tue May 21 09:21:58 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: ea43a30899df5c3c36412392c8f4db79973a1c43
      https://github.com/llvm/llvm-project/commit/ea43a30899df5c3c36412392c8f4db79973a1c43
  Author: Jeffrey Byrnes <jeffrey.byrnes at amd.com>
  Date:   2024-05-21 (Tue, 21 May 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/add_sub_sat-inseltpoison.ll
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/add_sub_sat.ll
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/crash_extract_subvector_cost.ll
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/phi-result-use-order.ll
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/reduction.ll

  Log Message:
  -----------
  [AMDGPU] Vectorize more 16 bit shuffles (#90648)

For larger vectors of 16-bit elements, we should still prefer the
vectorized version (i.e. a shufflevector rather than a chain of
extractelement/insertelement instructions).
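
As a rough illustration of the two forms being compared (a standalone
sketch, not code from this patch), the C++ below models a 4 x i16
reversal once as a single mask-driven shuffle and once as an
extract/insert chain. The names shuffle and reverse_by_extract_insert
are hypothetical and exist only for this example; std::array stands in
for an LLVM vector value, and nothing here reflects the actual cost
model change in AMDGPUTargetTransformInfo.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a single-source shufflevector:
// result lane i is taken from source lane mask[i].
template <std::size_t N>
std::array<std::uint16_t, N> shuffle(const std::array<std::uint16_t, N>& src,
                                     const std::array<int, N>& mask) {
  std::array<std::uint16_t, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    // Every mask entry must name a valid source lane.
    assert(mask[i] >= 0 && static_cast<std::size_t>(mask[i]) < N);
    out[i] = src[static_cast<std::size_t>(mask[i])];
  }
  return out;
}

// The scalarized equivalent: one extract and one insert per lane.
std::array<std::uint16_t, 4>
reverse_by_extract_insert(const std::array<std::uint16_t, 4>& src) {
  std::array<std::uint16_t, 4> out{};
  // Extract each 16-bit element individually (the "unpack" step).
  const std::uint16_t e0 = src[0];
  const std::uint16_t e1 = src[1];
  const std::uint16_t e2 = src[2];
  const std::uint16_t e3 = src[3];
  // Insert the elements one by one in reversed order (the "repack" step).
  out[0] = e3;
  out[1] = e2;
  out[2] = e1;
  out[3] = e0;
  return out;
}
```

Both functions compute the same value; the shuffle expresses it as one
whole-vector operation, which is the form the commit message argues
should be preferred.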

In arithmetic chains, vectorization results in chains of packed math
instructions (as opposed to unpack/repack & scalarized arithmetic):
https://godbolt.org/z/c5onaf6G5

In chains with PHIs, vectorization again removes the unnecessary unpack /
repack code around basic blocks: https://godbolt.org/z/vz7zYzvhs
