[all-commits] [llvm/llvm-project] 1c7721: AMDGPU: Add v_permlane16_swap_b32 and v_permlane32...

Fri Nov 22 20:10:40 PST 2024

  Branch: refs/heads/users/arsenm/gfx950/permlane-swap
  Home:   https://github.com/llvm/llvm-project
  Commit: 1c77212b8665ebf16b493ce7e79b7bc4ff67fecf
      https://github.com/llvm/llvm-project/commit/1c77212b8665ebf16b493ce7e79b7bc4ff67fecf
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2024-11-23 (Sat, 23 Nov 2024)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsAMDGPU.def
    M clang/lib/CodeGen/CGBuiltin.cpp
    M clang/test/CodeGenOpenCL/amdgpu-features.cl
    M clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl
    M clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950.cl
    M clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950-param.cl
    M clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl
    M llvm/docs/AMDGPUUsage.rst
    M llvm/include/llvm/IR/IntrinsicsAMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPUGISel.td
    M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
    M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
    M llvm/lib/Target/AMDGPU/GCNSubtarget.h
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.td
    M llvm/lib/Target/AMDGPU/VOP1Instructions.td
    M llvm/lib/Target/AMDGPU/VOPInstructions.td
    M llvm/lib/TargetParser/TargetParser.cpp
    M llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
    A llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane16.swap.ll
    A llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane32.swap.ll
    M llvm/test/MC/AMDGPU/gfx950_asm_features.s
    A llvm/test/MC/AMDGPU/gfx950_err.s
    M llvm/test/MC/Disassembler/AMDGPU/gfx950.txt

  Log Message:
  -----------
  AMDGPU: Add v_permlane16_swap_b32 and v_permlane32_swap_b32 for gfx950

This was a bit annoying because these introduce a new special case
encoding usage. op_sel is repurposed as a subset of dpp controls,
and is eligible for VOP3->VOP1 shrinking. For some reason fi also
uses an enum value, so we need to convert the raw boolean to 1 instead
of -1.

The 2 registers are swapped, so this has 2 defs. Ideally the builtin
would return a pair, but that's difficult so return a vector instead.
This would make a hypothetical builtin that supports v2f16 directly
uglier.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications