[all-commits] [llvm/llvm-project] 1c7721: AMDGPU: Add v_permlane16_swap_b32 and v_permlane32...
Matt Arsenault via All-commits
all-commits at lists.llvm.org
Fri Nov 22 20:10:40 PST 2024
Branch: refs/heads/users/arsenm/gfx950/permlane-swap
Home: https://github.com/llvm/llvm-project
Commit: 1c77212b8665ebf16b493ce7e79b7bc4ff67fecf
https://github.com/llvm/llvm-project/commit/1c77212b8665ebf16b493ce7e79b7bc4ff67fecf
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2024-11-23 (Sat, 23 Nov 2024)
Changed paths:
M clang/include/clang/Basic/BuiltinsAMDGPU.def
M clang/lib/CodeGen/CGBuiltin.cpp
M clang/test/CodeGenOpenCL/amdgpu-features.cl
M clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl
M clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950.cl
M clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950-param.cl
M clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl
M llvm/docs/AMDGPUUsage.rst
M llvm/include/llvm/IR/IntrinsicsAMDGPU.td
M llvm/lib/Target/AMDGPU/AMDGPU.td
M llvm/lib/Target/AMDGPU/AMDGPUGISel.td
M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
M llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIInstrInfo.td
M llvm/lib/Target/AMDGPU/VOP1Instructions.td
M llvm/lib/Target/AMDGPU/VOPInstructions.td
M llvm/lib/TargetParser/TargetParser.cpp
M llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
A llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane16.swap.ll
A llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane32.swap.ll
M llvm/test/MC/AMDGPU/gfx950_asm_features.s
A llvm/test/MC/AMDGPU/gfx950_err.s
M llvm/test/MC/Disassembler/AMDGPU/gfx950.txt
Log Message:
-----------
AMDGPU: Add v_permlane16_swap_b32 and v_permlane32_swap_b32 for gfx950
This was a bit annoying because these instructions introduce a new
special-case use of the encoding. op_sel is repurposed as a subset of
the DPP controls, and is eligible for VOP3->VOP1 shrinking. For some
reason fi also uses an enum value, so we need to convert the raw boolean
to 1 instead of -1.
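As an illustration of the boolean conversion mentioned above: a 1-bit "true" reads as -1 when it is interpreted as a signed value, and as 1 when it is interpreted as unsigned. The sketch below assumes that this is where the -1 comes from, which the log message does not state. The helper names (`sign_extend`, `zero_extend`, `bool_to_fi_operand`) are hypothetical and are not LLVM APIs.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def zero_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as an unsigned integer."""
    return value & ((1 << bits) - 1)


def bool_to_fi_operand(raw: int) -> int:
    """Hypothetical helper: normalize any nonzero raw boolean to the value 1."""
    return 1 if raw != 0 else 0


# A 1-bit "true" differs depending on how it is extended.
signed_true = sign_extend(1, 1)    # signed view of the single set bit
unsigned_true = zero_extend(1, 1)  # unsigned view of the same bit
fi_operand = bool_to_fi_operand(signed_true)
```

Normalizing through an explicit "nonzero means 1" step avoids depending on which extension produced the raw value.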
The two registers are swapped, so the instruction has two defs. Ideally
the builtin would return a pair, but that's difficult, so it returns a
vector instead. This would make a hypothetical builtin that supports
v2f16 directly uglier.
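To make the "two defs" point concrete, here is a small lane-level model of the swap. It is a sketch under stated assumptions, not the instruction's definition: it assumes a 64-lane wave with rows of 16 lanes, assumes that permlane16 swap exchanges the odd rows of the first operand with the even rows of the second, assumes that permlane32 swap exchanges rows 2 and 3 of the first operand with rows 0 and 1 of the second, and ignores the exec mask and the fi and bound_ctrl controls. The function names are hypothetical.

```python
def _swap_halves(a, b, group):
    """Within each block of 2*group lanes, swap the upper half of `a`
    with the lower half of `b`. Returns both results, mirroring the two defs."""
    assert len(a) == len(b) and len(a) % (2 * group) == 0
    new_a, new_b = list(a), list(b)
    for base in range(0, len(a), 2 * group):
        lower = slice(base, base + group)
        upper = slice(base + group, base + 2 * group)
        new_a[upper] = b[lower]
        new_b[lower] = a[upper]
    return new_a, new_b


def permlane16_swap(a, b):
    """Assumed model: odd 16-lane rows of `a` swap with even rows of `b`."""
    return _swap_halves(a, b, 16)


def permlane32_swap(a, b):
    """Assumed model: upper 32 lanes of `a` swap with lower 32 lanes of `b`."""
    return _swap_halves(a, b, 32)


# Example: one value per lane in each of the two registers.
a = list(range(64))
b = [100 + lane for lane in range(64)]
a16, b16 = permlane16_swap(a, b)
a32, b32 = permlane32_swap(a, b)
```

Both results are needed by the caller, which is why a single returned value has to carry two elements (a pair, or a two-element vector as the log message describes).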