[all-commits] [llvm/llvm-project] 20d839: [AMDGPU] ISel & PEI for whole wave functions (#145...
Diana Picus via All-commits
all-commits at lists.llvm.org
Mon Jul 21 01:39:31 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 20d8398825a799008ae508d8463dbb9b11df81e7
https://github.com/llvm/llvm-project/commit/20d8398825a799008ae508d8463dbb9b11df81e7
Author: Diana Picus <Diana-Magda.Picus at amd.com>
Date: 2025-07-21 (Mon, 21 Jul 2025)
Changed paths:
M llvm/docs/AMDGPUUsage.rst
M llvm/include/llvm/AsmParser/LLToken.h
M llvm/include/llvm/IR/CallingConv.h
M llvm/lib/AsmParser/LLLexer.cpp
M llvm/lib/AsmParser/LLParser.cpp
M llvm/lib/IR/AsmWriter.cpp
M llvm/lib/IR/Function.cpp
M llvm/lib/IR/Verifier.cpp
M llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
M llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h
M llvm/lib/Target/AMDGPU/AMDGPUGISel.td
M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
M llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
M llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIInstrInfo.h
M llvm/lib/Target/AMDGPU/SIInstructions.td
M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
M llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
M llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
M llvm/lib/Target/AMDGPU/Utils/AMDGPUPALMetadata.cpp
M llvm/test/Bitcode/compatibility.ll
A llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-whole-wave-functions.mir
A llvm/test/CodeGen/AMDGPU/irtranslator-whole-wave-functions.ll
A llvm/test/CodeGen/AMDGPU/isel-whole-wave-functions.ll
A llvm/test/CodeGen/AMDGPU/whole-wave-functions-pei.mir
A llvm/test/CodeGen/AMDGPU/whole-wave-functions.ll
M llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir
M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll
M llvm/test/Verifier/amdgpu-cc.ll
Log Message:
-----------
[AMDGPU] ISel & PEI for whole wave functions (#145858)
Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.
Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. The inactive
lanes need to be preserved for all registers used, active lanes only for
the CSRs.
At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.
This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
a SReg_1 representing `%active`, which needs to be passed into
SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
special handling of %active. Since the return may be in a different
basic block, it's difficult to add the virtual reg for %active to
SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
marks any used VGPRs as WWM registers, which are then spilled and
restored with the usual logic.
Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault at amd.com>
Co-authored-by: Shilei Tian <i at tianshilei.me>
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list