[all-commits] [llvm/llvm-project] 20d839: [AMDGPU] ISel & PEI for whole wave functions (#145...

Diana Picus via All-commits all-commits at lists.llvm.org
Mon Jul 21 01:39:31 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 20d8398825a799008ae508d8463dbb9b11df81e7
      https://github.com/llvm/llvm-project/commit/20d8398825a799008ae508d8463dbb9b11df81e7
  Author: Diana Picus <Diana-Magda.Picus at amd.com>
  Date:   2025-07-21 (Mon, 21 Jul 2025)

  Changed paths:
    M llvm/docs/AMDGPUUsage.rst
    M llvm/include/llvm/AsmParser/LLToken.h
    M llvm/include/llvm/IR/CallingConv.h
    M llvm/lib/AsmParser/LLLexer.cpp
    M llvm/lib/AsmParser/LLParser.cpp
    M llvm/lib/IR/AsmWriter.cpp
    M llvm/lib/IR/Function.cpp
    M llvm/lib/IR/Verifier.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h
    M llvm/lib/Target/AMDGPU/AMDGPUGISel.td
    M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
    M llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
    M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
    M llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.h
    M llvm/lib/Target/AMDGPU/SIInstructions.td
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
    M llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
    M llvm/lib/Target/AMDGPU/Utils/AMDGPUPALMetadata.cpp
    M llvm/test/Bitcode/compatibility.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-whole-wave-functions.mir
    A llvm/test/CodeGen/AMDGPU/irtranslator-whole-wave-functions.ll
    A llvm/test/CodeGen/AMDGPU/isel-whole-wave-functions.ll
    A llvm/test/CodeGen/AMDGPU/whole-wave-functions-pei.mir
    A llvm/test/CodeGen/AMDGPU/whole-wave-functions.ll
    M llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll
    M llvm/test/Verifier/amdgpu-cc.ll

  Log Message:
  -----------
  [AMDGPU] ISel & PEI for whole wave functions (#145858)

Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.

Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. The inactive
lanes need to be preserved for all registers used, active lanes only for
the CSRs.

At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.

This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
  used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
  a SReg_1 representing `%active`, which needs to be passed into
  SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
  special handling of %active. Since the return may be in a different
  basic block, it's difficult to add the virtual reg for %active to
  SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
  which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
  marks any used VGPRs as WWM registers, which are then spilled and
  restored with the usual logic.

Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault at amd.com>
Co-authored-by: Shilei Tian <i at tianshilei.me>



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list