[all-commits] [llvm/llvm-project] 30bb11: [AMDGPU][NFC] Refactor emitEntryFunctionPrologue

Scott Linder via All-commits all-commits at lists.llvm.org
Thu Mar 19 12:40:42 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 30bb113beb321063a3224c1a48f4f2c22dc04af1
      https://github.com/llvm/llvm-project/commit/30bb113beb321063a3224c1a48f4f2c22dc04af1
  Author: Scott Linder <Scott.Linder at amd.com>
  Date:   2020-03-19 (Thu, 19 Mar 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
    M llvm/lib/Target/AMDGPU/SIFrameLowering.h

  Log Message:
  -----------
  [AMDGPU][NFC] Refactor emitEntryFunctionPrologue

Remove dead code and factor repeated conditions out into a single check.
Rename and move code to make it more obvious what is running only for
entry functions. Simplify function arguments to make it clearer what the
relevant inputs are. Make flat scratch init accept an MBB iterator and
move it to where it was logically being emitted within the prologue.

These changes will make a future update to the calling convention
simpler.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75092


  Commit: db099f994b5fb14209e29487b87bc2be54b3725d
      https://github.com/llvm/llvm-project/commit/db099f994b5fb14209e29487b87bc2be54b3725d
  Author: Scott Linder <Scott.Linder at amd.com>
  Date:   2020-03-19 (Thu, 19 Mar 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h

  Log Message:
  -----------
  [AMDGPU][NFC] Refactor some uses of unsigned to Register

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76035


  Commit: 60b1967c3933c42f473a3f7be5f40747547b6057
      https://github.com/llvm/llvm-project/commit/60b1967c3933c42f473a3f7be5f40747547b6057
  Author: Scott Linder <Scott.Linder at amd.com>
  Date:   2020-03-19 (Thu, 19 Mar 2020)

  Changed paths:
    M llvm/docs/AMDGPUUsage.rst
    M llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
    M llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
    M llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
    M llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
    M llvm/lib/Target/AMDGPU/SIFrameLowering.h
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.h
    M llvm/lib/Target/AMDGPU/SIRegisterInfo.td
    M llvm/test/CodeGen/AMDGPU/GlobalISel/divergent-control-flow.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll
    M llvm/test/CodeGen/AMDGPU/addrspacecast.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll
    M llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
    M llvm/test/CodeGen/AMDGPU/array-ptr-calc-i32.ll
    M llvm/test/CodeGen/AMDGPU/attr-amdgpu-num-sgpr.ll
    M llvm/test/CodeGen/AMDGPU/byval-frame-setup.ll
    M llvm/test/CodeGen/AMDGPU/call-argument-types.ll
    M llvm/test/CodeGen/AMDGPU/call-constant.ll
    M llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
    M llvm/test/CodeGen/AMDGPU/call-waitcnt.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs-fixed-abi.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
    M llvm/test/CodeGen/AMDGPU/captured-frame-index.ll
    A llvm/test/CodeGen/AMDGPU/cc-update.ll
    M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes.ll
    M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
    M llvm/test/CodeGen/AMDGPU/collapse-endcf.ll
    M llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
    M llvm/test/CodeGen/AMDGPU/cross-block-use-is-not-abi-copy.ll
    M llvm/test/CodeGen/AMDGPU/extload-private.ll
    M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
    M llvm/test/CodeGen/AMDGPU/fold-fi-mubuf.mir
    M llvm/test/CodeGen/AMDGPU/frame-index-elimination.ll
    M llvm/test/CodeGen/AMDGPU/frame-lowering-entry-all-sgpr-used.mir
    M llvm/test/CodeGen/AMDGPU/frame-lowering-fp-adjusted.mir
    M llvm/test/CodeGen/AMDGPU/function-returns.ll
    M llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props-v3.ll
    M llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props.ll
    M llvm/test/CodeGen/AMDGPU/idot8s.ll
    M llvm/test/CodeGen/AMDGPU/idot8u.ll
    M llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll
    M llvm/test/CodeGen/AMDGPU/indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
    M llvm/test/CodeGen/AMDGPU/ipra.ll
    M llvm/test/CodeGen/AMDGPU/large-alloca-compute.ll
    M llvm/test/CodeGen/AMDGPU/large-alloca-graphics.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.implicit.buffer.ptr.ll
    M llvm/test/CodeGen/AMDGPU/load-hi16.ll
    M llvm/test/CodeGen/AMDGPU/load-lo16.ll
    M llvm/test/CodeGen/AMDGPU/memory-legalizer-load.ll
    M llvm/test/CodeGen/AMDGPU/memory-legalizer-store.ll
    M llvm/test/CodeGen/AMDGPU/memory_clause.ll
    M llvm/test/CodeGen/AMDGPU/mesa3d.ll
    M llvm/test/CodeGen/AMDGPU/mir-print-dead-csr-fi.mir
    M llvm/test/CodeGen/AMDGPU/misched-killflags.mir
    M llvm/test/CodeGen/AMDGPU/mubuf-offset-private.ll
    M llvm/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir
    M llvm/test/CodeGen/AMDGPU/partial-sgpr-to-vgpr-spills.ll
    M llvm/test/CodeGen/AMDGPU/pei-reg-scavenger-position.mir
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-carry-out.mir
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-gfx9.mir
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr.mir
    M llvm/test/CodeGen/AMDGPU/private-access-no-objects.ll
    M llvm/test/CodeGen/AMDGPU/private-element-size.ll
    M llvm/test/CodeGen/AMDGPU/rename-independent-subregs-mac-operands.mir
    M llvm/test/CodeGen/AMDGPU/sched-assert-dead-def-subreg-use-other-subreg.mir
    M llvm/test/CodeGen/AMDGPU/sched-handleMoveUp-subreg-def-across-subreg-def.mir
    M llvm/test/CodeGen/AMDGPU/scratch-buffer.ll
    M llvm/test/CodeGen/AMDGPU/scratch-simple.ll
    M llvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir
    M llvm/test/CodeGen/AMDGPU/shl_add_ptr.ll
    M llvm/test/CodeGen/AMDGPU/si-spill-sgpr-stack.ll
    M llvm/test/CodeGen/AMDGPU/sibling-call.ll
    R llvm/test/CodeGen/AMDGPU/sp-too-many-input-sgprs.ll
    M llvm/test/CodeGen/AMDGPU/spill-agpr.ll
    M llvm/test/CodeGen/AMDGPU/spill-before-exec.mir
    M llvm/test/CodeGen/AMDGPU/spill-empty-live-interval.mir
    M llvm/test/CodeGen/AMDGPU/spill-m0.ll
    M llvm/test/CodeGen/AMDGPU/spill-offset-calculation.ll
    M llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
    M llvm/test/CodeGen/AMDGPU/stack-realign-kernel.ll
    M llvm/test/CodeGen/AMDGPU/stack-realign.ll
    M llvm/test/CodeGen/AMDGPU/stack-slot-color-sgpr-vgpr-spills.mir
    M llvm/test/CodeGen/AMDGPU/store-hi16.ll
    M llvm/test/CodeGen/AMDGPU/subreg-split-live-in-error.mir
    M llvm/test/CodeGen/AMDGPU/subvector-test.mir
    M llvm/test/CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot.ll
    M llvm/test/CodeGen/AMDGPU/virtregrewrite-undef-identity-copy.mir
    M llvm/test/CodeGen/AMDGPU/wqm.ll
    M llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll
    R llvm/test/CodeGen/MIR/AMDGPU/mfi-parse-error-scratch-wave-offset-reg.mir
    R llvm/test/CodeGen/MIR/AMDGPU/mfi-scratch-wave-offset-reg-class.mir
    M llvm/test/CodeGen/MIR/AMDGPU/parse-order-reserved-regs.mir
    M llvm/test/DebugInfo/AMDGPU/variable-locations.ll

  Log Message:
  -----------
  [AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions

Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in
the entry function prologue. This allows us to removes the scratch wave
offset register from the calling convention ABI.

As part of this change, allow the use of an inline constant zero for the
SOffset of MUBUF instructions accessing the stack in entry functions
when a frame pointer is not requested/required. Entry functions with
calls still need to set up the calling convention ABI stack pointer
register, and reference it in order to address arguments of called
functions. The ABI stack pointer register remains unswizzled, but is now
wave-relative instead of queue-relative.

Non-entry functions also use an inline constant zero SOffset for
wave-relative scratch access, but continue to use the stack and frame
pointers as before. When the stack or frame pointer is converted to a
swizzled offset it is now scaled directly, as the scratch wave offset no
longer needs to be subtracted first.

Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling
convention.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75138


  Commit: 0e9368cc8ca3f1b53926fb82906bc29190516c0f
      https://github.com/llvm/llvm-project/commit/0e9368cc8ca3f1b53926fb82906bc29190516c0f
  Author: Scott Linder <Scott.Linder at amd.com>
  Date:   2020-03-19 (Thu, 19 Mar 2020)

  Changed paths:
    M llvm/docs/AMDGPUUsage.rst
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
    M llvm/test/CodeGen/AMDGPU/call-graph-register-usage.ll
    M llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
    M llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
    M llvm/test/CodeGen/AMDGPU/cc-update.ll
    M llvm/test/CodeGen/AMDGPU/cross-block-use-is-not-abi-copy.ll
    M llvm/test/CodeGen/AMDGPU/frame-index-elimination.ll
    M llvm/test/CodeGen/AMDGPU/frame-lowering-fp-adjusted.mir
    M llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
    M llvm/test/CodeGen/AMDGPU/nested-calls.ll
    M llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-carry-out.mir
    M llvm/test/CodeGen/AMDGPU/sibling-call.ll
    M llvm/test/CodeGen/AMDGPU/spill-csr-frame-ptr-reg-copy.ll
    M llvm/test/CodeGen/AMDGPU/stack-realign.ll
    M llvm/test/CodeGen/AMDGPU/wave32.ll
    M llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll

  Log Message:
  -----------
  [AMDGPU] Move frame pointer from s34 to s33

Remove the gap left between the stack pointer (s32) and frame pointer
(s34) now that the scratch wave offset is no longer a part of the
calling convention ABI.

Update llvm/docs/AMDGPUUsage.rst to reflect the change.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75657


Compare: https://github.com/llvm/llvm-project/compare/430c9a80c17b...0e9368cc8ca3


More information about the All-commits mailing list