[all-commits] [llvm/llvm-project] f104eb: [AMDGPU] Reintroduce CC exception for non-inlined ...

Tue May 23 00:01:53 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f104eb6e15503b770734e3a59937c9df865b2814
      https://github.com/llvm/llvm-project/commit/f104eb6e15503b770734e3a59937c9df865b2814
  Author: pvanhout <pierre.vanhoutryve at amd.com>
  Date:   2023-05-23 (Tue, 23 May 2023)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
    M llvm/test/CodeGen/AMDGPU/vector-alloca-limits.ll

  Log Message:
  -----------
  [AMDGPU] Reintroduce CC exception for non-inlined functions in Promote Alloca limits

This is a partial revert of https://reviews.llvm.org/D145586 (fd1d60873fdc).

D145586 was originally introduced to help with SWDEV-363662, and it did,
but it also caused a 25% drop in performance in some MIOpen benchmarks
where, it seems, functions are inlined more conservatively.

This patch restores the pre-D145586 behavior for PromoteAlloca:
functions with a non-entry CC have a 32-VGPR threshold, but only
if the function is not marked with "alwaysinline".
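
The rule can be illustrated with a small standalone sketch. It is not the
code from AMDGPUPromoteAlloca.cpp: FunctionInfo, promoteAllocaVGPRBudget and
the DefaultBudget parameter are hypothetical names, and it assumes the
32-VGPR threshold is applied as an upper bound on whatever budget the pass
would otherwise use.

```cpp
#include <algorithm>

// Hypothetical stand-in for the two function properties the rule depends on.
struct FunctionInfo {
  bool IsEntryCC;      // true for an entry calling convention (e.g. a kernel)
  bool IsAlwaysInline; // true if the function carries "alwaysinline"
};

// The threshold named in the commit message for non-entry functions.
constexpr unsigned NonEntryVGPRCap = 32;

// Returns the VGPR budget under the restored rule. DefaultBudget is an
// assumed input: the budget the pass would use with no CC exception.
unsigned promoteAllocaVGPRBudget(const FunctionInfo &F,
                                 unsigned DefaultBudget) {
  // Only non-entry functions that are not guaranteed to be inlined are capped.
  if (!F.IsEntryCC && !F.IsAlwaysInline)
    return std::min(DefaultBudget, NonEntryVGPRCap);
  return DefaultBudget;
}
```

Under these assumptions, an "alwaysinline" callee keeps the same budget as
an entry function, because it is expected to be merged into its caller.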

A good deal of AMDGPU code makes use of the AMDGPUAlwaysInline pass
anyway, so "alwaysinline" seems very common in our backend.

This change does not affect SWDEV-363662 (the motivating issue for introducing D145586).

Fixes SWDEV-399519

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D150551