[all-commits] [llvm/llvm-project] e501ed: [AMDGPU] Don't flush vmcnt for loops with use/def ...

Fri Jun 2 23:05:53 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e501ed84aa4768e7008c6127e8573788dcee31ee
      https://github.com/llvm/llvm-project/commit/e501ed84aa4768e7008c6127e8573788dcee31ee
  Author: Austin Kerbow <Austin.Kerbow at amd.com>
  Date:   2023-06-02 (Fri, 02 Jun 2023)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll
    M llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll
    M llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll
    M llvm/test/CodeGen/AMDGPU/flat_atomics_i64_min_max_system.ll
    M llvm/test/CodeGen/AMDGPU/flat_atomics_min_max_system.ll
    M llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll
    M llvm/test/CodeGen/AMDGPU/global-load-saddr-to-vaddr.ll
    M llvm/test/CodeGen/AMDGPU/global-saddr-atomics-min-max-system.ll
    M llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir

  Log Message:
  -----------
  [AMDGPU] Don't flush vmcnt for loops with use/def pairs

Conditions for hoisting vmcnt with flat instructions should be similar to VMEM.
If there are use/def pairs in a loop body, we cannot guarantee that hoisting the
waitcnt will be profitable. Better heuristics are needed to analyse whether the
gains from avoiding waitcnt in loop bodies outweigh the cost of waiting for
loads in the preheader.
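
For illustration only, the rule described above can be modelled outside LLVM as
a small standalone function. This is a simplified sketch, not the code in
SIInsertWaitcnts.cpp: the Inst struct, the shouldFlushVmCntInPreheader name, and
the PendingOutside set are hypothetical stand-ins for the pass's real data
structures, and registers are plain integers. It assumes a flush in the
preheader is only considered when the loop contains a VMEM or flat load and
reads a register whose load is still pending from outside the loop, and that
any use/def pair on a loaded register inside the loop rules the flush out.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
struct Inst {
  bool IsVMemOrFlatLoad;   // true for a VMEM or flat load
  std::vector<int> Uses;   // vector registers read by the instruction
  std::vector<int> Defs;   // vector registers written by the instruction
};

// Returns true if waiting on vmcnt in the loop preheader looks worthwhile
// under the assumptions stated above. PendingOutside holds registers whose
// loads were issued before the loop and may still be in flight.
bool shouldFlushVmCntInPreheader(const std::vector<Inst> &LoopBody,
                                 const std::set<int> &PendingOutside) {
  bool HasLoad = false;
  bool UsesLoadedOutside = false;
  std::set<int> Used;      // registers read so far in the loop
  std::set<int> LoadDefs;  // registers written by loads so far in the loop

  for (const Inst &I : LoopBody) {
    if (I.IsVMemOrFlatLoad)
      HasLoad = true;

    for (int Reg : I.Uses) {
      // Use of a register loaded inside the loop: a waitcnt is needed in the
      // body anyway, so hoisting is not guaranteed to pay off.
      if (LoadDefs.count(Reg))
        return false;
      Used.insert(Reg);
      if (PendingOutside.count(Reg))
        UsesLoadedOutside = true;
    }

    if (I.IsVMemOrFlatLoad) {
      for (int Reg : I.Defs) {
        // A load overwriting a register already read in the loop forms a
        // use/def pair as well.
        if (Used.count(Reg))
          return false;
        LoadDefs.insert(Reg);
      }
    }
  }
  return HasLoad && UsesLoadedOutside;
}
```

In this model, a loop that loads into v1 and then reads v1 returns false, while
a loop that loads into v1 without reading it and also reads a register still
pending from before the loop returns true.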

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D151126