[all-commits] [llvm/llvm-project] dbf278: [AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Stanislav Mekhanoshin via All-commits all-commits at lists.llvm.org
Wed Jan 26 14:51:26 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: dbf278b984eebf17a8977e79b67cdb960e9af5db
      https://github.com/llvm/llvm-project/commit/dbf278b984eebf17a8977e79b67cdb960e9af5db
  Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
  Date:   2022-01-26 (Wed, 26 Jan 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
    M llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/lib/Target/AMDGPU/SIInstrInfo.h
    M llvm/lib/Target/AMDGPU/SIInstrInfo.td
    M llvm/lib/Target/AMDGPU/VOP3PInstructions.td
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.ll
    A llvm/test/CodeGen/AMDGPU/mfma-no-register-aliasing.ll
    M llvm/test/CodeGen/AMDGPU/spill-agpr.ll

  Log Message:
  -----------
  [AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Form the MAI spec: It’s ok that Src_C and vDst are the exact same VGPRs
or Src_C and vDst are completely separated. The case that Src_C and vDst
are overlapping should be avoid as new value could be written to accumulator
input before it gets read.

Note that this inevitably increases register pressure to the point where
some programs will become uncompilable.

This patch separates MAC and FMA versions of MFMA instructions using either
tied dst and src2 or earlyclobber dst.

Fixes: SWDEV-318900

Differential Revision: https://reviews.llvm.org/D117844




More information about the All-commits mailing list