[all-commits] [llvm/llvm-project] 4db742: [AMDGPU] Improve zeroesHigh16BitsOfDest for GFX9 l...

Wed Dec 15 05:15:30 PST 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4db74227719324786824fa76004a06d0147b7a85
      https://github.com/llvm/llvm-project/commit/4db74227719324786824fa76004a06d0147b7a85
  Author: Jay Foad <jay.foad at amd.com>
  Date:   2021-12-15 (Wed, 15 Dec 2021)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/test/CodeGen/AMDGPU/high-bits-zeroed-16-bit-ops.mir

  Log Message:
  -----------
  [AMDGPU] Improve zeroesHigh16BitsOfDest for GFX9 legacy opcodes

Pseudos like V_MAD_U16 and V_FMA_F16 map down to what GFX9 calls
v_mad_legacy_u16 and v_fma_legacy_f16, which are documented to have the
same zeroing behaviour as on GFX8.
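
As a rough illustration of the rule described above, the following is a simplified, self-contained model, not the actual code in AMDGPUSubtarget.cpp. The `Gen` and `Op` enums and the free function `zeroesHigh16` are hypothetical stand-ins. Treating the `_gfx9` pseudo as not zeroing the high half is an assumption made for contrast.

```cpp
// Simplified model. Gen, Op and zeroesHigh16 are hypothetical stand-ins,
// not the real LLVM definitions.
enum class Gen { GFX8, GFX9, GFX10 };
enum class Op { V_MAD_U16, V_FMA_F16, V_FMA_F16_gfx9 };

// Returns true when the 16-bit result is known to be written with the
// upper 16 bits of the 32-bit destination register cleared.
constexpr bool zeroesHigh16(Gen G, Op O) {
  switch (O) {
  case Op::V_MAD_U16:
  case Op::V_FMA_F16:
    // Legacy pseudos: on GFX9 these map to v_mad_legacy_u16 and
    // v_fma_legacy_f16, which zero the high half just as on GFX8.
    return G == Gen::GFX8 || G == Gen::GFX9;
  case Op::V_FMA_F16_gfx9:
    // Assumption: the non-legacy encoding gives no zeroing guarantee.
    return false;
  }
  return false;
}
```

In this model, a query for V_MAD_U16 or V_FMA_F16 answers true on both GFX8 and GFX9, which is the improvement the commit describes.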

Differential Revision: https://reviews.llvm.org/D115729

  Commit: 54fc9eb9b313497cf34ae569391e76db56b65e70
      https://github.com/llvm/llvm-project/commit/54fc9eb9b313497cf34ae569391e76db56b65e70
  Author: Jay Foad <jay.foad at amd.com>
  Date:   2021-12-15 (Wed, 15 Dec 2021)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-add-fma-mul.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-add-mul.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/fma.ll
    M llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll

  Log Message:
  -----------
  [AMDGPU] Use v_fma_f16 on GFX10

Teach convertToThreeAddress to use the V_FMA_F16_gfx9 pseudo (i.e. the
standard instruction in GFX9 onwards) instead of V_FMA_F16 (the legacy
pseudo for GFX8 compatibility, which is no longer supported in GFX10).
This follows the example of macToMad in SIFoldOperands.
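
The opcode choice described above can be sketched as a small standalone model. It is not the code in SIInstrInfo.cpp: `Generation`, `FmaPseudo`, `isSupported` and `threeAddressFmaF16` are hypothetical names. The support table encodes only what the message states: the legacy pseudo is unavailable on GFX10, and the `_gfx9` pseudo is available from GFX9 onwards.

```cpp
// Simplified model. All names here are hypothetical stand-ins,
// not the real LLVM definitions.
enum class Generation { GFX8, GFX9, GFX10 };
enum class FmaPseudo { V_FMA_F16, V_FMA_F16_gfx9 };

// Whether a pseudo can be selected on a given generation, per the
// commit message: legacy form up to GFX9, standard form from GFX9 on.
constexpr bool isSupported(Generation G, FmaPseudo P) {
  return P == FmaPseudo::V_FMA_F16 ? G != Generation::GFX10
                                   : G != Generation::GFX8;
}

// The pseudo a three-address conversion would pick for an f16 FMA
// after this change: always the standard (non-legacy) form.
constexpr FmaPseudo threeAddressFmaF16() {
  return FmaPseudo::V_FMA_F16_gfx9;
}
```

Under this model the previously chosen legacy pseudo fails the support check on GFX10, while the pseudo returned by `threeAddressFmaF16` passes it.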

Differential Revision: https://reviews.llvm.org/D115731

Compare: https://github.com/llvm/llvm-project/compare/d930c3155c1b...54fc9eb9b313