[all-commits] [llvm/llvm-project] 5bd1fe: [AMDGPU] Fix alignment requirements for 96bit and ...

Fri Aug 21 03:31:08 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 5bd1febe214f166b93d95ca3007bcb9318c3ae79
      https://github.com/llvm/llvm-project/commit/5bd1febe214f166b93d95ca3007bcb9318c3ae79
  Author: Mirko Brkusanin <Mirko.Brkusanin at amd.com>
  Date:   2020-08-21 (Fri, 21 Aug 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
    M llvm/test/CodeGen/AMDGPU/ds-combine-with-dependence.ll
    M llvm/test/CodeGen/AMDGPU/ds_read2.ll
    M llvm/test/CodeGen/AMDGPU/ds_write2.ll
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll

  Log Message:
  -----------
  [AMDGPU] Fix alignment requirements for 96bit and 128bit local loads and stores

Adjust alignment requirements for ds_read/write_b96/b128.
GFX9 and onwards allow misaligned access for reads and writes but only if
SH_MEM_CONFIG.alignment_mode allows it.
UnalignedDSAccess is set on GCN subtargets from GFX9 onward to let us know if we
can relax alignment requirements.
UnalignedAccessMode acts similarly to UnalignedBufferAccess for DS instructions
but only from GFX9 onward and is supposed to match alignment_mode. By default,
an alignment of 4 is required.

Differential Revision: https://reviews.llvm.org/D82788

  Commit: f5cd7ec9f3fc969ff5e1feed961996844333de3b
      https://github.com/llvm/llvm-project/commit/f5cd7ec9f3fc969ff5e1feed961996844333de3b
  Author: Mirko Brkusanin <Mirko.Brkusanin at amd.com>
  Date:   2020-08-21 (Fri, 21 Aug 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/load-constant.96.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll
    M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
    M llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll
    M llvm/test/CodeGen/AMDGPU/unaligned-load-store.ll
    M llvm/test/CodeGen/MIR/AMDGPU/llc-target-cpu-attr-from-cmdline-ir.mir
    M llvm/test/CodeGen/MIR/AMDGPU/llc-target-cpu-attr-from-cmdline.mir
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/adjust-alloca-alignment.ll
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll

  Log Message:
  -----------
  [AMDGPU] Reorganize GCN subtarget features for unaligned access

Features UnalignedBufferAccess and UnalignedDSAccess are now used to determine
whether hardware supports such access.
UnalignedAccessMode should be used to enable them.
hasUnalignedBufferAccessEnabled() and hasUnalignedDSAccessEnabled() can now be
used to quickly check both.

Differential Revision: https://reviews.llvm.org/D84522
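
The split between "hardware supports it" and "it is enabled" described in
this log message can be sketched as below. This is a minimal illustration,
not the code from the patch: SubtargetSketch is a hypothetical stand-in for
the real subtarget class, and the bodies of the two helpers are an assumption
based on the statement that they check both the feature and the mode.

```cpp
// Hypothetical, simplified stand-in for the subtarget class (assumption:
// the real class has many more members and is generated from AMDGPU.td).
struct SubtargetSketch {
  bool UnalignedBufferAccess = false; // hardware supports unaligned buffer access
  bool UnalignedDSAccess = false;     // hardware supports unaligned DS access
  bool UnalignedAccessMode = false;   // unaligned access mode is enabled

  // Assumed semantics: true only if the hardware feature is present
  // and the access mode is enabled.
  bool hasUnalignedBufferAccessEnabled() const {
    return UnalignedBufferAccess && UnalignedAccessMode;
  }

  bool hasUnalignedDSAccessEnabled() const {
    return UnalignedDSAccess && UnalignedAccessMode;
  }
};
```

With this shape, a subtarget that advertises UnalignedDSAccess but leaves
UnalignedAccessMode off still reports unaligned DS access as not enabled.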

  Commit: d17ea67b92f6611a169dbd4a1399664078283648
      https://github.com/llvm/llvm-project/commit/d17ea67b92f6611a169dbd4a1399664078283648
  Author: Mirko Brkusanin <Mirko.Brkusanin at amd.com>
  Date:   2020-08-21 (Fri, 21 Aug 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.td
    M llvm/lib/Target/AMDGPU/AMDGPUGISel.td
    M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
    M llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
    M llvm/lib/Target/AMDGPU/DSInstructions.td
    M llvm/lib/Target/AMDGPU/SIInstrInfo.td
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
    A llvm/test/CodeGen/AMDGPU/GlobalISel/load-local.128.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/load-local.96.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/load-unaligned.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.128.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.96.ll

  Log Message:
  -----------
  [AMDGPU][GlobalISel] Fix 96 and 128 local loads and stores

Fix local ds_read/write_b96/b128 so they can be selected if the alignment
allows. Otherwise, either pick appropriate ds_read2/write2 instructions or break
them down.

Differential Revision: https://reviews.llvm.org/D81638
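
The fallback order described in this log message (single wide instruction if
the alignment allows, otherwise a read2/write2 form, otherwise break the
access down) can be sketched for a 128-bit local load as below. This is only
an illustration: pick128BitLocalLoad is a hypothetical helper, the 16-byte
and 8-byte thresholds are assumptions chosen for the example, and the real
selection is done by the instruction selector patterns touched by the patch.

```cpp
#include <string>

// Hypothetical helper, not part of LLVM. Returns the name of the strategy a
// selector might use for a 128-bit local (LDS) load.
// Assumptions for illustration only:
//   - ds_read_b128 needs 16-byte alignment unless unaligned DS access is
//     enabled,
//   - ds_read2_b64 needs 8-byte alignment,
//   - anything else is split into smaller loads.
std::string pick128BitLocalLoad(unsigned AlignInBytes,
                                bool UnalignedDSEnabled) {
  if (UnalignedDSEnabled || AlignInBytes >= 16)
    return "ds_read_b128";
  if (AlignInBytes >= 8)
    return "ds_read2_b64";
  return "split";
}
```

The same ordering would apply to stores, with the ds_write forms in place of
the ds_read forms.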

  Commit: 0654ff703d4e99423133165db63083b831efb9b6
      https://github.com/llvm/llvm-project/commit/0654ff703d4e99423133165db63083b831efb9b6
  Author: Mirko Brkusanin <Mirko.Brkusanin at amd.com>
  Date:   2020-08-21 (Fri, 21 Aug 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/ds_read2.ll
    M llvm/test/CodeGen/AMDGPU/ds_write2.ll
    M llvm/test/CodeGen/AMDGPU/insert-subvector-unused-scratch.ll
    M llvm/test/CodeGen/AMDGPU/lds-misaligned-bug.ll
    M llvm/test/CodeGen/AMDGPU/load-local-f32.ll
    M llvm/test/CodeGen/AMDGPU/load-local-i16.ll
    M llvm/test/CodeGen/AMDGPU/load-local-i32.ll
    M llvm/test/CodeGen/AMDGPU/load-local-i8.ll
    A llvm/test/CodeGen/AMDGPU/load-local.128.ll
    A llvm/test/CodeGen/AMDGPU/load-local.96.ll
    A llvm/test/CodeGen/AMDGPU/store-local.128.ll
    A llvm/test/CodeGen/AMDGPU/store-local.96.ll
    M llvm/test/CodeGen/AMDGPU/store-local.ll

  Log Message:
  -----------
  [AMDGPU] Use ds_read/write_b96/b128 when possible for SDag

Do not break down local loads and stores in ISelLowering, so that
ds_read/write_b96/b128 can be selected on subtargets that support them and if
the alignment requirements allow it.

Differential Revision: https://reviews.llvm.org/D84403

Compare: https://github.com/llvm/llvm-project/compare/c66b82f14cc7...0654ff703d4e