[all-commits] [llvm/llvm-project] fdaa2d: [AMDGPU] Use divergent addresses for vector loads

Tue Feb 23 05:34:00 PST 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fdaa2d02591b10c96ca8705041dfc75fcbf97095
      https://github.com/llvm/llvm-project/commit/fdaa2d02591b10c96ca8705041dfc75fcbf97095
  Author: Jay Foad <jay.foad at amd.com>
  Date:   2021-02-23 (Tue, 23 Feb 2021)

  Changed paths:
    M llvm/test/CodeGen/AMDGPU/idot2.ll
    M llvm/test/CodeGen/AMDGPU/idot4s.ll
    M llvm/test/CodeGen/AMDGPU/idot4u.ll
    M llvm/test/CodeGen/AMDGPU/idot8s.ll
    M llvm/test/CodeGen/AMDGPU/idot8u.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.buffer.load.ll
    M llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll

  Log Message:
  -----------
  [AMDGPU] Use divergent addresses for vector loads

Change some test cases to use divergent addresses for vector loads,
which should be the common case in real world code. Using uniform
addresses causes poor instruction selection for the surrounding
code which has to be fixed up post-register-allocation, and this causes
a lot of testsuite churn for a forthcoming patch to stop selecting
24-bit vector multiply instructions for uniform multiplies.

This shows up some problems in the idot tests where we fail to select
v_dot instructions because the patterns only match MUL_[UI]24 ISD nodes,
but the DAG contains i16 mul nodes instead.

Differential Revision: https://reviews.llvm.org/D97062