[all-commits] [llvm/llvm-project] 58a2cb: [GlobalISel] Add a new artifact combiner for unmer...

Fri Jul 9 23:01:32 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 58a2cb51436659cce341f3039414de88a3527332
      https://github.com/llvm/llvm-project/commit/58a2cb51436659cce341f3039414de88a3527332
  Author: Amara Emerson <amara at apple.com>
  Date:   2021-07-09 (Fri, 09 Jul 2021)

  Changed paths:
    M llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
    M llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
    M llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
    A llvm/test/CodeGen/AArch64/GlobalISel/artifact-find-value.mir
    M llvm/test/CodeGen/AArch64/GlobalISel/legalize-inserts.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-and.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-freeze.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-or.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-select.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-xor.mir

  Log Message:
  -----------
  [GlobalISel] Add a new artifact combiner for unmerge which looks through general artifact expressions.

The original motivation for this was to implement moreElementsVector of shuffles
on AArch64, which resulted in complex sequences of artifacts like unmerge(unmerge(concat...))
which the combiner couldn't handle. It seemed here that the better option,
instead of writing ever-more-complex combines, was to have a way to find
the original "non-artifact" source registers for a given definition, walking
through arbitrary expressions of unmerge/concat/insert. As long as the bits
aren't extended or truncated, this is a pretty simple algorithm that avoids
the need for lots of combines and instead jumps straight to the final result
we want.

I've only used this new technique in 2 places within tryCombineUnmerge, using it
in more general situations resulted in infinite loops in AMDGPU. So for now
it's used when we would otherwise fail to combine and that seems to work.

In order to support looking through G_INSERTs, I also had to add it as an
artifact in isArtifact(), which caused a whole lot of issues in tests. AMDGPU
started infinite looping since full legalization of G_INSERT doensn't seem to
be there. To work around this, I've temporarily added a CLI option to use the
old behaviour so that the MIR tests will still run and terminate.

Other minor changes include no longer making >128b G_MERGE/UNMERGE legal.
We never had isel support for that anyway and it was a remnant of the legacy
legalizer rules. However being legal prevented the combiner from checking if it
was dead and deleting them.

Differential Revision: https://reviews.llvm.org/D104355