[all-commits] [llvm/llvm-project] 538a8f: [InstCombine] convert bitcast-shuffle to vector trunc

Sun Apr 5 06:48:21 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 538a8f02271b6de817a6b65e3b70f9f1fd6e428d
      https://github.com/llvm/llvm-project/commit/538a8f02271b6de817a6b65e3b70f9f1fd6e428d
  Author: Sanjay Patel <spatel at rotateright.com>
  Date:   2020-04-05 (Sun, 05 Apr 2020)

  Changed paths:
    M llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
    M llvm/test/Transforms/InstCombine/shuffle-cast.ll
    M llvm/test/Transforms/PhaseOrdering/vector-trunc.ll

  Log Message:
  -----------
  [InstCombine] convert bitcast-shuffle to vector trunc

As discussed in D76983, that patch can turn a chain of insert/extract
with scalar trunc ops into bitcast+extract, and existing instcombine
vector transforms then create a shuffle out of that (see the
PhaseOrdering test for an example). Currently, that process requires
at least this pass sequence: -instcombine -early-cse -instcombine.

Before D76983, the sequence of insert/extract would reach the SLP
vectorizer and become a vector trunc there.

Based on a small sampling of public targets/types, converting the
shuffle to a trunc is better for codegen in most cases (and a
regression of that form is the reason this was noticed). The trunc is
clearly better for IR-level analysis as well.
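
As a rough illustration of why the two forms are interchangeable, the
sketch below models the pattern in plain C++ rather than in IR. It is
not taken from the patch: the <4 x i32> to <4 x i16> shape, the type
aliases, and all function names are assumptions chosen for the example.
It reinterprets four 32-bit lanes as eight 16-bit lanes (the bitcast),
picks every second lane (the shuffle), and compares that with a
per-lane truncation. The starting lane is assumed to be 0 on a
little-endian host and 1 on a big-endian host.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the IR vector types in this example.
using Wide   = std::array<std::uint32_t, 4>;  // models <4 x i32>
using Halves = std::array<std::uint16_t, 8>;  // models <8 x i16>
using Narrow = std::array<std::uint16_t, 4>;  // models <4 x i16>

// Runtime probe: true if the low byte of an integer is stored first.
inline bool hostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char firstByte = 0;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}

// Models "bitcast to <8 x i16>" followed by a shuffle that selects
// every second 16-bit lane (the lane holding the low half of each i32).
inline Narrow narrowViaBitcastShuffle(const Wide &v) {
  static_assert(sizeof(Wide) == sizeof(Halves), "bitcast needs equal sizes");
  Halves halves{};
  std::memcpy(halves.data(), v.data(), sizeof(Wide));
  // The low half sits at lane 0 on little-endian, lane 1 on big-endian.
  const std::size_t first = hostIsLittleEndian() ? 0 : 1;
  Narrow out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = halves[first + 2 * i];
  return out;
}

// Models "trunc <4 x i32> to <4 x i16>": keep the low 16 bits per lane.
inline Narrow narrowViaTrunc(const Wide &v) {
  Narrow out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint16_t>(v[i]);
  return out;
}

// Asserts that both formulations agree for a given input.
inline void checkEquivalence(const Wide &v) {
  assert(narrowViaBitcastShuffle(v) == narrowViaTrunc(v));
}
```

Calling checkEquivalence on any input should pass on either byte order,
since the shuffle only ever selects the low half of each wide lane.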

This means that we can induce "spontaneous vectorization" without
invoking any explicit vectorizer passes (at least a vector cast op
may be created out of scalar casts), but that seems to be the right
choice given that we started with a chain of insert/extract, and the
backend would expand back to that chain if a target does not support
the op.

Differential Revision: https://reviews.llvm.org/D77299