[llvm-dev] Questions about type based "createBuildVecShuffle()" in DAG Combiner
Eli Friedman via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 1 12:48:31 PDT 2019
When shufflevector was first introduced to IR and SelectionDAG, it required the output and input types to be identical. The IR shufflevector was later extended to allow arbitrary output types, but the SelectionDAG SHUFFLE_VECTOR was never changed.
This has generally worked out okay for most uses because size-changing shuffles are rare, and a lot of the interesting size-changing shuffles can be expressed in terms of CONCAT_VECTORS or EXTRACT_SUBVECTOR. But maybe it's worth revisiting shuffle representations at the SelectionDAG level.
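To make the EXTRACT_SUBVECTOR point concrete, here is a rough standalone C++ model (not LLVM code). The helpers `shuffleSameType`, `extractSubvector`, and `narrowingShuffle` are hypothetical names that only mimic the node semantics on plain arrays. Under those assumptions, a shuffle from two v4i16 inputs to a v2i16 result can be modeled as a same-type v4i16 shuffle followed by extracting the low two lanes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<int16_t>;

// Models a same-type shuffle: the result has as many lanes as each input.
// Mask value m in [0, N) selects A[m]; m in [N, 2N) selects B[m - N];
// a negative value means "undef" (modeled here as 0).
Vec shuffleSameType(const Vec &A, const Vec &B, const std::vector<int> &Mask) {
  assert(A.size() == B.size() && Mask.size() == A.size());
  const int N = static_cast<int>(A.size());
  Vec Out(A.size(), 0);
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // undef lane
    assert(M < 2 * N);
    Out[I] = M < N ? A[M] : B[M - N];
  }
  return Out;
}

// Models extracting NumElts contiguous lanes starting at lane Idx.
Vec extractSubvector(const Vec &V, std::size_t Idx, std::size_t NumElts) {
  assert(Idx + NumElts <= V.size());
  return Vec(V.begin() + Idx, V.begin() + Idx + NumElts);
}

// A narrowing shuffle expressed with same-type pieces: pad the mask with
// undef lanes, shuffle at the input width, then keep only the low lanes.
Vec narrowingShuffle(const Vec &A, const Vec &B, std::vector<int> Mask) {
  const std::size_t OutElts = Mask.size();
  assert(OutElts <= A.size());
  Mask.resize(A.size(), -1);
  return extractSubvector(shuffleSameType(A, B, Mask), 0, OutElts);
}
```

For instance, calling `narrowingShuffle` with the mask `{1, 6}` picks lane 1 of the first input and lane 2 of the second.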
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Wei Zhao via llvm-dev
Sent: Thursday, October 31, 2019 1:51 PM
To: llvm-dev at lists.llvm.org
Subject: [EXT] [llvm-dev] Questions about type based "createBuildVecShuffle()" in DAG Combiner
In LLVM DAG Combiner, DAGCombiner::createBuildVecShuffle() is type based.
// We can't generate a shuffle node with mismatched input and output types.
// Try to make the types match the type of the output.
1. The code following the above comment tries to match the types of the input vectors to the type of the output vector. Why does the code assume that a vector shuffle can only be created when the types match?
A shuffle takes some fields of data from the input vectors and reassembles them in the output vector. It is purely a data movement operation. The input vector is the container for the source data, and the output vector is the container for the resulting data. Why do these two containers have to have the same vector type?
For example:
VT's type: v2i16
VecIn1 and VecIn2's type: v4i16
We take two i16 elements, one from VecIn1 and one from VecIn2. With the current code, no shuffle is generated because the input and output types differ.
The requirement for creating a shuffle operation should be that the capacity (SizeInBits) of the output vector can hold all the data extracted from the input vector containers.
So as long as the total SizeInBits of the data extracted from the input vectors does not exceed the total SizeInBits of the output vector, creating the shuffle should be allowed. Of course, some other checks are needed, for example that no two elements are placed at the same output index.
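The criterion proposed above can be sketched as a small standalone C++ check. This models the proposal only and is not something LLVM performs; `ExtractedElt` and `fitsProposedCriterion` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// One element moved by the (hypothetical) shuffle: the output lane it
// lands in and its width in bits.
struct ExtractedElt {
  unsigned DstLane;
  unsigned EltBits;
};

// Returns true if the proposed criterion holds: every element targets an
// existing output lane, no two elements target the same lane, and the
// total extracted size does not exceed the output vector's SizeInBits.
bool fitsProposedCriterion(const std::vector<ExtractedElt> &Elts,
                           unsigned OutNumElts, unsigned OutEltBits) {
  assert(OutNumElts > 0 && OutEltBits > 0);
  unsigned TotalBits = 0;
  std::set<unsigned> UsedLanes;
  for (const ExtractedElt &E : Elts) {
    if (E.DstLane >= OutNumElts)
      return false; // lane does not exist in the output
    if (!UsedLanes.insert(E.DstLane).second)
      return false; // two elements would land in the same lane
    TotalBits += E.EltBits;
  }
  return TotalBits <= OutNumElts * OutEltBits;
}
```

With a v2i16 output (2 lanes of 16 bits), two i16 elements placed in lanes 0 and 1 pass the check, while a third element or a repeated lane fails it.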
2. Another inconsistency is the vector split performed right before createBuildVecShuffle():
// If all the Operands of BUILD_VECTOR extract from same
// vector, then split the vector efficiently based on the maximum
// vector access index and adjust the VectorMask and
// VecIn accordingly.
This split creates a new vector type which most likely will not be the same as the output vector type. For example, if the input container and the output container both had type v8i16 before the split, the input vector will have type v4i16 after it. Again, this means that the type-based createBuildVecShuffle() will not create a shuffle later, so some shuffle operations are missed. This type-based shuffle node creation makes many optimizations error-prone.
It looks like creating shuffle nodes based on the input/output container types will miss some shuffle operations, which makes the generated code less efficient.
Any comments on why it was designed this way in the first place?
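As a simplified illustration of how such a split changes the input type, the standalone C++ model below always splits the source in the middle and rewrites each lane index as a (half, index-in-half) pair. It does not reproduce the conditions the DAG combiner actually checks; `HalfRef` and `remapToHalves` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Where an extracted element lives after the source vector is split in two.
struct HalfRef {
  unsigned Half;  // 0 = low half, 1 = high half
  unsigned Index; // lane index within that half
};

// Splits a NumElts-lane source into two halves of NumElts / 2 lanes each
// and rewrites every original lane index as (half, index-in-half).
std::vector<HalfRef> remapToHalves(const std::vector<unsigned> &Indices,
                                   unsigned NumElts) {
  assert(NumElts >= 2 && NumElts % 2 == 0 &&
         "model only handles an even, non-zero lane count");
  const unsigned SplitSize = NumElts / 2;
  std::vector<HalfRef> Out;
  Out.reserve(Indices.size());
  for (unsigned Idx : Indices) {
    assert(Idx < NumElts);
    Out.push_back({Idx / SplitSize, Idx % SplitSize});
  }
  return Out;
}
```

For a v8i16 source, each half has 4 lanes (v4i16), so the inputs no longer have the same type as a v8i16 output even though every extracted element is still addressable.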