[llvm] [X86] Handle repeated blend mask in combineConcatVectorOps (PR #82155)

Sun Feb 18 00:18:33 PST 2024

================
@@ -55226,6 +55226,11 @@ static SDValue combineConcatVectorOps(const SDLoc &DL, MVT VT,
       if (NumOps == 2 && VT.is512BitVector() && Subtarget.useBWIRegs()) {
         uint64_t Mask0 = Ops[0].getConstantOperandVal(2);
         uint64_t Mask1 = Ops[1].getConstantOperandVal(2);
+        // MVT::v16i16 has repeated blend mask.
+        if (Op0.getSimpleValueType() == MVT::v16i16) {
----------------
XinWang10 wrote:

I think this is only needed for v16i16. Take this example:
`t87: v16i16 = X86ISD::BLENDI t132, t58, TargetConstant:i8<-86>`.
There are 16 elements but only an 8-bit mask, so the same 8 bits are reused to select the upper 8 elements (this may be related to the VEX.256 VPBLENDW instruction). In the other cases the mask has enough bits to select every element.
I also found a clue in the code: if you grep for X86ISD::BLENDI, you will find this existing comment:
`    // blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) to narrower types.
    // TODO: Handle MVT::v16i16 repeated blend mask.`
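
To make the repeated-mask point concrete, here is a minimal standalone sketch of the arithmetic, not the code from the patch. The helper names `expandRepeatedBlendMask` and `concatBlendMasks` are hypothetical and exist only for illustration; the sketch assumes the 8-bit immediate applies identically to the low and high 8 elements of a v16i16 blend, as described above.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): widen the 8-bit immediate of a
// v16i16 blend to one mask bit per element by repeating it for the upper
// 8 elements.
constexpr uint64_t expandRepeatedBlendMask(uint64_t Imm8) {
  return ((Imm8 & 0xFF) << 8) | (Imm8 & 0xFF);
}

// Hypothetical helper (not an LLVM API): join the per-element masks of the
// low and high halves into the mask of the concatenated vector.
constexpr uint64_t concatBlendMasks(uint64_t Lo, uint64_t Hi,
                                    unsigned EltsPerHalf) {
  return (Hi << EltsPerHalf) | Lo;
}

// TargetConstant:i8<-86> is 0xAA; repeated over 16 elements it is 0xAAAA.
static_assert(expandRepeatedBlendMask(0xAA) == 0xAAAA,
              "8-bit mask repeats for the upper 8 elements");

// Two expanded v16i16 masks form a 32-bit mask for the v32i16 blend.
static_assert(concatBlendMasks(0xAAAA, 0x5555, 16) == 0x5555AAAA,
              "high half occupies bits 16..31");
```

Without the expansion step, shifting the second raw 8-bit mask by 16 would leave bits 8..15 and 24..31 of the combined mask as zero, which would not match what the two original v16i16 blends select.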

https://github.com/llvm/llvm-project/pull/82155