[PATCH] D97459: [CodeGen] Fix issues with subvector intrinsic index types

Thu Feb 25 04:44:27 PST 2021

frasercrmck added inline comments.

================
Comment at: llvm/include/llvm/CodeGen/ISDOpcodes.h:1049

-  /// UBSANTRAP - Trap with an immediate describing the kind of sanitizer failure.
+  /// UBSANTRAP - Trap with an immediate describing the kind of sanitizer
+  /// failure.
----------------
sdesmalen wrote:
> nit: unrelated change.
True. It's something `arcanist` prompted me about when creating the patch. Should probably just be pre-committed?

================
Comment at: llvm/test/CodeGen/AArch64/vecreduce-and-legalization.ll:104
+; CHECK-NEXT:    mov v1.b[11], w8
+; CHECK-NEXT:    mov v1.b[13], w8
+; CHECK-NEXT:    ext v1.16b, v1.16b, v1.16b, #8
----------------
sdesmalen wrote:
> Probably not something that's caused by your patch, but I would have expected that elements `v1.b[12], v1.b[14] and v1.b[15]` to also be set to `-1`?
Perhaps. I'll admit I don't know much about AArch64 but that makes sense. My patch has taken us from:

```
  t0: ch = EntryToken
  t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
  t13: v16i8 = insert_vector_elt t2, Constant:i32<-1>, Constant:i64<9>
  t15: v16i8 = insert_vector_elt t13, Constant:i32<-1>, Constant:i64<10>
  t17: v16i8 = insert_vector_elt t15, Constant:i32<-1>, Constant:i64<11>
  t19: v16i8 = insert_vector_elt t17, Constant:i32<-1>, Constant:i64<12>
  t21: v16i8 = insert_vector_elt t19, Constant:i32<-1>, Constant:i64<13>
  t29: v8i8 = extract_subvector t21, Constant:i64<0>
  t84: v16i8 = insert_subvector undef:v16i8, t29, Constant:i32<0>
  t86: i32 = extract_vector_elt t84, Constant:i64<6>
  // and other extracts from t84 of indices < 8
  // use t21 elsewhere
```

to

```
  t0: ch = EntryToken
  t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
  t13: v16i8 = insert_vector_elt t2, Constant:i32<-1>, Constant:i64<9>
  t15: v16i8 = insert_vector_elt t13, Constant:i32<-1>, Constant:i64<10>
  t17: v16i8 = insert_vector_elt t15, Constant:i32<-1>, Constant:i64<11>
  t19: v16i8 = insert_vector_elt t17, Constant:i32<-1>, Constant:i64<12>
  t21: v16i8 = insert_vector_elt t19, Constant:i32<-1>, Constant:i64<13>
  t86: i32 = extract_vector_elt t2, Constant:i64<6>
  // and other extracts from t2 of indices < 8
  // use t21 elsewhere
```

Which I think is a good thing, since there was previously a redundant extract/insert going on, and a dependency on a vector whose elements different from `t2` weren't demanded.

Now as to *why* this is happening, there's this code in `SelectionDAG.cpp` when creating an `ISD::INSERT_SUBVECTOR`:

```
    // If this is an insert of an extracted vector into an undef vector, we
    // can just use the input to the extract.
    if (N1.isUndef() && N2.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
        N2.getOperand(1) == N3 && N2.getOperand(0).getValueType() == VT)
      return N2.getOperand(0);
```

And I think it would be here `N2.getOperand(1) == N3` failed because you were mixing `i32` and `i64` types between insertion and extraction.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97459/new/

https://reviews.llvm.org/D97459