[PATCH] D75674: [AArch64][SVE] Implement vector tuple intrinsics

Fri May 22 10:10:47 PDT 2020

c-rhodes added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7038
+    assert(Src1.getOpcode() == ISD::CONCAT_VECTORS &&
+           "Unexpected operand for sve_tuple_get");
+
----------------
efriedma wrote:
> c-rhodes wrote:
> > efriedma wrote:
> > > This assertion seems ambitious; I'm not sure how you plan to ensure that the sve.tuple.get intrinsic is in the same basic block as the intrinsic that produces the value.
> > > 
> > > Is there some reason you're putting this handling into the target-independent SelectionDAGBuilder, as opposed to target-specific code?
> > > This assertion seems ambitious; I'm not sure how you plan to ensure that the sve.tuple.get intrinsic is in the same basic block as the intrinsic that produces the value.
> > 
> > Sander's patch D80139 fixes copying between BBs and I've updated this patch to use `EXTRACT_SUBVECTOR` to get the vector from the tuple type, rather than looking through the `CONCAT_VECTOR`, so this assert has been removed.
> > 
> > > Is there some reason you're putting this handling into the target-independent SelectionDAGBuilder, as opposed to target-specific code?
> > 
> > Tuple types for structured loads/stores were originally implemented with sizeless structs downstream but adding sizeless structs to the C standard isn't going to happen, so we chose to represent them as multiples of 128-bit vectors, e.g. `svint32x2_t` -> `<n x 8 x i32>`. For vector tuples containing 2 and 4 vectors (LD2/LD4, ST2/ST4) this was fine as the types are powers of 2 and can be broken down by the existing mechanisms, but for LD3/ST3 the types are odd sizes:
> > 
> >   - `svint8x3_t`   -> `<vscale x 48 x i8>`
> >   - `svint16x3_t` -> `<vscale x 24 x i16>`
> >   - `svint32x3_t` -> `<vscale x 12 x i32>`
> >   - `svint64x2_t` -> `<vscale x 6 x i64>`
> > 
> > Which are problematic for legalisation when it comes to widening/splitting. To do this lowering in target-specific AArch64ISelLowering MVTs would have to be defined for these types which aren't legal. By lowering here we avoid these issues as these types don't reach type legalization.
> If you need to run before legalization, you can put code in AArch64TargetLowering::PerformDAGCombine.
> 
> Alternatively, additional MVTs wouldn't be that terrible.
> If you need to run before legalization, you can put code in AArch64TargetLowering::PerformDAGCombine.

That's a much better idea :) I've implemented your suggestion, cheers!

> Alternatively, additional MVTs wouldn't be that terrible.
I'm a little confused about this, there's a comment at the top of `MachineValueType.h`:

```// This file defines the set of machine-level target independent types which
// legal values in the code generator use.```

which to me implies MVTs should only be implemented for legal types which tuple vectors aren't, although I've also seen `v3i32` which seems to be used in the AMDGPU backend but I don't know if the ISA specifies a register for that. I don't think I want to implements MVTs in this patch but I'm curious if it's ok to define them for illegal types?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75674/new/

https://reviews.llvm.org/D75674