[PATCH] D73602: [SVE] Add support for lowering GEPs involving scalable vectors.

Mon Feb 10 11:14:17 PST 2020

fpetrogalli added a comment.

In D73602#1864968 <https://reviews.llvm.org/D73602#1864968>, @efriedma wrote:

> > Could you please describe in a comment in the tests the math behind operations like the following?
>
> I'm not sure what you mean by "the math".  A scalar GEP produces a scalar result, a vector GEP produces a vector result.  A scalar operand to a vector GEP is splatted.  I'm not sure what useful explanation I could add; I could add a comment "This is a vector GEP", but I'm not sure that's helpful; the result type is already explicitly written in the "ret" instruction.

Yeah, I was confused by what the GEP was doing, I thought you where somehow adding the values of the fixed length vector to the first lanes of the scalable one. My bad. No need to add any comment.

>> Could you please tests expressions like the following?
> 
> Sure.

Thanks!

>> You tests the cases for <vscale x 2 x i64>. Could you please test more configurations of <vscale x N x Ty>, at least covering all cases in which sizeof( N x Ty) == 16?
> 
> For vector GEPs, all the math always has to happen in N x i64 (since pointers are 64 bits wide).  So if N>2, we have to run type legalization, and that probably explodes in unrelated ways, so I don't really want to do that. I can add more tests for scalar GEPs over vector pointee types; that shouldn't require more patterns, I think.

Yes, I was not asking to modify the types of the RHS argument, I was asking to modify the type of the base pointer, so that you tests things like `getelementptr <vscale x 2 x i8>, <vscale x 2 x i8>*, i64 %whatever`

>> To make it work, I had to add handling of getelementpr when adding a constant int
> 
> What's the difference between the DAG produced by your patch, vs. my patch alone?

I have explained it in the comment of the first example.

================
Comment at: llvm/test/CodeGen/AArch64/sve-gep.ll:7-8
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    rdvl x8, #1
+; CHECK-NEXT:    add x0, x0, x8, lsl #2
+; CHECK-NEXT:    ret
----------------
These two instructions could be replaced by a single one: `addvl x0, x0, #4`. Do you think you can change your patch to generate this sequence, or should this be done in the back-end?

Also, the code I wrote in the patch at https://reviews.llvm.org/D74254 generates the following node for the base address out of this example (well, my patches uses size in bits, but I recon it should be bytes):

```
(ISD::ADD, %base (ISD::VSCALE 16))
```

It essentially encodes the multiplicative constant in the VSCALE node directly (provided that the argument of the MUL is an immediate constant). That makes it very easy to detect in the back-end because it doesn't require to check for MUL/SUB and SHL. Do you think you can emit such DAG in your patch? 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73602/new/

https://reviews.llvm.org/D73602