[PATCH] D98169: [PoC][IR] Permit load/store/alloca for struct with the same scalable vectors.

Hsiangkai Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 20 08:30:52 PDT 2021


HsiangKai added a comment.

In D98169#2622089 <https://reviews.llvm.org/D98169#2622089>, @sdesmalen wrote:

> In D98169#2619874 <https://reviews.llvm.org/D98169#2619874>, @craig.topper wrote:
>
>> We want this to support the segment load/store intrinsics defined here: <https://github.com/riscv/rvv-intrinsic-doc/blob/master/intrinsic_funcs/03_vector_load_store_segment_instructions_zvlsseg.md>. These return 2 to 8 vectors that have been loaded into consecutive registers. I believe SVE has similar instructions. I believe SVE represents these using types wider than their normal scalable vector types and relies on the type legalizer to split them up in the backend. This works for SVE because there is only one known minimum size for all scalable vector types so the type legalizer will always split down to that minimum type.
>
> Thanks for providing the context!
>
>> For RISC-V vectors we already use 7 different sizes of scalable vectors to represent the ability of our instructions to operate on 2, 4, or 8 registers simultaneously, and on 1/2, 1/4, and 1/8 fractional registers. The segment load/store instructions add an extra dimension where they can produce/consume 2, 3, or 4 pairs of registers or 2 quadruples, for example. Following the SVE strategy would give us ambiguous types for the type legalizer.
>
> How does that look in terms of IR? Is the number of registers somehow represented in the (LLVM IR) vector type? Or are the types the same, but the compiler generates different code depending on what mode is set? For SVE we know we can split the vector because <vscale x 8 x i32> is twice the size of <vscale x 4 x i32>, regardless of the value for vscale. Indeed we know SVE vectors are a multiple of 128 bits, and therefore that <vscale x 4 x i32> is legal. In order to make any assumptions about splitting/legalization, the compiler will need to know which types are legal, and so I would expect the compiler to know the mode (2, 4, 8) for RVV when generating the code, and therefore have similar knowledge about which types are legal and how the vectors are represented/split into registers. How does that lead to ambiguous types?
>
>> To solve this we would like to use a struct for the segment load/stores to separate them in IR. Since clang needs an address for every variable and needs to be able to load/store them we need to support load/store/alloca.
>
> These (C/C++-level) intrinsics are probably implemented using target-specific intrinsics or perhaps a common LLVM IR intrinsic like masked.load, which should be able to take/return a struct with scalable members after D94142 <https://reviews.llvm.org/D94142>. If so, it should be possible to handle this in Clang by emitting `extractvalue` instructions and storing each member individually. That would avoid any changes to LLVM IR. Is that something you've considered?

We have defined types containing multiple scalable vectors, and we permit users to declare automatic (local) variables of these types. That is why we need load, store, and alloca support for structs of scalable vectors.
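
As a rough illustration of the difference at the IR level, here is a minimal sketch. It is not taken from the patch: the function names, value names, and the choice of `<vscale x 2 x i32>` as the element type are hypothetical, and it uses the typed-pointer syntax current at the time of writing (newer LLVM releases spell every pointer type `ptr`). The second function is only accepted by a compiler that includes the change proposed here.

```llvm
; Alternative suggested in the review: split the struct with extractvalue
; and store each scalable vector through its own pointer.
define void @store_members({ <vscale x 2 x i32>, <vscale x 2 x i32> } %seg,
                           <vscale x 2 x i32>* %p0,
                           <vscale x 2 x i32>* %p1) {
  %v0 = extractvalue { <vscale x 2 x i32>, <vscale x 2 x i32> } %seg, 0
  %v1 = extractvalue { <vscale x 2 x i32>, <vscale x 2 x i32> } %seg, 1
  store <vscale x 2 x i32> %v0, <vscale x 2 x i32>* %p0
  store <vscale x 2 x i32> %v1, <vscale x 2 x i32>* %p1
  ret void
}

; What this patch aims to permit: the struct itself is the allocated,
; stored, and loaded type, as Clang would emit for a local variable.
define { <vscale x 2 x i32>, <vscale x 2 x i32> } @roundtrip({ <vscale x 2 x i32>, <vscale x 2 x i32> } %seg) {
  %var = alloca { <vscale x 2 x i32>, <vscale x 2 x i32> }
  store { <vscale x 2 x i32>, <vscale x 2 x i32> } %seg, { <vscale x 2 x i32>, <vscale x 2 x i32> }* %var
  %val = load { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> }* %var
  ret { <vscale x 2 x i32>, <vscale x 2 x i32> } %val
}
```

The first form needs no LangRef change but requires Clang to track one address per member; the second lets a variable of the tuple type have a single address like any other local.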

> If we do need to make this work for scalable vectors, I think it needs a message to the mailing list because it's a change to the LangRef and capabilities of scalable vectors, given previous discussions on this topic. I'd like to avoid giving the impression that we're quietly moving the goalpost on what scalable vectors can do in IR.

I have posted an RFC for the proposal on the mailing list:
https://groups.google.com/g/llvm-dev/c/6ZK2eS4-8t0/m/PG6H1NNDBAAJ


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98169/new/

https://reviews.llvm.org/D98169


