[llvm-dev] RFC: Complex in LLVM

Tue Jul 2 10:04:38 PDT 2019

-----Original Message-----
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of David Greene via llvm-dev
Sent: Monday, July 1, 2019 1:56 PM
To: llvm-dev at lists.llvm.org
Subject: [EXT] [llvm-dev] RFC: Complex in LLVM

> [...]
>
> Vectorization results in many shufflevector operations to massage the data into sequences suitable for vector arithmetic.
>
> [...]

This is the important part, and there is nothing in this RFC that helps alleviate it.

Vectorization must know the data layout: whether we have interleaved vectors (r1, i1, r2, i2, ...) or split vectors (r1, r2, ...), (i1, i2, ...).  These two approaches are not compatible.  If you have vector registers that can hold 8 floats, with the first approach you can load 4 complex numbers in a single instruction, then multiply by another 4 numbers, and store.  With the second approach, the minimum unit of work is 8 numbers, and each input to the multiplication has to be loaded in two instructions, loading real and imaginary parts from two separate locations.  On most architectures the second approach would be vastly superior, since the multiplication then works lane by lane without shuffles, but the v4c32 type mentioned in the RFC suggests the first one.
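
To make the difference concrete, here is a rough sketch in Python.  Plain lists stand in for 8-lane float vector registers, and the function names mul_interleaved and mul_split are made up for illustration; none of this comes from the RFC.  The strided slices in the first function play the role of the shuffles, while the second function is purely lane-wise.

```python
# Illustrative only: Python lists stand in for 8-lane float vector registers.
# The names below are hypothetical and do not appear in the RFC.

def mul_interleaved(x, y):
    """Multiply complex vectors stored as (r1, i1, r2, i2, ...)."""
    # Real and imaginary lanes have to be separated before the arithmetic;
    # these strided slices correspond to the shuffles a vectorizer would emit.
    xr, xi = x[0::2], x[1::2]
    yr, yi = y[0::2], y[1::2]
    out = [0.0] * len(x)
    # Real parts: ac - bd, written back to the even lanes.
    out[0::2] = [a * c - b * d for a, b, c, d in zip(xr, xi, yr, yi)]
    # Imaginary parts: ad + bc, written back to the odd lanes.
    out[1::2] = [a * d + b * c for a, b, c, d in zip(xr, xi, yr, yi)]
    return out


def mul_split(xr, xi, yr, yi):
    """Multiply complex vectors stored as (r1, r2, ...), (i1, i2, ...)."""
    # Every lane does the same work on matching lanes of the inputs,
    # so no rearrangement of the data is needed.
    re = [a * c - b * d for a, b, c, d in zip(xr, xi, yr, yi)]
    im = [a * d + b * c for a, b, c, d in zip(xr, xi, yr, yi)]
    return re, im
```

With 8-lane registers, mul_interleaved handles 4 complex numbers per input register, whereas mul_split needs two registers per input and handles 8.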

In addition to that, we shouldn't limit complex types to floating point only.  What we care about is keeping the "ac-bd" of a complex multiplication together, not what type a, b, c, d are.
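
As a small illustration of that point (again just a sketch, with a hypothetical helper name complex_mul), the product (a+bi)(c+di) = (ac-bd) + (ad+bc)i only needs multiply, add and subtract on the element type, so the same pattern applies unchanged to integers or exact rationals:

```python
from fractions import Fraction


def complex_mul(a, b, c, d):
    """Return (a + bi) * (c + di) as a (real, imaginary) pair."""
    # Only *, + and - are used, so any element type supporting them works.
    return a * c - b * d, a * d + b * c


# Integer components: (1 + 2i) * (3 + 4i)
int_product = complex_mul(1, 2, 3, 4)

# Exact rational components: (1/2 + 0i) * (2 + 4i)
frac_product = complex_mul(Fraction(1, 2), Fraction(0), Fraction(2), Fraction(4))
```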

-Krzysztof