[llvm-dev] Complex proposal v3 + roundtable agenda
Simon Moll via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 25 01:44:26 PST 2020
Here is a completely immature idea that may at least serve as food for
thought:
Why force complex intrinsics to commit to any specific data layout at all?
We could just say that a complex number is represented as/backed by a
vector and leave the choice of which vector element holds which complex
number component entirely to the targets.
You'd then need conversion intrinsics and/or converting complex
load/store operations that translate between a specific layout and the
internal representation.
For example:
%v = call <16 x double> @llvm.complex.load.v16f64(double* %ptr,
"<datalayout specifier>")
; loads complex numbers with a specific data layout into the
; target-specific representation
%r = call <16 x double> @llvm.complex.fmul.v16f64(<16 x double> %v,
<16 x double> %other_complex)
; complex multiply - the target may assume that the inputs have the
; target's complex number layout
%s = fadd <16 x double> %r, %other_vector
; we don't know which element is which, but we do know that each vector
; element is a complex number component, so elementwise addition still works
call void @llvm.complex.store.v16f64(<16 x double> %s, double* %ptr,
"<dl specifier again>")
Normal load/store would even allow targets to decide how complex
numbers are represented in memory:
store <16 x double> %complex_vector, <16 x double>* %ptr
%complex_vector = load <16 x double>, <16 x double>* %ptr
Further, you'd want intrinsics to extract the real and imaginary parts to
allow mixing of lowered complex code with complex intrinsics, like so:
%real_part = call <8 x double>
@llvm.complex.extract_real.v16f64(<16 x double> %complex_number)
%imag_part = call <8 x double>
@llvm.complex.extract_imag.v16f64(<16 x double> %complex_number)
%complex_number_with_target_representation = call <16 x double>
@llvm.complex.get.v16f64(<8 x double> %real_part, <8 x double> %imag_part)
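To make the intended semantics concrete, here is a minimal Python model of
these hypothetical intrinsics. Everything here is an assumption for
illustration only: the function names mirror the proposed intrinsics, and
the internal "split" layout (all reals, then all imaginaries) stands in for
whatever representation a target might actually pick.

```python
# Model of the proposed layout-agnostic complex intrinsics.
# "interleaved" memory layout: [re0, im0, re1, im1, ...]
# hypothetical target-internal "split" layout: [re0, re1, ..., im0, im1, ...]

def complex_load(mem, dl):
    """Model of llvm.complex.load: convert from the in-memory layout `dl`
    into the (hypothetical) target-internal split layout."""
    if dl == "interleaved":
        return mem[0::2] + mem[1::2]  # deinterleave: reals first, then imags
    return list(mem)                  # already split: no conversion needed

def complex_store(vec, dl):
    """Model of llvm.complex.store: convert back to the requested layout."""
    n = len(vec) // 2
    if dl == "interleaved":
        out = []
        for re, im in zip(vec[:n], vec[n:]):
            out += [re, im]           # re-interleave component pairs
        return out
    return list(vec)

def complex_fmul(a, b):
    """Model of llvm.complex.fmul, operating on the internal split layout:
    (ar + ai*i) * (br + bi*i) = (ar*br - ai*bi) + (ar*bi + ai*br)*i"""
    n = len(a) // 2
    pairs = list(zip(a[:n], a[n:], b[:n], b[n:]))
    re = [ar * br - ai * bi for ar, ai, br, bi in pairs]
    im = [ar * bi + ai * br for ar, ai, br, bi in pairs]
    return re + im

# Two complex numbers (1+2i) and (3+4i), interleaved in "memory".
mem = [1.0, 2.0, 3.0, 4.0]
x = complex_load(mem, "interleaved")   # internal split form: [1, 3, 2, 4]
r = complex_fmul(x, x)                 # (1+2i)^2 = -3+4i, (3+4i)^2 = -7+24i
back = complex_store(r, "interleaved") # [-3, 4, -7, 24]
```

The point the sketch makes is the one in the proposal: only the converting
load/store care about the named layout; the arithmetic intrinsics see an
opaque target-chosen representation.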
- Simon
On 11/24/20 10:16 PM, Florian Hahn via llvm-dev wrote:
>
>> On Nov 19, 2020, at 18:11, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> On Wed, Nov 18, 2020 at 4:47 PM Krzysztof Parzyszek via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> Complex type would pose another issue for vectorization: in general it's better to have a vector of the real parts and a vector of the imaginary parts for the purpose of arithmetic, than having vectors of complex elements (i.e. real and imaginary parts interleaved).
>> Is that universally true? I think it depends on the target. Let's take
>> Florian's FCMLA example. The inputs and output are interleaved. And if
>> you need just the reals/imags from an interleaved vector for something
>> else, LD2/ST2 should be pretty fast on recent chips.
>>
>> On the other hand, if we had a non-interleaved complex representation
>> and wanted to use FCMLA, we'd need some number of zips and unzips to
>> interleave and deinterleave between the load and store. Those probably
>> aren't cheap in aggregate.
>>
>> I haven't studied this across all targets, but my intuition says we
>> should leave the representation decision up to the targets. Maybe we
>> should have a larger discussion about it.
>
> I think that’s a key point. The set of instructions for complex math out there is probably much more varied than in other parts of vector extensions.
>
> This is another area that would benefit from making progress early and iterating on the details with people interested in different targets IMO. For example, we could have intrinsics for both layouts and decide which one to pick depending on the target. Converting between layouts should be easily doable in IR as well, although at some cost.
>
> Cheers,
> Florian
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev