[llvm-dev] [RFC] carry-less multiplication instruction
Shawn Landden via llvm-dev
llvm-dev at lists.llvm.org
Mon Jul 6 04:42:44 PDT 2020
05.07.2020, 13:44, "Craig Topper" <craig.topper at gmail.com>:
> Shawn,
>
> Are you able to summarize the different instructions from the various targets? It looks like there are different implementation choices made for each target. For example, X86 takes two v2i64 inputs and picks either an even or odd element from each to multiply to produce a v1i128 result. It looks like RISC-V has instructions to produce either the high half of the result or the low half of the result. Those are the only two I checked.
>
> Will a common intrinsic need custom handling for each target or is there a common version that multiple targets use that we should choose for the intrinsic?
>
Only the Power8 instructions are different: they do two 64x64=>128 multiplications at the same time, with the results XORed together (Karatsuba-style), and if you don't want that you have to make sure one of them is a multiplication by zero. So Power would require special lowering for a 128x128=>256 multiply. With Power you get 3 multiplies, but you have to zero some registers, while on RISC-V you would have 4 right after each other, and then have to XOR the middle two.
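
To make the comparison concrete, here is a rough Python model of the two lowerings for a 128x128=>256 carry-less multiply. It is only a sketch of my reading of the description above, not target code: the function names (clmul, clmul64, paired_clmul_xor and so on) are made up for illustration, and paired_clmul_xor is an assumed model of the Power8-style operation (two 64x64 products XORed together), not a definition taken from the ISA manual.

```python
MASK64 = (1 << 64) - 1


def clmul(a, b):
    """Reference carry-less multiply: shift-and-XOR instead of shift-and-add."""
    r = 0
    i = 0
    while b >> i:
        if (b >> i) & 1:
            r ^= a << i
        i += 1
    return r


def clmul64(a, b):
    """64x64=>128 carry-less multiply (what a single-product target gives you)."""
    assert 0 <= a <= MASK64 and 0 <= b <= MASK64
    return clmul(a, b)


def clmul128_four_products(a, b):
    """Schoolbook lowering: 4 multiplies, then XOR the middle two."""
    a0, a1 = a & MASK64, a >> 64
    b0, b1 = b & MASK64, b >> 64
    lo = clmul64(a0, b0)
    mid = clmul64(a0, b1) ^ clmul64(a1, b0)  # the "middle two"
    hi = clmul64(a1, b1)
    return lo ^ (mid << 64) ^ (hi << 128)


def paired_clmul_xor(a, b):
    """Assumed model of the paired operation: multiply the low halves and the
    high halves of two 128-bit inputs and XOR the two 128-bit products."""
    return clmul64(a & MASK64, b & MASK64) ^ clmul64(a >> 64, b >> 64)


def clmul128_paired(a, b):
    """Lowering with the paired operation: 3 multiplies, some halves zeroed."""
    a_lo_only = a & MASK64                # high half zeroed
    a_hi_only = (a >> 64) << 64           # low half zeroed
    b_swapped = ((b & MASK64) << 64) | (b >> 64)
    lo = paired_clmul_xor(a_lo_only, b)   # a0*b0 ^ 0*b1
    hi = paired_clmul_xor(a_hi_only, b)   # 0*b0 ^ a1*b1
    mid = paired_clmul_xor(a, b_swapped)  # a0*b1 ^ a1*b0 in one operation
    return lo ^ (mid << 64) ^ (hi << 128)


a = 0x0123456789ABCDEFFEDCBA9876543210
b = 0xDEADBEEFCAFEBABE0123456789ABCDEF
assert clmul128_four_products(a, b) == clmul128_paired(a, b) == clmul(a, b)
```

In this model the paired form saves one multiply because the two cross products come out of a single operation already XORed, at the cost of preparing zeroed and swapped operands.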