[llvm] [ISel] Introduce llvm.clmul intrinsic (PR #168731)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 27 13:18:38 PST 2025
davemgreen wrote:
Looking at https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3642r2.html#hardware-support and from what I can tell of other architectures, x86 has a pclmulqdq instruction that performs clmul(zext(a), zext(b)), i64->i128 if I have that correct. There is an immediate operand on the instruction that picks which of the top/bottom halves of each input vector to use, but it doesn't change the behaviour.
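To make the clmul(zext(a), zext(b)) shape concrete, here is a minimal Python sketch of a widening carry-less multiply, plus a toy model of the half-selecting immediate. The names clmul_wide and pclmulqdq_model are hypothetical and only for illustration; they are not LLVM or intrinsic APIs, and the model assumes bit 0 of the immediate selects the half of the first operand and bit 4 the half of the second.

```python
MASK64 = (1 << 64) - 1

def clmul_wide(a: int, b: int, bits: int = 64) -> int:
    """Carry-less multiply of two `bits`-wide values into a 2*bits-wide result.

    Equivalent to clmul(zext(a), zext(b)) evaluated at the doubled width.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = 0
    for i in range(bits):
        if (b >> i) & 1:
            # XOR instead of add: partial products never carry.
            result ^= a << i
    return result

def pclmulqdq_model(x: int, y: int, imm: int) -> int:
    """Toy model (assumption): pick one 64-bit half of each 128-bit input,
    then do the same i64 -> i128 carry-less multiply."""
    a = (x >> 64) & MASK64 if imm & 0x01 else x & MASK64
    b = (y >> 64) & MASK64 if imm & 0x10 else y & MASK64
    return clmul_wide(a, b, 64)
```

The immediate only changes which inputs are fed in; the multiply itself is the same widening operation in every case.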
Arm and AArch64 are the ones I am most familiar with, which have a PMUL that performs v16i8->v16i8 and a PMULL that performs v8i8->v8i16 or i64->i128 multiplies. There are also SVE variants that work the same as far as I can tell, so they too perform clmul(zext, zext).
PowerPC seems to have VPMSUM, which performs multiple clmul(zext, zext) and xors them together, if I am reading it correctly. RISC-V has the two/three operations that perform the bottom and top halves separately. Having a generic DAG combine that looks to turn clmul(zext, zext) into clmulh sounds like it would not be beneficial on many targets, without a check that it is worthwhile. If you plan to add a RISCVISD::CLMUL, could it perform the optimization to produce it directly?
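As a rough illustration of how the split low/high form relates to the widening form, the Python sketch below models both and checks that they agree. The helper names (clmul_full, clmul_lo, clmul_hi, xor_of_products) are hypothetical; the third RISC-V variant is not modelled, and xor_of_products is only a simplified stand-in for the "multiple products xored together" behaviour described above, not an exact model of any instruction.

```python
def clmul_full(a: int, b: int, bits: int = 64) -> int:
    """clmul(zext(a), zext(b)): full 2*bits-wide carry-less product."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    acc = 0
    for i in range(bits):
        if (b >> i) & 1:
            acc ^= a << i
    return acc

def clmul_lo(a: int, b: int, bits: int = 64) -> int:
    """Bottom `bits` bits of the product (the non-widening clmul)."""
    return clmul_full(a, b, bits) & ((1 << bits) - 1)

def clmul_hi(a: int, b: int, bits: int = 64) -> int:
    """Top `bits` bits of the product (a clmulh-style operation)."""
    return clmul_full(a, b, bits) >> bits

def xor_of_products(pairs, bits: int = 64) -> int:
    """Simplified stand-in: XOR together several widening products."""
    acc = 0
    for a, b in pairs:
        acc ^= clmul_full(a, b, bits)
    return acc
```

On a target with only the split operations, the widening result is (clmul_hi(a, b) << bits) | clmul_lo(a, b), which costs two instructions. That is why rewriting clmul(zext, zext) into the high-half form is only a win where the target actually has such an operation.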
https://github.com/llvm/llvm-project/pull/168731