[PATCH] D70862: [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A
Tim Northover via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Dec 2 01:28:34 PST 2019
t.p.northover added a comment.
Why are you only implementing rot90 and rot270 intrinsics? My quick calculations made rot0 and rot90 the natural ones to implement a bog-standard complex multiplication, but even if I slipped up there I'd expect the others to be useful in some situations.
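For reference, the rot0 + rot90 reasoning above can be sketched as a scalar model in plain C. This is only an illustration, assuming the usual description of the Armv8.3-A FCMLA rotations (rot0 accumulates a.re * b, rot90 accumulates i * a.im * b). The type `cfloat` and the helpers `cmla_rot0`, `cmla_rot90` and `cmul` are hypothetical names for this sketch, not intrinsics from the patch.

```c
/* Hypothetical helper type: one complex number stored as a (re, im) pair. */
typedef struct { float re, im; } cfloat;

/* Scalar model of a complex multiply-accumulate with rotation 0:
   acc += a.re * b */
static cfloat cmla_rot0(cfloat acc, cfloat a, cfloat b) {
    acc.re += a.re * b.re;
    acc.im += a.re * b.im;
    return acc;
}

/* Scalar model of a complex multiply-accumulate with rotation 90:
   acc += i * a.im * b */
static cfloat cmla_rot90(cfloat acc, cfloat a, cfloat b) {
    acc.re -= a.im * b.im;
    acc.im += a.im * b.re;
    return acc;
}

/* Full complex product a * b built from the two rotations:
   (a.re + i*a.im) * b = a.re * b + i * a.im * b */
static cfloat cmul(cfloat a, cfloat b) {
    cfloat acc = {0.0f, 0.0f};
    acc = cmla_rot0(acc, a, b);
    acc = cmla_rot90(acc, a, b);
    return acc;
}
```

Under these assumptions, chaining the rot0 and rot90 steps on a zeroed accumulator yields the ordinary complex product, which is why those two rotations look like the natural pair for multiplication.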
================
Comment at: clang/include/clang/Basic/arm_neon.td:1687
+ def VCADD_ROT270 : SInst<"vcadd_rot270", "...", "f">;
+ def VCADDQ_ROT90 : SInst<"vcaddq_rot90", "QQQ", "f">;
+ def VCADDQ_ROT270 : SInst<"vcaddq_rot270", "QQQ", "f">;
----------------
I take it you can't fuse this with vcadd_rot90 because NeonEmitter tries to call it vcadd_rot90q? If so, I think your solution is reasonable; the rotations are a tiny edge case in the ISA.
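As a rough illustration of what the rotated additions in this patch compute, here is a scalar sketch in plain C. It assumes the usual description of the Armv8.3-A FCADD rotations: rot90 adds i * b to a, and rot270 subtracts i * b from a. The type `cpair` and the functions `cadd_rot90` and `cadd_rot270` are hypothetical stand-ins for one complex lane of the real intrinsics.

```c
/* Hypothetical helper type: one complex lane as a (re, im) pair. */
typedef struct { float re, im; } cpair;

/* Scalar model of complex add with the second operand rotated by 90 degrees:
   result = a + i * b */
static cpair cadd_rot90(cpair a, cpair b) {
    cpair r;
    r.re = a.re - b.im;
    r.im = a.im + b.re;
    return r;
}

/* Scalar model of complex add with the second operand rotated by 270 degrees:
   result = a - i * b */
static cpair cadd_rot270(cpair a, cpair b) {
    cpair r;
    r.re = a.re + b.im;
    r.im = a.im - b.re;
    return r;
}
```

The 64-bit and 128-bit intrinsic variants would apply the same per-lane operation to two or more complex pairs at once; only the vector width differs.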
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70862/new/
https://reviews.llvm.org/D70862