[PATCH] D70862: [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A

Tim Northover via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 2 01:28:34 PST 2019


t.p.northover added a comment.

Why are you only implementing the rot90 and rot270 intrinsics? By my quick calculations, rot0 and rot90 are the natural ones for implementing a bog-standard complex multiplication, but even if I slipped up there, I'd expect the others to be useful in some situations.
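For reference, here is a rough scalar sketch of the arithmetic behind that calculation, assuming the usual reading of the Armv8.3-A rotations (each complex value stored as a real/imaginary pair). The `cplx` type and all function names are hypothetical and exist only for illustration; they are not part of the patch or of arm_neon.h. The multiply-accumulate helpers model the separate vcmla family, which is not in the quoted diff.

```c
/* Scalar model of one complex lane. All names are illustrative only. */
typedef struct { float re, im; } cplx;

/* Assumed vcadd_rot90 behaviour: a + i*b. */
static cplx cadd_rot90(cplx a, cplx b) {
    cplx r = { a.re - b.im, a.im + b.re };
    return r;
}

/* Assumed vcadd_rot270 behaviour: a - i*b. */
static cplx cadd_rot270(cplx a, cplx b) {
    cplx r = { a.re + b.im, a.im - b.re };
    return r;
}

/* Assumed multiply-accumulate, rot0: acc += a.re * b. */
static cplx cmla_rot0(cplx acc, cplx a, cplx b) {
    acc.re += a.re * b.re;
    acc.im += a.re * b.im;
    return acc;
}

/* Assumed multiply-accumulate, rot90: acc += i * a.im * b. */
static cplx cmla_rot90(cplx acc, cplx a, cplx b) {
    acc.re -= a.im * b.im;
    acc.im += a.im * b.re;
    return acc;
}

/* Full complex product a*b: rot0 followed by rot90 on a zero accumulator. */
static cplx cmul(cplx a, cplx b) {
    cplx zero = { 0.0f, 0.0f };
    return cmla_rot90(cmla_rot0(zero, a, b), a, b);
}
```

Under this model the rot0/rot90 pairing gives the ordinary product (a.re*b.re - a.im*b.im, a.re*b.im + a.im*b.re), while the two addition rotations are simply a + i*b and a - i*b.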



================
Comment at: clang/include/clang/Basic/arm_neon.td:1687
+  def VCADD_ROT270  : SInst<"vcadd_rot270", "...", "f">;
+  def VCADDQ_ROT90  : SInst<"vcaddq_rot90", "QQQ", "f">;
+  def VCADDQ_ROT270 : SInst<"vcaddq_rot270", "QQQ", "f">;
----------------
I take it you can't fuse this with vcadd_rot90 because NeonEmitter tries to call it vcadd_rot90q? If so, I think your solution is reasonable; the rotations are a tiny edge case in the ISA.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70862/new/

https://reviews.llvm.org/D70862