[PATCH] D56474: [ARM] [NEON] Add ROTR/ROTL lowering
easyaspi314 (Devin) via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 9 19:53:39 PST 2019
easyaspi314 added a comment.
It's definitely either the result cycles or the writeback cycles.
                                   @ Cy / Re / Wr
vld1.8   { inpLo, inpHi }, [p]     @  2 /  2 /  6
vmla.i32 v, inp, prime2            @  4 /  9 /  9
vshr.u32 v2, v, #19                @  1 /  3 /  6
vsli.32  v2, v, #13                @  2 /  4 /  7
vmul.i32 v, v2, prime1             @  4 /  9 /  9
                                   @ 13 / 27 / 37

vld1.8   { inpLo, inpHi }, [p]     @  2 /  2 /  6
vmla.i32 v, inp, prime2            @  4 /  9 /  9
vshr.u32 tmp, v, #19               @  1 /  3 /  6
vshl.i32 v, v, #13                 @  1 /  3 /  6
vorr     v, v, tmp                 @  1 /  3 /  6
vmul.i32 v, v, prime1              @  4 /  9 /  9
                                   @ 13 / 29 / 42
If we count result cycles, we get 29 cycles with `vshr`/`vshl`/`vorr` and 27 cycles with `vshr`/`vsli`: 29/27 = 1.074. If we count writeback cycles, we get 42/37 = 1.135. That is consistent with the 1.10x ratio I saw in the benchmark, which lands right in that range.
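For anyone who wants to recheck the arithmetic, here is a small Python sketch that re-adds the three columns from the two listings and computes both ratios. The names `VSLI_SEQ`, `VORR_SEQ`, and `totals` are made up for this illustration. Summing each column simply mirrors the tally in this comment; it is not a model of how the NEON pipeline actually overlaps instructions.

```python
# Per-instruction (mnemonic, Cy, Re, Wr) values copied from the two listings.
# Assumption: adding up each column is only a rough proxy, as in the comment;
# it does not model real pipeline overlap or dual issue.
VSLI_SEQ = [
    ("vld1.8", 2, 2, 6),
    ("vmla.i32", 4, 9, 9),
    ("vshr.u32", 1, 3, 6),
    ("vsli.32", 2, 4, 7),
    ("vmul.i32", 4, 9, 9),
]

VORR_SEQ = [
    ("vld1.8", 2, 2, 6),
    ("vmla.i32", 4, 9, 9),
    ("vshr.u32", 1, 3, 6),
    ("vshl.i32", 1, 3, 6),
    ("vorr", 1, 3, 6),
    ("vmul.i32", 4, 9, 9),
]


def totals(seq):
    """Return the column sums (Cy, Re, Wr) for one instruction sequence."""
    return tuple(sum(row[i] for row in seq) for i in (1, 2, 3))


vsli_totals = totals(VSLI_SEQ)
vorr_totals = totals(VORR_SEQ)

# Ratio of the vshr/vshl/vorr sequence to the vshr/vsli sequence.
result_ratio = vorr_totals[1] / vsli_totals[1]
writeback_ratio = vorr_totals[2] / vsli_totals[2]

print("vshr/vsli      Cy/Re/Wr:", vsli_totals)
print("vshr/vshl/vorr Cy/Re/Wr:", vorr_totals)
print("result ratio:    %.3f" % result_ratio)
print("writeback ratio: %.3f" % writeback_ratio)
```

The measured 1.10x sits between the two computed ratios, which is all the argument above relies on.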
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56474/new/
https://reviews.llvm.org/D56474