[PATCH] D66069: [X86] Use PSADBW for v8i8 addition reductions.
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 12 07:32:05 PDT 2019
craig.topper added a comment.
That doesn’t seem profitable for v2i8. We’d be better off extracting both elements and doing a scalar add. For v4i8, I’m not sure. PSADBW is 5 cycles on some CPUs, if I remember right, so the normal expansion is probably faster on those CPUs.
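For reference, here is a rough sketch of the two alternatives being compared, written with SSE2 intrinsics rather than as the actual lowering in the patch. The function names are made up for illustration. The i8 reduction wraps modulo 256, so only the low byte of the PSADBW result is kept.

```c
#include <emmintrin.h> /* SSE2: _mm_sad_epu8 maps to PSADBW */
#include <stdint.h>

/* Hypothetical helper: v8i8 add reduction via PSADBW.
 * The sum of absolute differences against zero is the
 * horizontal sum of the unsigned bytes. */
static uint8_t sum_v8i8_psadbw(const uint8_t *p) {
    /* Load 8 bytes into the low half; the upper 64 bits are zeroed. */
    __m128i v = _mm_loadl_epi64((const __m128i *)p);
    /* The low 16 bits of the result hold the sum of bytes 0..7. */
    __m128i s = _mm_sad_epu8(v, _mm_setzero_si128());
    /* Truncate to 8 bits to match i8 wrap-around semantics. */
    return (uint8_t)_mm_cvtsi128_si32(s);
}

/* Hypothetical helper: v2i8 add reduction done as a plain
 * scalar add of the two extracted elements. */
static uint8_t sum_v2i8_scalar(const uint8_t *p) {
    return (uint8_t)(p[0] + p[1]);
}
```

Which form is faster for v4i8 depends on the PSADBW latency of the target CPU, so this only shows what each option computes, not which one wins.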
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D66069/new/
https://reviews.llvm.org/D66069