[PATCH] D66069: [X86] Use PSADBW for v8i8 addition reductions.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 12 07:32:05 PDT 2019


craig.topper added a comment.

That doesn’t seem profitable for v2i8. We’d be better off extracting both elements and doing a scalar add. For v4i8, I’m not sure. PSADBW is 5 cycles on some CPUs, if I remember right, so the normal expansion is probably faster on those CPUs.
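
For readers following along, here is a rough, portable C++ sketch of the two strategies being compared. It is not the code from the patch. The helper names (psadbw_lane_model, reduce_add_v8i8, reduce_add_v2i8) are hypothetical, and the first function only models what one 64-bit lane of PSADBW computes rather than issuing the instruction. The idea is that a sum of absolute differences against zero equals the sum of the bytes, and truncating that 16-bit sum to 8 bits gives the wrapping i8 add reduction.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical portable model of one 64-bit lane of PSADBW: the sum of
// absolute differences of eight unsigned byte pairs, as a 16-bit value.
// The maximum possible result is 8 * 255 = 2040, so 16 bits never overflow.
inline std::uint16_t psadbw_lane_model(const std::array<std::uint8_t, 8>& a,
                                       const std::array<std::uint8_t, 8>& b) {
    std::uint16_t sum = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        const int diff = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        sum = static_cast<std::uint16_t>(sum + diff);
    }
    return sum;
}

// v8i8 add reduction in the PSADBW style: SAD against an all-zero vector
// yields the plain byte sum; truncating to 8 bits gives wrapping i8 addition.
inline std::uint8_t reduce_add_v8i8(const std::array<std::uint8_t, 8>& v) {
    const std::array<std::uint8_t, 8> zero{};
    return static_cast<std::uint8_t>(psadbw_lane_model(v, zero));
}

// v2i8 add reduction as suggested in the comment: extract both elements
// and do a single scalar add (wrapping modulo 256).
inline std::uint8_t reduce_add_v2i8(std::uint8_t e0, std::uint8_t e1) {
    return static_cast<std::uint8_t>(e0 + e1);
}
```

Both paths give the same value for a two-element input padded with zeros, so the choice between them is purely a cost question: one scalar add after two extracts versus a multi-cycle PSADBW.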


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66069/new/

https://reviews.llvm.org/D66069




