[PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem]
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 17 01:34:56 PDT 2020
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/X86/vector-idiv-udiv-256.ll:577
+; AVX512BW: # %bb.0:
+; AVX512BW-NEXT: vpmovzxwd {{.*#+}} zmm0 = ymm0[0],zero,ymm0[1],zero,ymm0[2],zero,ymm0[3],zero,ymm0[4],zero,ymm0[5],zero,ymm0[6],zero,ymm0[7],zero,ymm0[8],zero,ymm0[9],zero,ymm0[10],zero,ymm0[11],zero,ymm0[12],zero,ymm0[13],zero,ymm0[14],zero,ymm0[15],zero
+; AVX512BW-NEXT: vpmulld {{.*}}(%rip){1to16}, %zmm0, %zmm0
----------------
xbolva00 wrote:
> Not sure if this AVX512 code is an improvement.
>
> @craig.topper ?
vpmulld/vpmuludq (vXi32 mul ops) are notably slower than vpmullw/vpmulhuw (vXi16), so we need to avoid widening to vXi32 multiplies here.
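For context, a minimal scalar sketch of the LKK (Lemire, Kaser, Kurz) remainder and divisibility checks for a single i16 lane, assuming a 32-bit fraction width. The helper names are hypothetical and this is not the patch's lowering code; it only illustrates that the magic constant and the low-bits product are twice as wide as the input, which is presumably why a vXi16 urem ends up zero-extended (vpmovzxwd) and multiplied with vXi32 ops.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; a scalar model of one i16 lane, not LLVM code.

// Magic constant c = ceil(2^32 / d) for a 16-bit divisor d.
// For d == 1 the value wraps to 0, which still gives the right answers below.
uint32_t lkk_magic(uint16_t d) {
  assert(d != 0 && "divisor must be non-zero");
  return UINT32_MAX / d + 1;
}

// n % d via LKK: the low 32 bits of c * n, then the high half of a
// 32x32->64 multiply by d.
uint16_t urem16_lkk(uint16_t n, uint32_t c, uint16_t d) {
  const uint32_t lowbits = c * n;  // wraps modulo 2^32 by design
  return static_cast<uint16_t>((static_cast<uint64_t>(lowbits) * d) >> 32);
}

// n % d == 0 via LKK: one 32-bit multiply and one unsigned compare.
bool is_divisible16_lkk(uint16_t n, uint32_t c) {
  return c * n <= c - 1;
}

// Exhaustively compare both helpers against the plain % operator
// for every 16-bit dividend.
bool lkk_agrees_with_urem(uint16_t d) {
  const uint32_t c = lkk_magic(d);
  for (uint32_t i = 0; i <= UINT16_MAX; ++i) {
    const uint16_t n = static_cast<uint16_t>(i);
    if (urem16_lkk(n, c, d) != n % d) return false;
    if (is_divisible16_lkk(n, c) != (n % d == 0)) return false;
  }
  return true;
}
```

In this model every lane needs a 32-bit low multiply (and, for the full remainder, a widening 32x32 multiply), i.e. the vpmulld/vpmuludq class of ops rather than vpmullw/vpmulhuw.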
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68360/new/
https://reviews.llvm.org/D68360