[PATCH] D142602: [X86] Expand transform (icmp eq/ne (ABS A), C) -> (and/or (icmp eq/ne A, C), (icmp eq/ne A, -C))
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Feb 11 19:59:05 PST 2023
pengfei added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:53465
+ !Subtarget.hasAVX512()) {
+ // If ABS(vNxi64) requires avx512 even for xmm/ymm wereas SETCC/ALU
+ // are available with (sse2/sse4.1)/avx2. If ABS it not available,
----------------
whereas
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:53465
+ !Subtarget.hasAVX512()) {
+ // If ABS(vNxi64) requires avx512 even for xmm/ymm wereas SETCC/ALU
+ // are available with (sse2/sse4.1)/avx2. If ABS it not available,
----------------
pengfei wrote:
> whereas
The comment is not clear to me. Could you reword it?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:53466
+ // If ABS(vNxi64) requires avx512 even for xmm/ymm wereas SETCC/ALU
+ // are available with (sse2/sse4.1)/avx2. If ABS it not available,
+ // check if SETCC/ALU are, and if so, fold.
----------------
is
================
Comment at: llvm/test/CodeGen/X86/icmp-abs-C-vec.ll:104-107
+; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [18446744073709551487,18446744073709551487,18446744073709551487,18446744073709551487]
+; AVX2-NEXT: vpcmpeqq %ymm2, %ymm0, %ymm2
; AVX2-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
+; AVX2-NEXT: vpor %ymm2, %ymm0, %ymm0
----------------
I doubt this is beneficial. The transform neither reduces the instruction count nor improves throughput, and it introduces an extra memory load. WDYT?
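For reference, a minimal sketch of the scalar equivalence the fold relies on. It is not code from the patch: the function names are hypothetical, and it assumes C is a non-negative constant and that ABS wraps on INT64_MIN. The broadcast constant 18446744073709551487 in the AVX2 output above is the unsigned i64 encoding of -129, i.e. the -C operand for C = 129.

```cpp
#include <cstdint>

// Hypothetical helper: wrapping absolute value of one 64-bit lane
// (INT64_MIN maps to itself). Done in unsigned arithmetic so the
// negation cannot trigger signed-overflow UB.
constexpr std::uint64_t wrappingAbs(std::uint64_t a) {
  return (a >> 63) != 0 ? std::uint64_t{0} - a : a;
}

// Original form: icmp eq (abs A), C
constexpr bool absEq(std::uint64_t a, std::uint64_t c) {
  return wrappingAbs(a) == c;
}

// Folded form: or (icmp eq A, C), (icmp eq A, -C)
// Assumption: C is non-negative as a signed value.
constexpr bool foldedEq(std::uint64_t a, std::uint64_t c) {
  return a == c || a == std::uint64_t{0} - c;
}

// The constant from the test output is -129 as an unsigned 64-bit value.
static_assert(18446744073709551487ULL == std::uint64_t{0} - 129,
              "broadcast constant should be -129");
// Both forms agree for A = C, A = -C, and a non-matching A.
static_assert(absEq(129, 129) && foldedEq(129, 129), "A == C");
static_assert(absEq(18446744073709551487ULL, 129) &&
                  foldedEq(18446744073709551487ULL, 129),
              "A == -C");
static_assert(!absEq(128, 129) && !foldedEq(128, 129), "A != +/-C");
```

The folded form needs two compares and an OR, plus a second constant for -C. That matches the two vpcmpeqq, the vpor, and the extra vpbroadcastq in the AVX2 lines quoted above.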
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142602/new/
https://reviews.llvm.org/D142602