[llvm] [X86] Fold `(icmp ult (add x,-C),2)` -> `(or (icmp eq X,C), (icmp eq X,C+1))` for Vectors (PR #84104)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 6 10:27:59 PST 2024
================
@@ -53408,6 +53409,64 @@ static SDValue combineSetCC(SDNode *N, SelectionDAG &DAG,
truncateAVX512SetCCNoBWI(VT, OpVT, LHS, RHS, CC, DL, DAG, Subtarget))
return R;
+ // In the middle end transforms:
+ // `(or (icmp eq X, C), (icmp eq X, C+1))`
+ // -> `(icmp ult (add x, -C), 2)`
+ // Likewise inverted cases with `ugt`.
+ //
+ // Since x86, pre avx512, doesn't have unsigned vector compares, this results
+ // in worse codegen. So, undo the middle-end transform and go back to `(or
+ // (icmp eq), (icmp eq))` form.
+ //
+ // NB: We don't handle the similar simplification of `(and (icmp ne), (icmp
+ // ne))` as it doesn't end up instruction positive.
+ // TODO: We might want to do this for avx512 as well if we `sext` the result.
+ if (VT.isVector() && OpVT.isVector() && OpVT.isInteger() &&
+ ISD::isUnsignedIntSetCC(CC) && LHS.getOpcode() == ISD::ADD &&
+ !Subtarget.hasAVX512() && LHS.hasOneUse()) {
+
+ APInt CmpC;
+ SDValue AddC = LHS.getOperand(1);
+ if (ISD::isConstantSplatVector(RHS.getNode(), CmpC) &&
+ DAG.isConstantIntBuildVectorOrConstantInt(AddC)) {
+ // See which form we have depending on the constant/condition.
+ SDValue C0 = SDValue();
+ SDValue C1 = SDValue();
+
+ // If we had `(add x, -1)` and can lower with `umin`, don't transform as
+ // we will end up generating an additional constant. Keeping in the
+ // current form has a slight latency cost, but it is probably worth saving a
+ // constant.
+ if (ISD::isConstantSplatVectorAllOnes(AddC.getNode()) &&
+ DAG.getTargetLoweringInfo().isOperationLegal(ISD::UMIN, OpVT)) {
----------------
goldsteinn wrote:
Yeah, it seems this isn't beneficial with `AVX1`: we end up with more `vinsertf128`. I'll just drop the transform for 256-bit + AVX.
https://github.com/llvm/llvm-project/pull/84104
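
For readers following the thread, the equivalence behind the fold can be checked exhaustively on a single 8-bit lane. The sketch below is not part of the patch; it is a standalone scalar model, and the names `rangeForm`, `eqForm`, `countMismatches`, and `assertFoldIsSound` are hypothetical. It assumes wrapping (modulo 256) arithmetic, which is how a vector `add` behaves per lane.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `(icmp ult (add x, -C), 2)` on one 8-bit lane.
// The subtraction wraps modulo 256, like a per-lane vector add of -C.
bool rangeForm(uint8_t X, uint8_t C) {
  return static_cast<uint8_t>(X - C) < 2;
}

// Hypothetical model of `(or (icmp eq X, C), (icmp eq X, C+1))`.
// C+1 also wraps, so C == 255 pairs with 0.
bool eqForm(uint8_t X, uint8_t C) {
  return X == C || X == static_cast<uint8_t>(C + 1);
}

// Count the (X, C) pairs on which the two forms disagree.
int countMismatches() {
  int Mismatches = 0;
  for (int C = 0; C < 256; ++C)
    for (int X = 0; X < 256; ++X)
      if (rangeForm(static_cast<uint8_t>(X), static_cast<uint8_t>(C)) !=
          eqForm(static_cast<uint8_t>(X), static_cast<uint8_t>(C)))
        ++Mismatches;
  return Mismatches;
}

// Convenience wrapper: aborts (in builds with assertions enabled) if the
// two forms ever differ.
void assertFoldIsSound() { assert(countMismatches() == 0); }
```

This only shows that the two forms compute the same predicate. Whether the `or`-of-`eq` form is cheaper depends on the subtarget and vector width, which is the point of the comment above about 256-bit vectors on `AVX1`.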