[all-commits] [llvm/llvm-project] a29066: [AArch64][SVE] Fix bad PTEST(X, X) optimization
Cullen Rhodes via All-commits
all-commits at lists.llvm.org
Tue Nov 15 03:59:52 PST 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: a290668ec5478e26d765a01d254b10d13c2a1dbd
https://github.com/llvm/llvm-project/commit/a290668ec5478e26d765a01d254b10d13c2a1dbd
Author: Cullen Rhodes <cullen.rhodes at arm.com>
Date: 2022-11-15 (Tue, 15 Nov 2022)
Changed paths:
M llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
M llvm/test/CodeGen/AArch64/sve-ptest-removal-cmple.ll
M llvm/test/CodeGen/AArch64/sve-ptest-removal-whilegt.mir
Log Message:
-----------
[AArch64][SVE] Fix bad PTEST(X, X) optimization
AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate-generating operation that (implicitly) sets the same flags.
When the PTEST mask is the same as its input predicate, the PTEST is
currently removed. This is incorrect: the mask for the implicit PTEST
performed by the flag-setting instruction differs from the mask
specified to the explicit PTEST, so the two could set different flags.
For example, consider
  PG=<1, 1, x, x>
  Z0=<1, 2, x, x>
  Z1=<2, 1, x, x>
  X=CMPLE(PG, Z0, Z1)
   =<0, 1, x, x>  NZCV=0xxx
  PTEST(X, X), NZCV=1xxx
where the first active flag (bit 'N' in NZCV) is set by the explicit
PTEST, but not by the implicit PTEST as part of the compare. However,
given that the PTEST mask and source are the same, first active is
equivalent to any active, so the PTEST could be removed if the condition
is changed. The same applies to last active. It is safe to remove the
PTEST for any active, but this information isn't available in the
current optimization.
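The reasoning above can be checked with a small model. What follows is
an illustrative sketch in Python, not code from the patch: ptest() and
the two checker functions are hypothetical helpers, predicates are
plain lists of 0/1 lanes, and the unknown 'x' lanes of the example are
simply left out. It assumes the usual PTEST flag definitions (N = first
active lane is true, Z = no active lane is true, C = last active lane is
not true) and that the compare zeroes lanes that are inactive under PG.

```python
from itertools import product


def ptest(mask, src):
    """Return the N, Z and C flags a PTEST would set (V is always 0).

    N: the first active lane of src is true
    Z: no active lane of src is true
    C: the last active lane of src is NOT true
    """
    active = [bool(s) for m, s in zip(mask, src) if m]
    first = active[0] if active else False
    last = active[-1] if active else False
    return {"N": int(first), "Z": int(not any(active)), "C": int(not last)}


# Known lanes of the example above (the 'x' lanes are left out).
PG = [1, 1]
X = [0, 1]  # result of the compare, as given in the example

implicit = ptest(PG, X)  # flags set by the compare itself
explicit = ptest(X, X)   # flags set by PTEST(X, X)
print("implicit:", implicit)
print("explicit:", explicit)


def first_and_last_match_any(lanes=4):
    """For PTEST(X, X), check first active == any active == last active."""
    for x in product([0, 1], repeat=lanes):
        f = ptest(x, x)
        any_active = 1 - f["Z"]
        if f["N"] != any_active or (1 - f["C"]) != any_active:
            return False
    return True


def any_active_is_mask_independent(lanes=4):
    """Check Z agrees for PTEST(PG, X) and PTEST(X, X) when X is within PG."""
    for pg in product([0, 1], repeat=lanes):
        for x in product([0, 1], repeat=lanes):
            if any(xi and not pi for xi, pi in zip(x, pg)):
                continue  # a zeroing compare never sets a lane outside PG
            if ptest(pg, x)["Z"] != ptest(x, x)["Z"]:
                return False
    return True


print(first_and_last_match_any(), any_active_is_mask_independent())
```

On the two known lanes, the implicit and explicit tests differ only in
N, which matches the example. The two exhaustive checks cover every
4-lane predicate in this model; they illustrate why rewriting the
condition to any active would make the removal safe, but they say
nothing about how the later patch implements it.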
This patch fixes the bad optimization; a later patch will implement the
optimization proposed above and fix the any active case.
Reviewed By: bsmith
Differential Revision: https://reviews.llvm.org/D137717