[PATCH] D124118: [Peephole-Opt] For one kind of test-after-add pattern, eliminates test if it's correct to do so.
Mingming Liu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 21 01:05:00 PDT 2022
mingmingl added inline comments.
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:986
+
+ assert((SUBREG_TO_REG_MI.getOpcode() == X86::SUBREG_TO_REG) &&
+ "Should be guaranteed by caller.");
----------------
craig.topper wrote:
> SUBREG_TO_REG is only used for i32->i64 conversions. So the only TEST it could ever be is TEST64rr.
It looks like when SUBREG_TO_REG is used in SSE, the result type is not 64 bit [1]. So revised this helper function to
1) only handle the case when CmpInstr is `TEST64rr` and SUBREG_TO_REG is using `sub_32bit` as subregister index
2) returns false early for the rest of cases.
[1] https://github.com/llvm/llvm-project/blob/334522ca58aa6d6703405e479de8f5bf310ddbe2/llvm/lib/Target/X86/X86InstrSSE.td#L289
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:1043
+ if (X86::isAND(vregDefInstr->getOpcode()) ||
+ X86::isOR(vregDefInstr->getOpcode()) ||
+ X86::isXOR(vregDefInstr->getOpcode())) {
----------------
davidxl wrote:
> Is there need to handle OR and XOR here? It seems we won't see code patterns with OR followed by subreg_to_reg.
(After reading this, it occurred to me existing definitions in `td` files in `llvm/lib/Target/X86` are useful to find out how operands of SUBREG_TO-REG are generated).
Revised to handle AND only, considering
1) SUBREG_TO_REG almost always use 0 as immediate number.
2) it's hard to know the SF bit for OR, or XOR given 0 is the immediate number.
p.s. It seems [1] is where pattern (add64 -> and32 + subreg_to_reg) is given a higher priority, and I assume removing this from `td` and looking at the result could verify.
[1] https://github.com/llvm/llvm-project/blob/334522ca58aa6d6703405e479de8f5bf310ddbe2/llvm/lib/Target/X86/X86InstrCompiler.td#L1619
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:4318
+ assert(DefInstr != nullptr);
+ // {AND, OR, XOR} will update SF and won't update OF.
+ NoSignFlag = false;
----------------
craig.topper wrote:
> AND/OR/XOR set the OF flag to 0.
ah.. thanks for the catch! corrected it.
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:4319
+ // {AND, OR, XOR} will update SF and won't update OF.
+ NoSignFlag = false;
+ ClearsOverflowFlag = false;
----------------
craig.topper wrote:
> If you look through SUBREG_TO_REG then the width changed. The TEST64rr would have calculated the sign flag from bit 63(which is 0 due to the SUBREG_TO_REG), but the AND32 calculated it from bit 31 which is unknown.
thanks for the catch!
Fix it by adding helper function `isSFZero`, and only optimize away the TEST64rr when SF is known to be zero.
================
Comment at: llvm/test/CodeGen/X86/peephole-test-after-add.mir:29
+ 15: ; preds = %5
+ %16 = icmp eq i64 %10, 0
+ %17 = select i1 %16, i64 0, i64 %7
----------------
craig.topper wrote:
> Are the other basic blocks important?
Merge two IR functions into one file and by using the example from David.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124118/new/
https://reviews.llvm.org/D124118
More information about the llvm-commits
mailing list