[PATCH] D124118: [Peephole-Opt] For one kind of test-after-add pattern, eliminates test if it's correct to do so.

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 21 01:05:00 PDT 2022


mingmingl added inline comments.


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:986
+
+  assert((SUBREG_TO_REG_MI.getOpcode() == X86::SUBREG_TO_REG) &&
+         "Should be guaranteed by caller.");
----------------
craig.topper wrote:
> SUBREG_TO_REG is only used for i32->i64 conversions. So the only TEST it could ever be is TEST64rr.
It looks like when SUBREG_TO_REG is used in SSE, the result type is not 64 bit [1]. So revised this helper function to 
1) only handle the case when CmpInstr is `TEST64rr` and SUBREG_TO_REG is using `sub_32bit` as subregister index
2) returns false early for the rest of cases.

[1] https://github.com/llvm/llvm-project/blob/334522ca58aa6d6703405e479de8f5bf310ddbe2/llvm/lib/Target/X86/X86InstrSSE.td#L289


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:1043
+  if (X86::isAND(vregDefInstr->getOpcode()) ||
+      X86::isOR(vregDefInstr->getOpcode()) ||
+      X86::isXOR(vregDefInstr->getOpcode())) {
----------------
davidxl wrote:
> Is there need to handle OR and XOR here? It seems we won't see code patterns with OR followed by subreg_to_reg.
(After reading this, it occurred to me existing definitions in `td` files in `llvm/lib/Target/X86` are useful to find out how operands of SUBREG_TO-REG are generated).

Revised to handle AND only, considering
1) SUBREG_TO_REG almost always use 0 as immediate number.
2) it's hard to know the SF bit for OR, or XOR given 0 is the immediate number.

p.s. It seems [1] is where pattern (add64 -> and32 + subreg_to_reg) is given a higher priority, and I assume removing this from `td` and looking at the result could verify.

[1] https://github.com/llvm/llvm-project/blob/334522ca58aa6d6703405e479de8f5bf310ddbe2/llvm/lib/Target/X86/X86InstrCompiler.td#L1619




================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:4318
+          assert(DefInstr != nullptr);
+          // {AND, OR, XOR} will update SF and won't update OF.
+          NoSignFlag = false;
----------------
craig.topper wrote:
> AND/OR/XOR set the OF flag to 0.
ah.. thanks for the catch! corrected it.


================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:4319
+          // {AND, OR, XOR} will update SF and won't update OF.
+          NoSignFlag = false;
+          ClearsOverflowFlag = false;
----------------
craig.topper wrote:
> If you look through SUBREG_TO_REG then the width changed. The TEST64rr would have calculated the sign flag from bit 63(which is 0 due to the SUBREG_TO_REG), but the AND32 calculated it from bit 31 which is unknown.
thanks for the catch!

Fix it by adding helper function `isSFZero`, and only optimize away the TEST64rr when SF is known to be zero.


================
Comment at: llvm/test/CodeGen/X86/peephole-test-after-add.mir:29
+  15:                                               ; preds = %5
+    %16 = icmp eq i64 %10, 0
+    %17 = select i1 %16, i64 0, i64 %7
----------------
craig.topper wrote:
> Are the other basic blocks important?
Merge two IR functions into one file and by using the example from David.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124118/new/

https://reviews.llvm.org/D124118



More information about the llvm-commits mailing list