[PATCH] D18838: [AArch64][CodeGen] Fix of incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr

Tim Northover via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 7 16:19:05 PDT 2016


t.p.northover added inline comments.

================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:1034
@@ +1033,3 @@
+    return AArch64CC::Invalid;
+
+  return UsedCondCode;
----------------
eastig wrote:
> t.p.northover wrote:
> > I think we should also check (or assert, if also checked before) that NZCV doesn't escape the basic block.
> Do you mean checking whether the flags are live in the successors of the basic block? If so, this is checked in substituteCmpInstr.
Yep, that's what I meant (I noticed it later). I think an assertion is probably still a good idea somewhere in the function (as documentation that anyone reading it shouldn't bother worrying, basically).
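Something along these lines, for example (an untested sketch; the exact placement and names are assumed, not taken from the patch):

  // Documentation-style assertion: NZCV is not live into any successor, so
  // the rest of this function never has to reason about cross-block flag uses.
  const MachineBasicBlock *MBB = CmpInstr->getParent();
  assert(std::none_of(MBB->succ_begin(), MBB->succ_end(),
                      [](const MachineBasicBlock *Succ) {
                        return Succ->isLiveIn(AArch64::NZCV);
                      }) &&
         "NZCV should not be live out of the compare's block");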

================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:1043
@@ +1042,3 @@
+static bool isSUBS(unsigned Opcode) {
+  return AArch64::SUBSWri <= Opcode && Opcode <= AArch64::SUBSXrx64;
+}
----------------
eastig wrote:
> t.p.northover wrote:
> > We definitely shouldn't be relying on specific ordering like this, even if it is alphabetical.
> That's a pity. Maybe such functions already exist somewhere?
Not as far as I'm aware, but I expect an explicit "Opcode == A || Opcode == B || ..." to be just as efficient when compiled. More verbose, but less worrying.
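For example (an untested sketch; the opcode list mirrors the alphabetical range the patch relies on, but should be double-checked for completeness):

  static bool isSUBS(unsigned Opcode) {
    // Explicit enumeration instead of relying on enum value ordering.
    switch (Opcode) {
    case AArch64::SUBSWri:
    case AArch64::SUBSWrr:
    case AArch64::SUBSWrs:
    case AArch64::SUBSWrx:
    case AArch64::SUBSXri:
    case AArch64::SUBSXrr:
    case AArch64::SUBSXrs:
    case AArch64::SUBSXrx:
    case AArch64::SUBSXrx64:
      return true;
    default:
      return false;
    }
  }

A switch like this should compile down to much the same range check anyway.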

================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:1094
@@ -942,3 +1093,3 @@
   const TargetRegisterInfo *TRI = &getRegisterInfo();
   if (areCFlagsAccessedBetweenInstrs(MI, CmpInstr, TRI))
     return false;
----------------
eastig wrote:
> t.p.northover wrote:
> > This completely short-circuits the other walks you're adding, doesn't it? (E.g. findUsedCondCodeAfterCmp will never find more than one use.)
> It checks for accesses (reads and writes) between MI and Cmp. This check was in the original code. I think it was made stricter than necessary to keep the code simpler, because such a situation is rare, if it ever occurs.
Yep, but to me it looks like your code already handles this properly, and would permit the optimization in more cases (albeit rare ones, as you say).
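For instance (a made-up illustration; the register choices are invented):

  subs  w8, w9, w10         // MI: already sets NZCV from w9 - w10
  csinc w11, w12, w13, ne   // only *reads* NZCV between MI and the compare
  cmp   w8, #0              // CmpInstr: redundant for EQ/NE users
  b.ne  .LBB0_2             // NE agrees between "w9 - w10" and "w8 == 0"

The early areCFlagsAccessedBetweenInstrs bail-out rejects this today because of the intervening CSINC, even though nothing writes NZCV between MI and the compare, so removing the CMP would still be sound.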

================
Comment at: test/CodeGen/AArch64/arm64-regress-opt-cmp.ll:1
@@ +1,2 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -O3 -o - %s | FileCheck %s
+
----------------
eastig wrote:
> t.p.northover wrote:
> > I think these tests are rather weak given how much code is changing and the complexity of the result. I'd probably suggest a MIR test for this one; they can be a bit temperamental to get started, but allow you to exercise many more edge cases quickly.
> I fully agree with you. I tried to write a simpler test but could not express it in IR. How can I write one in MIR?
The hardest part last time I did it was making sure the pass was registered properly (hint: make sure initializeXYZPass gets called). Fortunately for you, it looks like this is already done properly for the peephole optimizer, so you'd use "llc -run-pass=peephole-opt /path/to/file.mir" in the RUN line.

Other than that, to get a basic MIR file to test, you could either run "llc -stop-after=some-pass" (useful if you don't know quite how to write some construct) or just copy an existing one and modify it as needed. The format is less obvious and less well documented than LLVM IR, so writing one from scratch is not a good idea (yet, at least).
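For example, a RUN line along these lines should work (a sketch; the pass argument comes from the peephole optimizer's registration):

  # RUN: llc -mtriple=aarch64-linux-gnu -run-pass=peephole-opt -o - %s | FileCheck %s

The MIR body below it can then be produced by running "llc -stop-after=some-pass" on the existing .ll test and trimming the dump down to just the SUBS/CMP block you care about.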


http://reviews.llvm.org/D18838