[PATCH] D45673: [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straight forward to do so.

Thu Apr 19 01:29:02 PDT 2018

chandlerc added inline comments.

================
Comment at: llvm/trunk/lib/Target/X86/X86FlagsCopyLowering.cpp:445-457
+      if (&TestMBB != &UseMBB && !MDT->dominates(&TestMBB, &UseMBB)) {
+        DEBUG({
+          dbgs() << "ERROR: Encountered use that is not dominated by our test "
+                    "basic block! Rewriting this would require inserting PHI "
+                    "nodes to track the flag state across the CFG.\n\nTest "
+                    "block:\n";
+          TestMBB.dump();
----------------
mzolotukhin wrote:
> chandlerc wrote:
> > mzolotukhin wrote:
> > > Could all of this be under `DEBUG`? The fatal error we're reporting here looks like an internal compiler assert anyway.
> > > If we put in entirely under `DEBUG`, we can avoid building dominators in Release compilers.
> > The reason we've kept a number of these as fatal errors is due to our concern about latent problems in LLVM. We're essentially introducing a pretty novel set of constraints on what kinds of EFLAGS copies can be lowered, and trying to get somewhat more reliable failure modes if this bites us...
> > 
> > And at least one of these already helped uncover a bug (that this patch is fixing)...
> > 
> > Is the domtree analysis cost showing up significantly in compile times for you? Are there other concerns?
> I see a value of having it on at least for some time, but I think in the long run we should turn it into an assert. Would it be possible?
> 
> I just noticed a new pass in the pipeline and wanted to check if it can be avoided. On O0 dom-tree analysis costs ~0.25% geomean (on CTMark), so each separate invocation doesn't seem to cost much, but they add up eventually.
> 
> By the way, why didn't we catch the bug you are talking about with RA compiler bots? Maybe that's an argument for adding more RA bots?
Sorry, I misunderstood. Long term, we can almost certainly move all of this to an assert-only kind of check. We just are benefiting (for now) from more aggressive checking.

But you ask a great question: why don't the release+asserts deployments catch these?

The frustrating thing is that some of the patterns here are *really* rare, or really tricky to reproduce. One of the test cases we got only reproduces when you build the input with debug info enabled! We still don't know why. The other test case only reproduces on 32-bit x86, where we (sadly) have much less testing.

It's also somewhat rare for benchmarks to hit this because the old lowering was *really* slow. If it were happening frequently in benchmarks, it might have been fixed sooner or the benchmarks would have been changed to not do that.... =/

Repository:
  rL LLVM

https://reviews.llvm.org/D45673