[PATCH] D20456: [CodeGen] Fix problem with X86 byte registers in CriticalAntiDepBreaker
Mitch Bodart via llvm-commits
llvm-commits at lists.llvm.org
Tue May 24 15:36:34 PDT 2016
mbodart added a comment.
I wasn't familiar with .mir files, so thanks for pointing them out!
Unfortunately creating one directly from the original pr27681.ll file yields
a .mir file which incurs various parse and semantic errors when run through
llc, so it may take some effort to get a clean .mir file. Perhaps someone
with more experience than I can whip one out quickly?
Wrt respecting sub/super regs, there are several areas in this optimization
that look suspect to me. But that doesn't necessarily mean they are incorrect.
Maybe they are, maybe they aren't but it requires further study.
As just one example, right off the bat in StartBlock, %CL is found to be
live out. So %CL, and rightfully its superregs %CX, %ECX and %RCX are
also set to a live state, but %CH is not. It certainly appears odd that
%CH's superregs are live while %CH is not, but it's technically accurate.
And it illustrates that properties of a superreg do not necessarily apply
to its subregs.
> For a proper fix we should have findSuitableFreeRegister() report false for %ch.
Wholeheartedly agree and that's what I'm trying to achieve.
The isNewRegClobberedByRefs suggestion is unfortunately insufficient.
The refs in this case are the references to the antidependent register %BL,
which does not include the instruction that defines %ECX.
I'm not familiar with LivePhysReg, so I don't know how difficult it would
be to apply here (well, both because of that unfamiliarity, and because I've
found it very difficult to associate a definitive meaning to the liveness
data structures maintained here). Fundamentally I think the shortcuts being
taken for tied operands, and registers already present in the KeepRegs set,
are a key piece of the problem. So even if LivePhysReg were employed,
those shortcuts could still lead to issues.
We need to recognize that the %ECX def clobbers %CH. Normally this appears
to be achieved in ScanInstruction by updating DefIndices[%CH] and Classes[%CH].
But that processing is skipped because of %ECX's presence in KeepRegs,
which in turn was caused by %CL having a tied instance.
I'll spend a little more time looking at those shortcuts to see if
a straightforward solution comes to mind, using DefIndices and Classes.
But if a major revamp comes into play, I likely won't have the time
to tackle it.
More information about the llvm-commits