[llvm-dev] [CodeGen] PeepholeOptimizer: optimizing condition dependent instructions

Evgeny Astigeevich via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 10 02:42:31 PST 2016


Hi Quentin,

Yes, for physical registers the SSA rules do not hold, so getUniqueVRegDef cannot be used.

Thank you for clarifying things.
I now have a better understanding of how it works.

I need to split the code among these functions and cover it with tests. After that I will submit it for review.

Thanks,
Evgeny

From: Quentin Colombet [mailto:qcolombet at apple.com]
Sent: 10 March 2016 00:55
To: Evgeny Astigeevich
Cc: llvm-dev at lists.llvm.org; nd
Subject: Re: [llvm-dev] [CodeGen] PeepholeOptimizer: optimizing condition dependent instructions

Hi Evgeny,

On Mar 9, 2016, at 4:18 PM, Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> wrote:

Hi Quentin,

Yes, the code allows connected instructions to be processed. However, the instruction that follows the one currently being processed must never be erased, because that would invalidate the iterator.

Indeed.
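
The iterator constraint can be illustrated with a small standalone sketch. It is not the PeepholeOptimizer code: std::list stands in for the instruction list, and the names Inst and eraseDead are hypothetical. It only shows why erasing the current element is safe when the iterator has already been advanced, while erasing the following element would not be.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a machine instruction.
struct Inst {
  std::string Opcode;
  bool Dead; // set when an optimization decided the instruction is removable
};

// Erases every instruction marked Dead and returns how many were erased.
// Next is computed *before* the current element is erased, so erasing the
// current element leaves Next valid. Erasing the element Next points to
// would leave Next dangling, which is the case that must be avoided.
int eraseDead(std::list<Inst> &Block) {
  int Erased = 0;
  for (auto It = Block.begin(), End = Block.end(); It != End;) {
    auto Next = std::next(It);
    if (It->Dead) {
      Block.erase(It); // only It is invalidated; Next is still usable
      ++Erased;
    }
    It = Next;
  }
  return Erased;
}
```

With a block holding CMP (dead), SEL (live) and BRC (dead), the sketch erases two elements and leaves SEL in place.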

I've been fixing a bug in AArch64InstrInfo::optimizeCompareInstr: instructions are converted into their S (flag-setting) form without checking that they produce the same flags as CMP. The bug exists upstream as well.

Could you file a PR or just push the patch? :)



Together with the fix I want to add some peephole rules for the CMP+BRC and CMP+SEL combinations. In the context of optimizeCmpInstr I have all the information about CmpInstr. I simply walk down and check whether each instruction that uses AArch64::NZCV can be replaced with a simpler version. Finally, I delete CmpInstr. This approach contradicts the PeepholeOptimizer design, because BRC and SEL must be processed in their corresponding functions.

OK, I get your concern: basically you want to handle CMP+BRC and CMP+SEL inside optimizeCmpInstr instead of in optimizeSelect and optimizeCondBranch, so that you don't do the analysis twice.

Historically, the peephole optimizer processes patterns bottom-up (use to def). The rationale is that a value has only one def but may have several uses. In other words, it is easy to replace a use once you have proved that doing so is correct. What you want is top-down (def to use), and in that case you need extra checks (for the potential other uses) to prove that the def can be optimized.

Bottom line: I believe it is not done this way because it is not peephole-ish in terms of complexity.



Yes, analyzeCompare is cheap, but in optimizeCondBranch and in optimizeSelect we need to walk up to find the instruction that defines the condition flags.

Going up is generally cheap: we just ask for the unique definition of the vreg. I believe in your case it is not cheap because you are tracking a physical reg and not a vreg.
Is that the problem?
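
To make the cost difference concrete, here is a toy sketch of what "going up" means for a physical register such as NZCV. For a vreg a single lookup of the unique def is enough; for a physical register the block has to be scanned backward. The types and the function findFlagDef are hypothetical and do not correspond to LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction.
struct MInst {
  std::string Opcode;
  bool WritesFlags; // true if the instruction defines the condition flags
};

// Returns the index of the closest instruction before UseIdx that writes
// the flags, or -1 if there is none in this block. The cost is linear in
// the distance between the use and the def.
int findFlagDef(const std::vector<MInst> &Block, std::size_t UseIdx) {
  for (std::size_t I = UseIdx; I > 0; --I)
    if (Block[I - 1].WritesFlags)
      return static_cast<int>(I - 1);
  return -1;
}
```

For a block SUBS, ADD, MOV, CSEL the scan from CSEL visits MOV and ADD before it reaches SUBS at index 0.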


In the case of BRC, the CMP should not be far from it, but I am not sure about SEL. Also, when BRC is replaced with BR, the CMP can be removed (and, by the way, processing of the instructions below BRC can be stopped). I don't know whether there are any restrictions on the instructions below BRC.

You should have only terminators at the end of the BB.
You may have another branch though.



Anyway, I don't expect many of them. In the case of CMP+SEL we cannot remove the CMP after simplifying SEL, because there can be other SEL instructions using the flags from the CMP.

This is what I meant when I said that defs need more checks. That's why optimizeSelect seems a good fit for that.
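
The extra check needed on the def side can be sketched with a toy model. It assumes, as a simplification, that the flags are not live out of the block; the types and function names are hypothetical and are not part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction that may touch NZCV.
struct FlagInst {
  std::string Opcode;
  bool ReadsFlags;
  bool WritesFlags;
};

// Counts the instructions that read the flags produced by Block[DefIdx]:
// every flag reader after the def, up to and including the next flag writer.
std::size_t countFlagUsers(const std::vector<FlagInst> &Block,
                           std::size_t DefIdx) {
  std::size_t Users = 0;
  for (std::size_t I = DefIdx + 1; I < Block.size(); ++I) {
    if (Block[I].ReadsFlags)
      ++Users;
    if (Block[I].WritesFlags)
      break; // flags are redefined; later readers belong to another def
  }
  return Users;
}

// Once a single user has been rewritten so that it no longer needs the
// flags, the compare is removable only if that user was the only one.
// Assumption: the flags are not live out of the block.
bool canEraseCompareAfterRewritingOneUser(const std::vector<FlagInst> &Block,
                                          std::size_t DefIdx) {
  return countFlagUsers(Block, DefIdx) == 1;
}
```

With CMP followed by two CSELs, rewriting one CSEL is not enough to drop the CMP; with CMP followed by a single conditional branch, it is.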



> I have to admit I don't see the concern with the instruction being condition dependent; we don't want to call optimizeCondBranch :).
> I believe I missed your point.

I missed your point too :) I think it's always good to get rid of CondBranch.

I was talking about the code in the peephole optimizer :).
Like:
if isCondBranch then optimizeCondBranch
We don't want an unconditional call to optimizeCondBranch, i.e., optimizeCondBranch expects a conditional branch as its argument.

Cheers,
-Quentin


We have cases like:

SUBS Wd, Wn, 0
B.LO

Since SUBS with a zero operand always sets C to 1, B.LO (which is taken only when C is 0) will always fall through. So we can substitute them with an unconditional branch.
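
The flag reasoning can be checked with a short standalone sketch. It models only the C flag of a 32-bit SUBS, using the fact that AArch64 computes a subtraction as Wn + NOT(imm) + 1 and that LO is the condition "C is clear". The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// C flag after a 32-bit "SUBS Wd, Wn, #Imm": the carry-out of
// Wn + ~Imm + 1, i.e. C is set when no borrow occurs.
bool carryAfterSubs(std::uint32_t Wn, std::uint32_t Imm) {
  std::uint64_t Wide =
      static_cast<std::uint64_t>(Wn) + static_cast<std::uint32_t>(~Imm) + 1u;
  return (Wide >> 32) != 0;
}

// B.LO is taken when C is clear (unsigned lower).
bool loTaken(std::uint32_t Wn, std::uint32_t Imm) {
  return !carryAfterSubs(Wn, Imm);
}
```

For Imm equal to 0 the sum is Wn + 2^32, so the carry-out is set for every Wn and loTaken never holds, whereas for example loTaken(1, 2) holds.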

Thanks,
Evgeny

From: Quentin Colombet [mailto:qcolombet at apple.com]
Sent: 09 March 2016 18:04
To: Evgeny Astigeevich
Cc: llvm-dev at lists.llvm.org; nd
Subject: Re: [llvm-dev] [CodeGen] PeepholeOptimizer: optimizing condition dependent instructions

Hi Evgeny,

On Mar 9, 2016, at 6:28 AM, Evgeny Astigeevich via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi,

I find it quite strange how condition-dependent instructions are processed in PeepholeOptimizer::runOnMachineFunction:

      if ((isUncoalescableCopy(*MI) &&
           optimizeUncoalescableCopy(MI, LocalMIs)) ||
          (MI->isCompare() && optimizeCmpInstr(MI, &MBB)) ||
          (MI->isSelect() && optimizeSelect(MI, LocalMIs))) {
        // MI is deleted.
        LocalMIs.erase(MI);
        Changed = true;
        continue;
      }

      if (MI->isConditionalBranch() && optimizeCondBranch(MI)) {
        Changed = true;
        continue;
      }

CmpInstr, SelectInstr and CondBranch are processed separately. It is assumed that CmpInstr and SelectInstr are deleted but CondBranch is not.
In fact, a CmpInstr is always connected to a SelectInstr, a CondBranch, or both. So if such a connection exists, it should be processed as a whole.

This code allows you to do that, unless I am mistaken.



For example, there are cases when CMP+BRC can be replaced by BR. The same is true for CMP+SEL.

I believe this should be done in optimizeCondBranch and optimizeSelect, respectively.




The main problem I have is that I have to find the corresponding CmpInstr and repeat its analysis in both optimizeSelect and optimizeCondBranch.

OK, so basically your concern is that we may call analyzeCompare twice. Is that it?
This function is probably cheap, so I wouldn't be too concerned about that. If I turn out to be wrong, then yes, we can think of a better mechanism.




Any thoughts on why it is implemented this way?

The idea of the peephole optimizer is a top-down approach with greedily applied optimizations.

I have to admit I don't see the concern with the instruction being condition dependent; we don't want to call optimizeCondBranch :).
I believe I missed your point.

Cheers,
-Quentin




Kind regards,
Evgeny Astigeevich
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
