[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please

Thu Mar 14 14:43:38 PDT 2013

Hi Hal,

We are also facing the post increment problem. If a USE in a branch in a loop body, 
post increment mode may not be applicable. So to estimate the real number of AddRec, 
the current LSR need a little more Information to determine the mode.

Yin

-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov] 
Sent: Thursday, March 14, 2013 2:34 PM
To: Yin Ma
Cc: llvmdev at cs.uiuc.edu; Andrew Trick
Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please

----- Original Message -----
> From: "Yin Ma" <yinma at codeaurora.org>
> To: "Andrew Trick" <atrick at apple.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Thursday, March 14, 2013 4:21:50 PM
> Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in	LSR Please
> 
> 
> 
> 
> 
> Hi Andy,
> 
> 
> 
> Actually, if we just add hooks that preserves the existing behavior,
> 
> It is not difficult. For example,
> 
> 
> 
> For case one, we can define one function like
> 
> virtual const SCEV* getTargetPreferredWinnerReg(const SCEV*& 
> ScaledReg,
> 
> SmallVector<const SCEV *, 4>& BaseRegs, GlobalValue*& BaseGV) const;
> 
> 
> 
> In NarrowSearchSpaceByPickingWinnerRegs, we can preserves the winner
> 
> reg from target and winner reg from the original algorithm if this 
> function
> 
> returns NULL, it is just like before
> 
> 
> 
> For case two, we can define a general cost from TTI function, like
> 
> virtual int getLSRFormulaCost(const unsigned NumRegs,
> 
> const unsigned AddRecCost, const unsigned NumIVMuls,
> 
> const unsigned NumBaseAdds, const unsigned ImmCost,
> 
> const unsigned SetupCost) const;
> 
> Then we do something like
> 
> int thisCost = TTI->getLSRFormulaCost(NumRegs, AddRecCost, NumIVMuls,
> 
> NumBaseAdds, ImmCost, SetupCost);
> 
> if (thisCost >= 0) {
> 
> int otherCost = TTI->getLSRFormulaCost(Other.NumRegs,
> Other.AddRecCost,
> 
> Other.NumIVMuls, Other.NumBaseAdds,
> 
> Other.ImmCost, Other.SetupCost);
> 
> if (otherCost >= 0)
> 
> return thisCost < otherCost;
> 
> }
> 
> In bool Cost::operator<(const Cost &Other) const
> 
> 
> 
> We could have more decision from target backend.
> 
> 
> 
> However, from the problem I am dealing with, which has a lot of 
> branches in multiple level
> 
> Loop nests. LSR is still not able to perform the best because
> 
> 1. LSR is not control flow sensitive. It treats all USE equally, which 
> doesn’t care which
> 
> USE is on critical path and which USE is on a branch. Without these 
> kind of information,
> 
> We cannot predict AddRec precisely because we only can assume all USEs 
> can be post
> 
> Increment or all not.
> 
> 2. The most occurred winner regs pruning may not be the best approach. 
> Because target
> 
> may prefer certain regs than others, even some registers do occur 
> more. Specially,
> 
> register with small computation is more likely to occur in formulas.
> However, register
> 
> with small computation may not always be a best choice if the content 
> in register are
> 
> loop invariant.
> 
> 
> 
> Therefore, We may need a systemic agreement or plan to address the 
> existing LSR problems. I
> 
> would like to ask if any party has any improvement plan about LSR? So 
> we can come together
> 
> to have an unified solution to handle all known problem in one round?

I am also planning to improve LSR to make it aware of pre-increment addressing. An underlying issue (as explained by Andy in the thread on "Pre-increment preparation pass" on the commits list) is that LSR will not create pointer-valued PHIs when they don't already exist. I'd be happy to work with you on this.

 -Hal

> 
> 
> 
> Thanks,
> 
> 
> 
> Yin
> 
> 
> 
> 
> 
> 
> 
> From: Andrew Trick [mailto:atrick at apple.com]
> Sent: Thursday, March 14, 2013 9:42 AM
> To: Yin Ma
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent 
> Decision in LSR Please
> 
> 
> 
> 
> 
> 
> 
> On Mar 13, 2013, at 4:37 PM, Yin Ma < yinma at codeaurora.org > wrote:
> 
> 
> 
> 
> 
> 
> 
> Hi All,
> 
> 
> 
> 
> 
> In the target I am working, we comes cross a situation that the loop 
> strength reduction
> 
> 
> could deliver a better result but currently not, because
> 
> 
> 1. the algorithm narrows search space by winner registers without 
> considering
> 
> 
> the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs)
> 
> 
> 2. Cost comparison solely favors the number register without 
> considering other
> 
> 
> Impacts.
> 
> 
> 
> 
> 
> For the case one,
> 
> 
> NarrowSearchSpaceByPickingWinnerRegs filters by most occurred 
> registers.
> 
> 
> ld(basereg, immediate) is a target preferred addressing mode.
> However, it may
> 
> 
> be deleted because basereg is very likely not to be the most occurred 
> register
> 
> 
> because the less opportunity in a combination.
> 
> 
> 
> 
> 
> For the case two, by observing the cost comparison equation
> 
> 
> bool Cost::operator<(const Cost &Other) const {
> 
> 
> if (NumRegs != Other.NumRegs) return NumRegs < Other.NumRegs;
> 
> 
> if (AddRecCost != Other.AddRecCost) return AddRecCost < 
> Other.AddRecCost;
> 
> 
> if (NumIVMuls != Other.NumIVMuls) return NumIVMuls < Other.NumIVMuls;
> 
> 
> if (NumBaseAdds != Other.NumBaseAdds) return NumBaseAdds < 
> Other.NumBaseAdds;
> 
> 
> if (ImmCost != Other.ImmCost) return ImmCost < Other.ImmCost;
> 
> 
> if (SetupCost != Other.SetupCost) return SetupCost < Other.SetupCost;
> 
> 
> return false;
> 
> 
> }
> 
> 
> 
> 
> 
> If we have a case to compare
> 
> 
> Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm 
> cost, plus 4 setup cost.
> 
> 
> Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm 
> cost, plus 2 setup cost.
> 
> 
> The current mode will select 4 regs case even there are 14 more base 
> adds. And base
> 
> 
> Adds matters in our targets.
> 
> 
> 
> 
> 
> So I think the current LSR should be pushing more decision into target 
> dependent backend.
> 
> 
> Like calling new functions in TargetTransformInfo. At least, in narrow 
> search space and cost
> 
> 
> comparison phase, or more in cost rating phase. LSR can be tightly 
> cooped with the target
> 
> 
> attributes in order to get the most beneficial result.
> 
> 
> 
> 
> Yes. LSR decisions are tightly coupled with the target architecture 
> and potentially the subtarget microarcthitecture. As you figured out, 
> the way to handle it is to communicate more information to LSR through 
> TTI. It's easy to do this to solve individual benchmarks on your 
> target. It's hard to know if you have a general solution that works 
> across targets. But if you can add hooks in a way that preserves 
> existing behavior on other targets it shouldn't be a problem. We want 
> to design general hooks, but leave it up to someone doing the 
> benchmarking to tune them for a particular target.
> 
> 
> 
> 
> 
> -Andy
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>