[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
Yin Ma
yinma at codeaurora.org
Thu Mar 14 14:43:38 PDT 2013
Hi Hal,
We are also facing the post increment problem. If a USE in a branch in a loop body,
post increment mode may not be applicable. So to estimate the real number of AddRec,
the current LSR need a little more Information to determine the mode.
Yin
-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Thursday, March 14, 2013 2:34 PM
To: Yin Ma
Cc: llvmdev at cs.uiuc.edu; Andrew Trick
Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
----- Original Message -----
> From: "Yin Ma" <yinma at codeaurora.org>
> To: "Andrew Trick" <atrick at apple.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Thursday, March 14, 2013 4:21:50 PM
> Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
>
>
>
>
>
> Hi Andy,
>
>
>
> Actually, if we just add hooks that preserves the existing behavior,
>
> It is not difficult. For example,
>
>
>
> For case one, we can define one function like
>
> virtual const SCEV* getTargetPreferredWinnerReg(const SCEV*&
> ScaledReg,
>
> SmallVector<const SCEV *, 4>& BaseRegs, GlobalValue*& BaseGV) const;
>
>
>
> In NarrowSearchSpaceByPickingWinnerRegs, we can preserves the winner
>
> reg from target and winner reg from the original algorithm if this
> function
>
> returns NULL, it is just like before
>
>
>
> For case two, we can define a general cost from TTI function, like
>
> virtual int getLSRFormulaCost(const unsigned NumRegs,
>
> const unsigned AddRecCost, const unsigned NumIVMuls,
>
> const unsigned NumBaseAdds, const unsigned ImmCost,
>
> const unsigned SetupCost) const;
>
> Then we do something like
>
> int thisCost = TTI->getLSRFormulaCost(NumRegs, AddRecCost, NumIVMuls,
>
> NumBaseAdds, ImmCost, SetupCost);
>
> if (thisCost >= 0) {
>
> int otherCost = TTI->getLSRFormulaCost(Other.NumRegs,
> Other.AddRecCost,
>
> Other.NumIVMuls, Other.NumBaseAdds,
>
> Other.ImmCost, Other.SetupCost);
>
> if (otherCost >= 0)
>
> return thisCost < otherCost;
>
> }
>
> In bool Cost::operator<(const Cost &Other) const
>
>
>
> We could have more decision from target backend.
>
>
>
> However, from the problem I am dealing with, which has a lot of
> branches in multiple level
>
> Loop nests. LSR is still not able to perform the best because
>
> 1. LSR is not control flow sensitive. It treats all USE equally, which
> doesn’t care which
>
> USE is on critical path and which USE is on a branch. Without these
> kind of information,
>
> We cannot predict AddRec precisely because we only can assume all USEs
> can be post
>
> Increment or all not.
>
> 2. The most occurred winner regs pruning may not be the best approach.
> Because target
>
> may prefer certain regs than others, even some registers do occur
> more. Specially,
>
> register with small computation is more likely to occur in formulas.
> However, register
>
> with small computation may not always be a best choice if the content
> in register are
>
> loop invariant.
>
>
>
> Therefore, We may need a systemic agreement or plan to address the
> existing LSR problems. I
>
> would like to ask if any party has any improvement plan about LSR? So
> we can come together
>
> to have an unified solution to handle all known problem in one round?
I am also planning to improve LSR to make it aware of pre-increment addressing. An underlying issue (as explained by Andy in the thread on "Pre-increment preparation pass" on the commits list) is that LSR will not create pointer-valued PHIs when they don't already exist. I'd be happy to work with you on this.
-Hal
>
>
>
> Thanks,
>
>
>
> Yin
>
>
>
>
>
>
>
> From: Andrew Trick [mailto:atrick at apple.com]
> Sent: Thursday, March 14, 2013 9:42 AM
> To: Yin Ma
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent
> Decision in LSR Please
>
>
>
>
>
>
>
> On Mar 13, 2013, at 4:37 PM, Yin Ma < yinma at codeaurora.org > wrote:
>
>
>
>
>
>
>
> Hi All,
>
>
>
>
>
> In the target I am working, we comes cross a situation that the loop
> strength reduction
>
>
> could deliver a better result but currently not, because
>
>
> 1. the algorithm narrows search space by winner registers without
> considering
>
>
> the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs)
>
>
> 2. Cost comparison solely favors the number register without
> considering other
>
>
> Impacts.
>
>
>
>
>
> For the case one,
>
>
> NarrowSearchSpaceByPickingWinnerRegs filters by most occurred
> registers.
>
>
> ld(basereg, immediate) is a target preferred addressing mode.
> However, it may
>
>
> be deleted because basereg is very likely not to be the most occurred
> register
>
>
> because the less opportunity in a combination.
>
>
>
>
>
> For the case two, by observing the cost comparison equation
>
>
> bool Cost::operator<(const Cost &Other) const {
>
>
> if (NumRegs != Other.NumRegs) return NumRegs < Other.NumRegs;
>
>
> if (AddRecCost != Other.AddRecCost) return AddRecCost <
> Other.AddRecCost;
>
>
> if (NumIVMuls != Other.NumIVMuls) return NumIVMuls < Other.NumIVMuls;
>
>
> if (NumBaseAdds != Other.NumBaseAdds) return NumBaseAdds <
> Other.NumBaseAdds;
>
>
> if (ImmCost != Other.ImmCost) return ImmCost < Other.ImmCost;
>
>
> if (SetupCost != Other.SetupCost) return SetupCost < Other.SetupCost;
>
>
> return false;
>
>
> }
>
>
>
>
>
> If we have a case to compare
>
>
> Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm
> cost, plus 4 setup cost.
>
>
> Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm
> cost, plus 2 setup cost.
>
>
> The current mode will select 4 regs case even there are 14 more base
> adds. And base
>
>
> Adds matters in our targets.
>
>
>
>
>
> So I think the current LSR should be pushing more decision into target
> dependent backend.
>
>
> Like calling new functions in TargetTransformInfo. At least, in narrow
> search space and cost
>
>
> comparison phase, or more in cost rating phase. LSR can be tightly
> cooped with the target
>
>
> attributes in order to get the most beneficial result.
>
>
>
>
> Yes. LSR decisions are tightly coupled with the target architecture
> and potentially the subtarget microarcthitecture. As you figured out,
> the way to handle it is to communicate more information to LSR through
> TTI. It's easy to do this to solve individual benchmarks on your
> target. It's hard to know if you have a general solution that works
> across targets. But if you can add hooks in a way that preserves
> existing behavior on other targets it shouldn't be a problem. We want
> to design general hooks, but leave it up to someone doing the
> benchmarking to tune them for a particular target.
>
>
>
>
>
> -Andy
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
More information about the llvm-dev
mailing list