[LLVMdev] Separating loop nests based on profile information?
Xinliang David Li
xinliangli at gmail.com
Mon Jan 19 21:19:43 PST 2015
On Mon, Jan 19, 2015 at 5:55 PM, Philip Reames <listmail at philipreames.com>
wrote:
>
> On 01/12/2015 08:28 PM, Xinliang David Li wrote:
>
>
>
> On Thu, Jan 8, 2015 at 12:33 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>
>>
>> Let's start with a toy example:
>> while(c) {
>> x = this->x;
>> y = this->y;
>> if (x == y) {
>> rare();
>> }
>> }
>>
>>
> With profile info, speculative PRE can be performed for memory access
> that are not downsafe:
>
> Could you define the term "downsafe"? I think I know what you mean, but
> just to be clear.
>
see the SSAPRE paper:
http://www.cse.unsw.edu.au/~cs4133/Papers/ssapre.toplas97.pdf
>
> temp_c = c;
> if (temp_c)
> {
> x = this->x;
> y = this->y;
> }
> while (temp_c) {
> if (x == y)
> {
> rare();
> x = this->x;
> y = this->y;
> }
> temp_c = c;
> }
>
> If you can prove this->x etc are safe control speculative (e.g, seen a
> dominating memory access with the same base and offset), it can be
>
> x = this->x;
> y = this->y;
> while (c) {
> if (x == y) {
> rare();
> x = this->x;
> y = this->y;
> }
> }
>
> Yep. In LLVM, this basically requires that this->FIELD be known
> deferenceable. I filled one simple bug for this here:
> http://llvm.org/bugs/show_bug.cgi?id=22266
>
What you proposed in the bug is basically profitability heuristics for
speculative PRE.
>
>
> I've also looked a more general rewrite of our PRE algorithm when a
> pointer is known dereferenceable, but I haven't figured out to judge
> profitability accurately (yet). The general approach was to just take the
> cut provided by MDA, apply "obvious" improvements (i.e. local merge
> points), the insert loads in all the unavailable blocks if it looked
> profitable.
>
See also paper "*Register promotion by sparse partial redundancy
elimination of loads and stores" for a simple way of evaluating
speculation savings.*
>
> The alternate approach did have the appeal of being "easy" in cases where
> our current approach (work backwards from load) is "hard".
>
> One challenge is in making sure the two algorithms together generate a
> stable final form. :)
>
> That last part is where I stalled. I'm trying to see what else I can do
> with simpler things before returning to that approach.
>
> Nick Lewicky also pointed out that PHITranslation is problematic in the
> current code. Oddly, I've never seen that in practice. We clearly have
> different workloads, but I haven't figured out which are the important
> different properties yet.
>
>
>
>
>
>
>> If we'd split this into a loop nest structure, we'd still have the chain,
>> but a) that's easier to control with a custom pass order since LICM and
>> LoopUnswitch are far cheaper than GVN/PRE and b) having LICM do a bit of
>> trivial loop unswitch for the terminator of the header appears quite
>> tractable.
>>
>
>
> if (c)
> {
> x = this->x;
> if (!x) return;
> y = this->y;
> if (!y) return;
> }
> while (c) {
> if (x == y) {
> rare();
> if (!x) return;
> if (!y) return;
> }
> }
>
> The 'branch PRE' (hoisting, sinking, assertion prop and branch
> elimination) part is a little tricky to do though.
>
> I think "a little tricky" might be an understatement here.
>
agreed.
>
> At least to start with, I'd probably try to handle simple cases. Even
> just doing trivial loop unswitch in the loop with LoadPRE would unlock a
> lot. (Right now, I'm just running LICM and Unswitch in a loop. It's not
> that expensive, gets a lot of cases, but doesn't get everything.)
>
Have fun with it :)
David
>
>
> Philip
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150119/7de2ba89/attachment.html>
More information about the llvm-dev
mailing list