[llvm] r228265 - Implement new heuristic for complete loop unrolling.

Thu Feb 12 16:04:51 PST 2015

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Michael Zolotukhin" <mzolotukhin at apple.com>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>
> Sent: Thursday, February 12, 2015 5:49:34 PM
> Subject: Re: [llvm] r228265 - Implement new heuristic for complete loop	unrolling.
> 
> On Wed, Feb 4, 2015 at 6:34 PM, Michael Zolotukhin <
> mzolotukhin at apple.com > wrote:
> 
> 
> 
> Author: mzolotukhin
> Date: Wed Feb 4 20:34:00 2015
> New Revision: 228265
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=228265&view=rev
> Log:
> Implement new heuristic for complete loop unrolling.
> 
> Complete loop unrolling can make some loads constant, thus enabling a
> lot of other optimizations. To catch such cases, we look for loads
> that
> might become constants and estimate number of instructions that would
> be
> simplified or become dead after substitution.
> Sorry I couldn't follow along on the pre-commit review in detail, but
> I had to take a detailed look at this code today and wanted to send
> my thoughts your way.
> 
> 
> First, it would be awesome to try to more consistently stick to the
> new coding conventions when adding significant new code. Relatedly,
> please use signed types for abstract numbers that you don't expect
> 2's complement behavior with. For example, if we were to overflow
> one of these costs, we would *not* want to wrap to a tiny cost!!!
> Much better to get a UBSan failure that points at the bug.
> 
> 
> Second, the code as it stands has serious bugs computing the "dead"
> instruction cost. Let's look at EstimateNumberOfDeadInsns. This
> function considers each operand of a folded instruction, and if all
> users of that operand were folded, we add that instruction to the
> cost. But we do this for every operand. What if a single instruction
> occurs as an operand to many folded instructions? As far as I can
> tell, we'll add the cost for each use. =[
> 

Indeed. There is a CountedInsns used for preventing over-counting during the simplification phase. We likely should do the same thing there as well.

 -Hal

> 
> Also, it doesn't seem to handle circular references. Some of these
> can happen in 'normal' looking phi nodes in loops. But in
> unreachable code this could show up essentially anywhere. I think
> this causes the pass to infloop on certain inputs. But its hard to
> tell if it is an infloop, or just incredibly bad scaling that makes
> it run for "ever" on the inputs.
> 
> 
> Generally, I am *extremely* concerned with the compile time impact of
> this change. Please very carefully benchmark the compile time impact
> of this, and disable it until you can fix any compile time problems
> that turns up. There seem to be numerous places where there are
> better or smaller datastructures you could use. You call TTI methods
> inside of hot loops where it might be better to cache the cost of an
> instruction, or better, ensure we only compute it once. Also,
> walking the use-list is relatively slow (linked list causing cache
> misses) so it would be good to minimize this where possible.
> 
> 
> 
> 
> Because this is in tree after considerable review, I don't really
> want to revert it, but the issues I've pointed out above are causing
> hard failures for us. So I'm going to fix a lot of these problems
> myself, but I wanted to bring them up here so you were aware that
> they needed to be considered going forward.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory