[llvm] r228265 - Implement new heuristic for complete loop unrolling.

Thu Feb 12 16:00:32 PST 2015

Hi Chandler,

Thanks for the info, I’ll take a look. Sorry for the inconveniences.

Michael

> On Feb 12, 2015, at 3:49 PM, Chandler Carruth <chandlerc at google.com> wrote:
> 
> 
> On Wed, Feb 4, 2015 at 6:34 PM, Michael Zolotukhin <mzolotukhin at apple.com <mailto:mzolotukhin at apple.com>> wrote:
> Author: mzolotukhin
> Date: Wed Feb  4 20:34:00 2015
> New Revision: 228265
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=228265&view=rev <http://llvm.org/viewvc/llvm-project?rev=228265&view=rev>
> Log:
> Implement new heuristic for complete loop unrolling.
> 
> Complete loop unrolling can make some loads constant, thus enabling a
> lot of other optimizations. To catch such cases, we look for loads that
> might become constants and estimate number of instructions that would be
> simplified or become dead after substitution.
> 
> Sorry I couldn't follow along on the pre-commit review in detail, but I had to take a detailed look at this code today and wanted to send my thoughts your way.
> 
> First, it would be awesome to try to more consistently stick to the new coding conventions when adding significant new code. Relatedly, please use signed types for abstract numbers that you don't expect 2's complement behavior with. For example, if we were to overflow one of these costs, we would *not* want to wrap to a tiny cost!!! Much better to get a UBSan failure that points at the bug.
> 
> Second, the code as it stands has serious bugs computing the "dead" instruction cost. Let's look at EstimateNumberOfDeadInsns. This function considers each operand of a folded instruction, and if all users of that operand were folded, we add that instruction to the cost. But we do this for every operand. What if a single instruction occurs as an operand to many folded instructions? As far as I can tell, we'll add the cost for each use. =[
> 
> Also, it doesn't seem to handle circular references. Some of these can happen in 'normal' looking phi nodes in loops. But in unreachable code this could show up essentially anywhere. I think this causes the pass to infloop on certain inputs. But its hard to tell if it is an infloop, or just incredibly bad scaling that makes it run for "ever" on the inputs.
> 
> Generally, I am *extremely* concerned with the compile time impact of this change. Please very carefully benchmark the compile time impact of this, and disable it until you can fix any compile time problems that turns up. There seem to be numerous places where there are better or smaller datastructures you could use. You call TTI methods inside of hot loops where it might be better to cache the cost of an instruction, or better, ensure we only compute it once. Also, walking the use-list is relatively slow (linked list causing cache misses) so it would be good to minimize this where possible.
> 
> 
> Because this is in tree after considerable review, I don't really want to revert it, but the issues I've pointed out above are causing hard failures for us. So I'm going to fix a lot of these problems myself, but I wanted to bring them up here so you were aware that they needed to be considered going forward.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150212/98650ea0/attachment.html>