[PATCH] D142255: [WIP] Loop peeling opportunity for identity operators

Fri Jan 27 10:22:27 PST 2023

nikic added a comment.

In D142255#4086518 <https://reviews.llvm.org/D142255#4086518>, @jamieschmeiser wrote:

> I disagree that the number of times I suppressed it shows it is not profitable.  In fact, the frequency that it occurs shows its usefulness, but I suppressed it because it isn't germane to the tests in question and obfuscates their meaning and purpose.

I think my phrasing here was ambiguous: It's not the number of times it is suppressed, but if you take a look at the specific cases where it is suppressed, many of them look non-profitable to me. E.g. the second test file llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll already has a non-trivial loop body with internal control flow, where I would expect the marginal cost of more code size to already outweigh the marginal benefit of this optimization.

> The peeling code already considers the size of code growth and establishes limits to the amount of peeling that can occur.  The proposed code does not change that and only suggests the peel when it is within the established limits.

My point here was that the existing cost model is inappropriate for this transform. This transform basically saves you from executing a single instruction over the entire loop (not per iteration!) It's a very small improvement, that will be outweighed by code size increases for anything but the smallest loops. The existing cost model will perform this optimization in too many cases.

> I am preparing a revision of this code that adds a new peeling flag to the peeling control struct called allow-aggressive-peeling.  This would control opportunities that do not necessarily simplify phi structure of control flow but exist for other opportunities such as this one.  It would be set to false for full loop unroll (before vectorizing) but on for the loop unroll pass.  Like other flags in this struct, it will have an option for controlling it and it will be suppressed using the option in the loop unrolling tests for the reasons given above but not specified for other ones.  Does this approach address your concerns?  I recognize that the change would need to be examined before any decision is made...

Delaying this to runtime unrolling is certainly a good start, but I don't think it will be sufficient, and I'm not really convinced that this optimization is worth investing into in the first place. Do you have any performance data that suggests that this is really a worthwhile thing to do?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142255/new/

https://reviews.llvm.org/D142255