[PATCH] D86262: [LoopIdiomRecognizePass] Options to disable part or the entire Loop Idiom Recognize Pass

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 20 08:33:49 PDT 2020


fhahn added a comment.

In D86262#2228456 <https://reviews.llvm.org/D86262#2228456>, @bmahjour wrote:

>> I agree with that, it seems to be better to improve DA. Is it feasible?
>
> The theory of data dependence analysis relies on presence of subscripts in array references to be able to produce accurate results. I don't see how we can "improve DA" to address memset/memcpies short of turning them back into loop nests before applying the dependence tests. To do that the loop has to either be materialized before the DA analysis pass is run, or somehow SCEV expressions representing the implied subscripts be synthesized out of thin air. The former must be achieved by a transformation pass, so we would have to turn memset/memcpys into loop nests as soon as possible. For memset/memcpy calls generated by the loop idiom pass, the ideal place for that transformation would be immediately after loop idiom itself, which would have the same effect as preventing loop idiom from creating such loops in the first place when it knows they are not profitable. I don't know of any possible way to do the latter.

IIUC LoopIdiom will effectively remove a loop and replace it with a memset/memcpy. So we should have the same information, just in different forms: a loop that writes successive memory locations, or a single call that we know writes to the same locations. I think it would be good to have a concrete motivating example that highlights what exactly goes wrong.
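
To make the "same information, different forms" point concrete, here is a minimal C++ sketch. It is not taken from the patch; the function names (zeroWithLoop, zeroWithMemset, formsAgree) are hypothetical and chosen only for illustration. Both forms write the same n bytes starting at buf; the difference is whether the written range is expressed through a subscript or through the (pointer, length) arguments of the call.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Loop form: each iteration stores to buf[i], so a subscript-based
// dependence test can reason about the induction variable i directly.
void zeroWithLoop(unsigned char *buf, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = 0;
}

// Call form: the same n bytes starting at buf are written, but the range
// is carried by the (pointer, length) arguments rather than a subscript.
void zeroWithMemset(unsigned char *buf, std::size_t n) {
  std::memset(buf, 0, n);
}

// Hypothetical helper: returns true if both forms leave an n-byte buffer
// (n > 0) in the same final state.
bool formsAgree(std::size_t n) {
  std::vector<unsigned char> a(n, 0xFF), b(n, 0xFF);
  zeroWithLoop(a.data(), n);
  zeroWithMemset(b.data(), n);
  return a == b;
}
```

The sketch only shows that the two forms have the same memory effect; whether an analysis can recover the written range from the call form is exactly the open question in this thread.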

> I agree with @fhahn that this is more of a cost-modeling issue. I think the cost-modeling would have to rely heavily on loop tripcount data which, in the general case, is only available through PGO, so an option to disable it for users who don't want to use PGO makes sense to me.

I don't think the pass creating memsets/memcpys that are not profitable is a problem only for 'highly tuned libraries'. It is a problem for any code. I would argue that we should not perform the transformation if we cannot prove that it is likely to be profitable. So if we do not know the trip count (or do not have a good estimate), we should not create memsets/memcpys. Again, a concrete motivating example would be helpful.
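
As a rough sketch of the kind of gating I have in mind (an illustration only, not how LoopIdiomRecognize is implemented today): the function name shouldFormMemIntrinsic and the threshold kMinProfitableBytes are hypothetical, and the value 64 is an arbitrary placeholder rather than a measured cutoff.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical threshold, for illustration only; not a value taken from LLVM.
constexpr std::uint64_t kMinProfitableBytes = 64;

// Returns true only when the trip count is known (or estimated) and the
// total number of bytes written reaches the threshold. An unknown trip
// count is treated as "not profitable", so the loop is left alone.
bool shouldFormMemIntrinsic(std::optional<std::uint64_t> tripCount,
                            std::uint64_t bytesPerIteration) {
  if (!tripCount || bytesPerIteration == 0)
    return false;
  // Compare trip counts instead of multiplying, to avoid overflow:
  // tripCount * bytesPerIteration >= threshold
  //   <=>  tripCount >= ceil(threshold / bytesPerIteration).
  std::uint64_t minTrips =
      (kMinProfitableBytes + bytesPerIteration - 1) / bytesPerIteration;
  return *tripCount >= minTrips;
}
```

With this shape, shouldFormMemIntrinsic(std::nullopt, 4) is false, so a loop with no trip count information would not be turned into a memset/memcpy. Where the estimate comes from (constant trip count, PGO, or something else) is deliberately left out of the sketch.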


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86262/new/

https://reviews.llvm.org/D86262


