[PATCH] D136102: [LoopSimplify] Update loop-metadata ID after loop-simplify splitting out new outer loop

Wed Oct 26 08:03:13 PDT 2022

Narutoworld abandoned this revision.
Narutoworld added inline comments.

================
Comment at: llvm/lib/Transforms/Utils/LoopSimplify.cpp:336

+  // Create new metadata ID for NewOuter loop
+  // keep other metadata attched before
----------------
bjope wrote:
> bjope wrote:
> > Meinersbur wrote:
> > > bjope wrote:
> > > > Narutoworld wrote:
> > > > > bjope wrote:
> > > > > > I'm no expert on the Loop metadata handling. But in some sense I'd consider the outer loop as the original loop (so keeping the old metadata on that loop seems correct, e.g. if it has been vectorized, or if the user has forced vectorization of that original loop etc).
> > > > > > 
> > > > > > However, I'm not sure if the metadata should be kept for the inner loop here. That is a sub-loop, and I'm not sure what an optimization report should say about that loop (which was created by the compiler). And if the user has requested unrolling of the original loop with a factor X, and this loop simplify pass is run before unrolling, we now indicate that we want both the outer and inner loop to be unrolled by a factor X. That seems wrong to me. So maybe it would be more correct to just drop all metadata for the inner loop (except for the loop id).
> > > > > > 
> > > > > > On the other hand, hypothetically, if the metadata already indicates that the loop has been vectorized/unrolled etc, then we probably want to mark the inner loop as already having been optimized to avoid that it is optimized again. Or is that irrelevant here?
> > > > > Thanks for your comment.
> > > > > 
> > > > > For the given testcase, the outer loop needs the `llvm.loop.usefulinfo` metadata, not the inner loop, which happens to have it only because they were a single loop.
> > > > > 
> > > > > However, I don't think we can generalize this. And in fact, the `loop-simplify` pass tries to create the outer loop and keeps the remaining loop as before. It might be better to drop all metadata if we cannot prove they are valid. 
> > > > > 
> > > > > I think it should be the vectorizer's job to mark a loop that has already been optimized.
> > > > > I think it should be the vectorizer's job to mark a loop that has already been optimized.
> > > > 
> > > > That is true. But the question is how a pass like this should behave if in the input
> > > > a) a loop already is marked as having been vectorized/unrolled etc
> > > > b) a loop is marked as "force vectorize" or "force unroll X times" etc
> > > > 
> > > > In case (a) I think the metadata probably should be kept on the outer loop (and possibly added to the inner loops). The optimization has already been done and we do not want further vectorization/unrolling on any of the sub loops.
> > > > 
> > > > In case (b) I think the metadata probably should remain on the outer loop (which in this case is the full original loop?). But it should not be added to the inner loop (or else we might get unrolling by a factor X*X etc).
> > > > 
> > > > Anyway. I believe it would be correct to keep the original metadata on the outer loop.
> > > > I'm not quite sure which metadata to put on the inner loop. Dropping all metadata (except the loop id) for the inner loop is perhaps good enough for this patch, to just sort out the current issue that we get the same loop id on both the inner and outer loop. I think adding the outer loop's metadata to the inner loop could be worse than dropping the metadata on the inner loop.
> > > > 
> > > > So in other words, I think the patch should be updated to get this kind of semantics:
> > > > - replicate the original loop's metadata to the new outer loop
> > > > - assign a new loop id to the inner loop
> > > “LoopID” is a misnomer: the same LoopID can refer to more than one loop. Otherwise every transformation that clones code (unroll, inline, LoopVersioning, LoopUnswitch, ...) would have to specifically handle any llvm.loop that the cloned code contains. See [[ https://llvm.org/docs/LangRef.html#llvm-loop | LLVM Language Reference Manual ]].
> > > 
> > > I think the default behavior is just applying the same LoopID to both loops, such that analytical information is preserved (e.g. `isvectorized`, `parallel_accesses`, ...). It may be argued that this is not safe for all metadata, since neither loop is quite the original. We would then need a list of known loop metadata with the info whether it should apply to the inner loop (`llvm.loop.unroll.enable`, ..), the outer loop (?, can't think of one where this makes most sense), or both loops (`isvectorized`, `parallel_accesses`, `llvm.loop.licm.disable`, all the disable metadata, ...), and discard all unknown metadata.
> > > 
> > > Some metadata definitely have different semantics when applied to the outer loop (only). For instance, (partially) unrolling an outer loop would give multiple copies of the inner loop (which makes the code more inefficient, as it would most likely exceed the L1i cache), but would not jam them like `llvm.loop.unroll_and_jam` might do. `llvm.loop.unroll.disable` on the outer loop would still allow unrolling the inner loop, which is the loop the heuristic would most likely unroll.
> > Alright, so the LoopID is not used as some sort of key in LoopInfo to identify a single loop. It just identifies the metadata associated with the loop (and several different loops could have identical metadata and share the same LoopID).
> > 
> > In practice I think that LoopVersioning, LoopUnswitch etc (almost?) always update the loop metadata in some way when versioning a loop (either using setLoopID directly, or using some helper such as addStringMetadataToLoop(), setLoopAlreadyUnrolled()). But given that text in the reference manual about llvm.loop not being unique to a single loop I guess that there are no guarantees for that (nor that it is enforced).
> > 
> > This reminds me of an old problem where a stale llvm.loop was put on a non-latch branch after some transformation (or not applied to all latches). Then, after some other transformations, that llvm.loop accidentally happened to end up on a latch for a totally different loop. I think we wanted to add some protection in the verifier to detect misplaced llvm.loop metadata, but if I remember correctly that was complicated (maybe due to llvm.loop not being unique to a single loop, so that we can't really say that it is misplaced unless it is in a non-latch block, or maybe it was just complicated to verify for other reasons).
> But then maybe https://github.com/llvm/llvm-project/issues/57603 isn't a bug, if having the same llvm.loop id on both loops is OK, and even expected for this particular transform.
> 
> Downstream metadata such as "llvm.loop.usefulinfo" can of course not be updated by upstream. So if it only applies to one or none of the loops after this transform, the downstream maintainer needs to find out where to hook into the code and add/remove metadata every time a loop is versioned/transformed. Usually we can look for places where setLoopID/addStringMetadataToLoop/setLoopAlreadyUnrolled/makePostTransformationMetadata etc are used to find such places.
> But given the information that it is allowed to reuse the old loopID for new loops even when the loop shape is changed (such as the number of iterations), there could also be a number of "hidden" scenarios where a new loop just gets the same loopID as an existing loop, even if it is a child loop etc. Those will be much harder to find.
> At least we know where we need to add a piece of code downstream for this particular case, to make a post-transform metadata update for the loops involved, related to "llvm.loop.usefulinfo".
Thanks @Meinersbur @bjope for your comments.

I think we agree on two points.
  # The loopID is not guaranteed to identify a unique loop, so in this case both loops can share the same loopID. There is no need to update the LoopID metadata.
  # To handle a downstream feature such as "llvm.loop.usefulinfo", downstream code needs to update the metadata itself whenever a loop is transformed by passes such as `loopVersioning`, `unroll`, etc.
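
To make the two points concrete, here is a rough toy model in plain C++. It does not use the LLVM API: `LoopMD`, `ToyLoop`, `splitOutOuterLoop` and `dropAttrFromLoop` are hypothetical names made up for illustration, and the metadata node is modelled as a plain set of attribute names. It only shows that two loops can share one node, and how a downstream fixup could give one loop its own node without touching the other.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Toy stand-in for an llvm.loop metadata node: a set of attribute names.
using LoopMD = std::set<std::string>;

// Toy stand-in for a loop. Several loops may point at the same node,
// which models "the same LoopID on more than one loop".
struct ToyLoop {
  std::shared_ptr<LoopMD> ID;
};

// Models the split discussed here: the new outer loop simply reuses the
// node of the existing loop, so both loops share one "LoopID".
inline ToyLoop splitOutOuterLoop(const ToyLoop &Inner) {
  return ToyLoop{Inner.ID};
}

// Hypothetical downstream fixup: give L a fresh node that lacks Attr.
// Other loops that shared the old node keep it unchanged.
inline void dropAttrFromLoop(ToyLoop &L, const std::string &Attr) {
  assert(L.ID && "loop has no metadata node to update");
  LoopMD Copy = *L.ID;
  Copy.erase(Attr);
  L.ID = std::make_shared<LoopMD>(std::move(Copy));
}
```

In this model, calling `dropAttrFromLoop(Inner, "llvm.loop.usefulinfo")` after the split leaves the attribute on the outer loop only, while everything else stays on both loops.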

Since https://github.com/llvm/llvm-project/issues/57603 isn't a bug, I will close this patch. 
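
For the record, the per-metadata placement idea raised above (not implemented by this patch or anywhere upstream, as far as this thread establishes) could look roughly like the following toy sketch. It is plain C++ without the LLVM API; `Placement`, `SplitResult`, `knownPlacements` and `distribute` are hypothetical names, and the table entries are only illustrative picks based on the examples in the discussion, not a vetted list.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Where a known attribute should end up after the loop is split.
enum class Placement { InnerOnly, OuterOnly, Both };

// Toy stand-in for a loop's metadata: a set of attribute names.
using LoopMD = std::set<std::string>;

struct SplitResult {
  LoopMD Inner;
  LoopMD Outer;
};

// Illustrative table only; the placements follow the examples in this
// thread and are assumptions, not an agreed-upon policy.
inline const std::map<std::string, Placement> &knownPlacements() {
  static const std::map<std::string, Placement> Table = {
      {"llvm.loop.unroll.enable", Placement::InnerOnly},
      {"llvm.loop.unroll.disable", Placement::Both},
      {"llvm.loop.isvectorized", Placement::Both},
      {"llvm.loop.parallel_accesses", Placement::Both},
  };
  return Table;
}

// Distribute the original attributes over the two loops. Attributes
// that are not in the table are discarded for both loops.
inline SplitResult distribute(const LoopMD &Original) {
  SplitResult R;
  for (const std::string &Attr : Original) {
    auto It = knownPlacements().find(Attr);
    if (It == knownPlacements().end())
      continue; // unknown attribute: drop it
    if (It->second != Placement::OuterOnly)
      R.Inner.insert(Attr);
    if (It->second != Placement::InnerOnly)
      R.Outer.insert(Attr);
  }
  return R;
}
```

With this table, a downstream attribute such as "llvm.loop.usefulinfo" would be dropped from both loops unless downstream adds its own entry, which is the same maintenance burden as in the second point above.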

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136102/new/

https://reviews.llvm.org/D136102