[PATCH] D96522: [LV] Try larger VFs if VF is unprofitable for small types.

Thu Feb 11 10:52:04 PST 2021

dmgreen added a comment.

I like the idea of this. I think it often makes sense to at least try a larger vector size, even if it exceeds the vector register width.

Are reductions your only target for this? There is some code at the moment in getSmallestAndWidestTypes that limits the widest type to loads, stores and reductions, using RdxDesc.getRecurrenceType(). We already override that for InLoopReduction, as those can sometimes handle larger-than-legal operations natively. Does that use the original reduction size, or the MinBWs size? Does MinBWs help here, or does the cmp block it from working, so that it doesn't realize the whole loop can be calculated using an i8?
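To make the widest-type question concrete, here is a rough sketch of the arithmetic involved. It is not the LoopVectorize code: `lanesForWidestType` and its parameters are made-up names, and the 128-bit register width is an assumption (a NEON-sized register).

```cpp
// Hypothetical helper, not an LLVM API: how many lanes fit in one vector
// register if the widest type considered in the loop is WidestTypeBits wide.
constexpr unsigned lanesForWidestType(unsigned RegisterBits,
                                      unsigned WidestTypeBits) {
  return RegisterBits / WidestTypeBits;
}

// With an assumed 128-bit register:
//   widest type i32 -> 128 / 32 = 4 lanes
//   widest type i8  -> 128 / 8  = 16 lanes
```

So if the reduction is recorded as i32 rather than i8, the VF derived from the widest type is a quarter of what the i8 data could use, which is why the recurrence type reported for the reduction matters here.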

In that example, the entire loop looks like it could become `load; smin.16b, smax.16b` on AArch64. That might be difficult to make happen, though.
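For reference, the kind of source loop I have in mind looks roughly like the sketch below. This is an assumed shape, not the test case from the patch; `minMaxI8` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical example: signed min and max over an array of bytes.
// All values fit in i8, so 16 elements could in principle be processed
// per 128-bit vector with one byte-wise min and one byte-wise max.
std::pair<int8_t, int8_t> minMaxI8(const int8_t *Data, std::size_t N) {
  int8_t Min = INT8_MAX;
  int8_t Max = INT8_MIN;
  for (std::size_t I = 0; I < N; ++I) {
    int8_t V = Data[I];
    // In C++ both operands are promoted to int before comparing, which is
    // one way a wider-than-i8 compare can end up in the loop.
    if (V < Min)
      Min = V;
    if (V > Max)
      Max = V;
  }
  return {Min, Max};
}
```

Calling it on `{3, -7, 120, 0}` yields a minimum of -7 and a maximum of 120. Whether the vectorizer sees the compares as i8 or as a wider type is exactly the open question above.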

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96522/new/

https://reviews.llvm.org/D96522