[PATCH] D96522: [LV] Try larger VFs if VF is unprofitable for small types.

Thu Feb 11 11:43:45 PST 2021

fhahn added a comment.

In D96522#2557608 <https://reviews.llvm.org/D96522#2557608>, @dmgreen wrote:

> I like the idea of this. I think it often makes sense to at least try a larger vector size, even if it's over the vector width.

Agreed! Ideally we would be confident enough in the cost model to explore larger VFs more liberally and trust it to pick the best one, even when larger VFs are among the candidates. I think this patch is a small first step in that direction, with potential to extend it to more cases in the future.

> Is your only target for this reductions? There is some code at the moment in getSmallestAndWidestTypes that limits the widest type to loads, stores and reductions, using RdxDesc.getRecurrenceType(). We already override that for InLoopReduction as they can sometimes handle larger than legal operations natively. Does that use the original reduction size, or the MinBWs size? Does the MinBWs help here, or does the cmp block that from working, and it doesn't realize the whole loop can be calculated using an i8?

I just realized the example I chose was more complex than it needed to be. The main initial focus is narrow loads and stores, but it also applies to reductions. I don't think `MinBWs` helps directly with deciding whether to try larger vectorization factors (the IR test cases do not have reductions and have either a load or store to `i32`).
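To make the "try larger VFs" idea concrete, here is a rough sketch of the candidate range when a loop mixes a narrow type with `i32`. The helper `candidateVFs` and its parameters are hypothetical names for illustration only, not LLVM's API or the code in this patch. It assumes the baseline VF is derived from the widest type and the upper bound from the smallest type.

```cpp
#include <vector>

// Hypothetical helper, not part of LLVM: lists the power-of-two VFs between
// the VF implied by the widest type in the loop and the VF implied by the
// smallest type, for a given vector register width in bits.
std::vector<unsigned> candidateVFs(unsigned RegisterBits,
                                   unsigned SmallestTypeBits,
                                   unsigned WidestTypeBits) {
  std::vector<unsigned> VFs;
  // Baseline: one register holds this many elements of the widest type.
  unsigned MinVF = RegisterBits / WidestTypeBits;
  if (MinVF == 0)
    MinVF = 1; // Widest type does not fit in a register; start from scalar.
  // Upper bound: one register holds this many elements of the smallest type.
  unsigned MaxVF = RegisterBits / SmallestTypeBits;
  for (unsigned VF = MinVF; VF <= MaxVF; VF *= 2)
    VFs.push_back(VF);
  return VFs;
}
```

With a 128-bit register, an `i8` load and an `i32` store, `candidateVFs(128, 8, 32)` yields 4, 8 and 16. Whether the larger factors actually win is still up to the cost model.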

> In that example, the entire loop seems like it could become `load; smin.16b, smax.16b` under AArch64. That might be difficult to make happen though.

Yeah, I'm planning to look into whether we can use `MinBWs` more aggressively in the reduction case as a follow-up.
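As a rough illustration of why narrowing could pay off for reductions, the sketch below computes a max-reduction over `i8` data twice: once in 32 bits and once kept in 8 bits. This is a made-up loop, not the example from the patch, and the function names are hypothetical. Because every input fits in `i8`, both forms give the same result, which is the kind of fact a minimal-bitwidth analysis would need to establish.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical example: max-reduction over i8 data, accumulated in 32 bits.
// Each element is sign-extended before the compare.
int32_t maxWide(const int8_t *Src, std::size_t N) {
  int32_t Max = INT8_MIN;
  for (std::size_t I = 0; I < N; ++I)
    Max = std::max<int32_t>(Max, Src[I]);
  return Max;
}

// The same reduction kept entirely in 8 bits. The result is identical
// because no intermediate value can leave the i8 range.
int8_t maxNarrow(const int8_t *Src, std::size_t N) {
  int8_t Max = INT8_MIN;
  for (std::size_t I = 0; I < N; ++I)
    Max = std::max<int8_t>(Max, Src[I]);
  return Max;
}
```

With a 128-bit register, the narrow form handles 16 elements per vector compare instead of 4, assuming the target has a native `i8` vector max.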

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96522/new/

https://reviews.llvm.org/D96522