[PATCH] D44338: [LV][VPlan] Build plain CFG with simple recipes for outer loops.
Hideki Saito via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 16 13:02:50 PDT 2018
hsaito added inline comments.
================
Comment at: lib/Transforms/Vectorize/VPlanHCFGBuilder.cpp:100
+    if (isa<LoadInst>(Inst) || isa<StoreInst>(Inst)) {
+      VPBB->appendRecipe(
+          new VPWidenMemoryInstructionRecipe(Inst, nullptr /*Mask*/));
----------------
a.elovikov wrote:
> For outer loop vectorization in
>
>   int s = 0;
>   for (int i = 0; i < N; ++i) {
>     for (int j = 0; j < M; ++j) {
>       s += x[i] * y[j];
>     }
>   }
>
> We need a broadcast y[j] -> {y[j], y[j], y[j], y[j]}, but this code will generate a WIDEN recipe for the load. Is that OK? If so, can we document it somewhere?
>
Reference: LoopVectorizationPlanner::tryToWidenMemory().
VPWidenMemoryInstructionRecipe can handle CM_GatherScatter, and a uniform access can be thought of as a special form of gather/scatter in which every lane uses the same address. From that perspective, it is okay.
A vector load/store is treated as a gather/scatter until analysis refines it to a better access type, so using a "generic gather/scatter" during the initial VPlan construction phase makes perfect sense.
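To illustrate why a uniform load fits under the generic gather model, here is a minimal standalone sketch. It does not use any LLVM API; `VF`, `gather`, and `uniformLoad` are hypothetical names chosen for illustration, and the vector factor of 4 is an assumption taken from the 4-lane broadcast in the quoted example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed vector factor, for illustration only.
constexpr std::size_t VF = 4;

// Generic gather: lane L reads Base[Index[L]].
std::array<int, VF> gather(const int *Base, std::size_t Size,
                           const std::array<std::size_t, VF> &Index) {
  std::array<int, VF> Result{};
  for (std::size_t Lane = 0; Lane < VF; ++Lane) {
    assert(Index[Lane] < Size && "gather index out of bounds");
    Result[Lane] = Base[Index[Lane]];
  }
  return Result;
}

// Uniform load of y[j]: a gather in which every lane uses the same index,
// which yields the broadcast {y[j], y[j], y[j], y[j]}.
std::array<int, VF> uniformLoad(const int *Base, std::size_t Size,
                                std::size_t J) {
  std::array<std::size_t, VF> Index;
  Index.fill(J);
  return gather(Base, Size, Index);
}
```

A later analysis that proves all lane indices are identical could replace the gather with a scalar load plus broadcast without changing the result.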
If we are building a single VPlan CFG for inner and/or outer loop vectorization (and that's something we should be doing if the HCFGs look identical), we can't encode "memory access kind" information within the HCFG. So keeping it as "generic gather/scatter" at the HCFG level is the right thing to do for the long term as well.
In other words, we need storage outside the HCFG to house the "uniform/unit-stride/interleave/..." information for each load/store.
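One possible shape for such out-of-HCFG storage is a side table keyed by the memory instruction, sketched below. This is only an illustration, not what the patch or LLVM implements: `AccessKind`, `MemInst`, and `AccessKindTable` are hypothetical names, and the default of "gather/scatter until refined" is my reading of the argument above.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical access kinds, mirroring "uniform/unit-stride/interleave/...".
enum class AccessKind { GatherScatter, Uniform, UnitStride, Interleave };

// Hypothetical stand-in for a load/store instruction in the HCFG.
struct MemInst {
  const char *Name;
};

// Side table kept outside the HCFG. An instruction with no entry is
// treated as a generic gather/scatter.
class AccessKindTable {
  std::unordered_map<const MemInst *, AccessKind> Kinds;

public:
  // Record a better access kind once analysis has proven it.
  void refine(const MemInst *I, AccessKind K) {
    assert(I && "cannot refine a null instruction");
    Kinds[I] = K;
  }

  // Look up the access kind, defaulting to gather/scatter.
  AccessKind lookup(const MemInst *I) const {
    auto It = Kinds.find(I);
    return It == Kinds.end() ? AccessKind::GatherScatter : It->second;
  }
};
```

With this layout the same HCFG could be paired with different tables for inner-loop and outer-loop vectorization, for example marking the load of y[j] as Uniform only in the outer-loop case.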
Repository:
rL LLVM
https://reviews.llvm.org/D44338