[PATCH] D98240: [VectorCombine] Simplify to scalar store if only one element updated

Sun May 30 05:01:19 PDT 2021

fhahn added a comment.

In D98240#2782010 <https://reviews.llvm.org/D98240#2782010>, @hgreving wrote:

>> Interesting! I guess the code assumes that a scalar load is always possible & at least as cheap as the vector version. But I think it would make sense to ask the cost-model if that's the case. Not sure if it would be possible to test this with an in-tree target?
>
> Hi thanks for getting back to me. I'm not sure if it's a cost model question, a straight-up disable switch for not morphing vector derefs into scalar might be better? Is there anything else in this pass that might do that? Unfortunately yes, I think there's no proper upstream target with this constraint. Though I am guessing I am not the only downstream target with a vector memory like that. The problem with trying to make this work is that I am worried about what happens to the pointer. Will I always be able to rely on that it will be aligned, probably not...

I guess it depends on whether the backend can legalize/convert back to a vector load. In general, relying on the middle-end to no scalarize those loads for correctness seems a bit fragile. I'm not sure if it makes sense to add a TTI hook that's not used by any in tree targets.

@qiucf would you be able to look into extending the code to check for the cost?

================
Comment at: llvm/lib/Transforms/Vectorize/VectorCombine.cpp:810-813
+    StoreInst *NSI = Builder.CreateStore(NewElement, GEP);
+    NSI->copyMetadata(*SI);
+    if (SI->getAlign() < NSI->getAlign())
+      NSI->setAlignment(SI->getAlign());
----------------
lebedev.ri wrote:
> I'm certain this is a miscompile.
> The alignment that was valid for a vector store
> is not guaranteed to be valid for the store of a single vector element,
> unless it's the 0'th element of course.
> 
> I think this needs to be something like
> ```
> newalign = 1;
> if(auto*C = dyn_cast<ConstantInt>(Idx)) {
>   newalign = max(old store align, old load align);
>   newalign = commonAlignment(newalign, Idx * DL.getTypeSize(NewElement.getType()))
> }
> ```
That's a good point. @qiucf can you look into this?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98240/new/

https://reviews.llvm.org/D98240