[PATCH] D132700: [DSE] Add value type checks for masked store candidates in Dead Store Elimination

Michael Berg via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 1 11:45:51 PDT 2022


mcberg2021 added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:251
+    if (KillingII->getArgOperand(0)->getType() !=
+        DeadII->getArgOperand(0)->getType())
+      return OW_Unknown;
----------------
mcberg2021 wrote:
> craig.topper wrote:
> > mcberg2021 wrote:
> > > fhahn wrote:
> > > > mcberg2021 wrote:
> > > > > fhahn wrote:
> > > > > > StephenFan wrote:
> > > > > > > Is it possible to relax the comparison from type to type size?
> > > > > > Yeah it would be good to check the stored sizes instead of requiring the types to match. I think we should remove one store if both masked stores are flipped.
> > > > > Well, we have an ulterior motive, VLA uses VL with type to constrain size.
> > > > Right, so  it sounds like this should also 1) have tests with scalable vectors and 2) different handling for scalable vs non-scalable vectors?
> > > We would like to keep this patch to just the fixed non-scalable vectors like in this example.  I can add the type size and I think element count would also be useful, as it is within context of this intrinsic.
> > @fhahn what does "flipped" mean in your comment. "I think we should remove one store if both masked stores are flipped."
> > 
> > I think there was a confusion about was meant by scalable here. When @mcberg2021 said scalable that was referring to a different intrinsic, vp_store, that we will also need to handle here eventually. When @fhahn said scalable that was referring to this intrinsic with a scalable vector type.
> > 
> > My suggestion is to check `getElementCount` and that `getScalarSizeInBits` for the type match. That will cover both fixed and scalable.
> > 
> > Alternatively, since the masks are the same, the element count is guaranteed the same. So we could check only that `getPrimitiveSizeInBits()` is the same.
> Query: What does "flipped" mean?
> 
> Right now I think we cover both fixed and scalable for masked_store with a type size check and ElementCount check being the same for both stores as is it precludes mixing fixed and scalable and will work for either scenario.
I'm going to venture a reasonable guess for flipped: It's the test example turned around from:

  tail call void @llvm.masked.store.v4i32.p0(<4 x i32> %v2, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
  tail call void @llvm.masked.store.v4i8.p0(<4 x i8> %v1, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)

to

  tail call void @llvm.masked.store.v4i8.p0(<4 x i8> %v1, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
  tail call void @llvm.masked.store.v4i32.p0(<4 x i32> %v2, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)

where if you applied a mask to supply the superset for the second store it would be : 1111 1111 1111 1111
and the first store would accordingly be a well formed subset that fits under it:            0001 0001 0001 0001

meaning the store is fully contained in a constant mask known at compile time.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132700/new/

https://reviews.llvm.org/D132700



More information about the llvm-commits mailing list