[PATCH] D155051: [AMDGPU] Add sanity check that fixes bad shift operation in AMD backend

Konrad Kusiak via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 12 04:24:33 PDT 2023


konradkusiak97 added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:878
+
   unsigned AllowedBitsForMin = llvm::countr_zero(MaxMask);
   if ((1u << AllowedBitsForMin) <= MinMask)
----------------
foad wrote:
> JonChesterfield wrote:
> > foad wrote:
> > > JonChesterfield wrote:
> > > > konradkusiak97 wrote:
> > > > > JonChesterfield wrote:
> > > > > > Should this be if masks are equal?
> > > > > > 
> > > > > > Not sure sanity check describes the condition, maybe drop the comment
> > > > > Hm, If the masks are equal, the result will be returning `false`. I think that's the correct behaviour that the author of this function had in mind. That's based on the fact that there is `<=` and not `<` sign in `if ((1u << AllowedBitsForMin) <= MinMask)`, so it really checks if the masks overlap - and two equal masks overlap fully.
> > > > If equal masks imply they can't be combined, that should probably be true for the special case of equal masks that are zero.
> > > > 
> > > > I haven't looked at the call tree to determine what combining masks means in this context.
> > > Firstly, there is no real use case for dmask=0. The resulting instructions would be invalid. All we have to do is handle it gracefully and not crash.
> > > 
> > > > If equal masks imply they can't be combined, that should probably be true for the special case of equal masks that are zero.
> > > 
> > > I don't agree. This code is trying to check whether the //range// of bits set in one mask (from the lowest set bit to the highest //including// any gaps) overlaps with the range of set bits in the other. By that definition, two equal non-zero masks do overlap but two equal zero masks do not.
> > It's called dmasksCanBeCombined though. If equal masks return false, and the zero case doesn't matter other than some arithmetic in the compiler, then masks equal zero returning false seems reasonable. Drive by review, not a blocking comment.
> I still disagree. Returning false for all equal masks is no simpler to implement, and less easy to justify.
I'm also more directed towards returning true in the case of both masks being 0. Because you can combine two 0 masks. Whereas two equal, non-zero masks can't be trivially combined because they overlap.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155051/new/

https://reviews.llvm.org/D155051



More information about the llvm-commits mailing list