[Mlir-commits] [mlir] [MLIR][NVGPU] Remove Memref Rank vs. Coordinates `tma.async.load` (PR #69584)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Tue Oct 24 11:54:54 PDT 2023


qcolombet wrote:

> > Instead of relaxing the check, I feel that we would need to use a `collapse_shape` on the input `memref`.
> > In particular, what would be the semantics of:
> > ```
> > nvgpu.tma.async.load %0[%c1, %c2], %1 to %2 : ... -> memref<Outerx64x128xf16, ..., 3>
> > ```
> >
> > Is `%c1` applied to the `Outer` dim or to the `64` dim?
> > I think the motivating example only works because the leading dim of the input memref is 1.
> 
> It will be `Outer` (base pointer of memref).
> 
> For example:
> 
> 1. `Load 64x128 into memref<128x128>` -> verifier error -> PR will relax this
> 2. `Load 64x128 to memref<64x64>` -> verifier error -> PR will relax this
> 
> I want to allow option 1. I guess you are concerned about option 2. I can improve the verifier so that it complains about option 2. Let me do that.

Well, I actually don't understand what the semantics of option 1 are :).
What I'd like to avoid is an instruction that is too powerful and hence difficult to work with.
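
To make the `collapse_shape` alternative mentioned above concrete, here is a minimal sketch. It assumes the leading dim is statically 1 (the thread leaves `Outer` unspecified), and `@collapse_unit_dim` and `%dst` are hypothetical names used only for illustration:

```
// Hypothetical helper: fold a unit leading dim into the next one, so that
// the resulting memref is rank 2 and lines up with two coordinates.
func.func @collapse_unit_dim(%dst: memref<1x64x128xf16, 3>) -> memref<64x128xf16, 3> {
  // Reassociation [[0, 1], [2]] merges dims 0 and 1 (1 * 64 = 64) and keeps dim 2.
  %flat = memref.collapse_shape %dst [[0, 1], [2]]
      : memref<1x64x128xf16, 3> into memref<64x128xf16, 3>
  func.return %flat : memref<64x128xf16, 3>
}
```

With the rank made explicit this way, each coordinate maps to exactly one dim of the destination, so the question of whether `%c1` applies to `Outer` or to `64` would not arise.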

> 
> > > The test #69913 needs this PR.
> > 
> > 
> > Can you fold this change into the PR that needs it?
> 
> Actually I could do this. But the test is large and requires a few more PRs. So I split them up for easy review :)



https://github.com/llvm/llvm-project/pull/69584

