[Mlir-commits] [mlir] [mlir] GEMM Hopper Tensor Core Integration Test (PR #81478)

Guray Ozen llvmlistbot at llvm.org
Sun Mar 3 08:36:23 PST 2024


https://github.com/grypp commented:



> *[Reviewable](https://reviewable.io/reviews/llvm/llvm-project/81478)* status: 0 of 5 files reviewed, 24 unresolved discussions (waiting on @joker-eph, @jpienaar, @manishucsd, and @vinodgro)

___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 44 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3UejX5sGoVinffKFD:-Ns3nCET6_BR0sxjhpTe:b-rsjy6y) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L44)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

pthread is rather engrained in folks. How about just main?
</blockquote></details>

I took the terminology of "primary thread" from OpenMP, but the pthread shortcut isn't the best.
Let me think about that.
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 128 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3V6Eq1wlTTIdLiehO:-Ns3nm2z3MVskmK7O5-r:b-clex4l) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L128)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Why is this needed?
</blockquote></details>

Doesn't this verify the module? I am not sure whether the module is verified before this point.
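
For context, a minimal sketch of what I mean (assuming the upstream MLIR Python bindings, where `Operation.verify()` raises on invalid IR):

```python
from mlir.ir import Context, Module

with Context():
    module = Module.parse("module {}")
    # verify() walks the IR and raises MLIRError if any op is invalid;
    # without an explicit call, nothing guarantees the module was checked.
    assert module.operation.verify()
```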
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 132 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3VLqf3yhVAZhVuQo2:-Ns3o4WTBcYSJFCo0N0A:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L132)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Rm commented out code
</blockquote></details>

Done.
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 140 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3VQuT9AX4i_tSIgT3:-Ns3oaQm8ilX0RQo9LpW:buinezm) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L140)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Should this be an option to the function?
</blockquote></details>

Good point, I could do that, but this code is entirely for Hopper GPUs, so I don't expect we will need any other compilation flag.
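
If we did want it configurable, a sketch of what that could look like (the `compile_module` wrapper and its default are hypothetical; the pipeline options are the upstream `gpu-lower-to-nvvm-pipeline` ones):

```python
from mlir.passmanager import PassManager

def compile_module(module, chip="sm_90a"):
    # expose the target chip as an option instead of hard-coding it
    pm = PassManager.parse(
        f"builtin.module(gpu-lower-to-nvvm-pipeline{{cubin-chip={chip}}})",
        context=module.context,
    )
    pm.run(module.operation)
```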
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 147 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3V_JN2lNiv22mDB26:-Ns3oNVnEtUmNtksQ-ln:b-503sxr) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L147)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Does this opt_level have to match the one above?
</blockquote></details>

This one is for the host code while the other one is for the device code. They don't necessarily need to match.
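
In other words, the two are independent knobs, roughly like this (a sketch; `jit` and its defaults are hypothetical, the pipeline string follows the upstream `gpu-lower-to-nvvm-pipeline` options):

```python
from mlir.execution_engine import ExecutionEngine
from mlir.passmanager import PassManager

def jit(module, device_opt=3, host_opt=2):
    # device-side opt level is baked into the lowering-to-cubin pipeline...
    pm = PassManager.parse(
        f"builtin.module(gpu-lower-to-nvvm-pipeline"
        f"{{cubin-chip=sm_90a opt-level={device_opt}}})",
        context=module.context,
    )
    pm.run(module.operation)
    # ...while the host-side LLVM opt level is a property of the JIT engine
    return ExecutionEngine(module, opt_level=host_opt)
```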
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 256 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3WAZZ4mGN5l73dX8S:-Ns3pGhJ7Q1MGlzsv5cl:bdrk40q) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L256)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Are both ran unconditionally? (it feels like one should be part of integration tests while the other could be for unit test)
</blockquote></details>

I agree with you, but note that the LLVM test machines don't have Hopper GPUs :) so they won't run any of these tests.
I put this code here as an example.
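
If we later want to gate it on hardware, a sketch of a runtime check (assuming a driver recent enough that `nvidia-smi` supports `--query-gpu=compute_cap`):

```python
import subprocess

def has_hopper_gpu():
    # Hopper is compute capability 9.x; returns False when no GPU/driver is present
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return False
    return any(line.strip().startswith("9.") for line in out.splitlines())
```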
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py` line 259 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3Vpnf5sIXqh0fgkQv:-Ns3pndyBxJQsFdPEitX:b-8s5mfx) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/matmul.py#L259)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Should gtest parameterized be used here?
</blockquote></details>

How can I use that?
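
(gtest itself is a C++ framework; if the suggestion is parameterized tests, the closest analogue for this Python harness would be something like pytest's `parametrize` — a sketch, with `run_matmul` as a placeholder for the real entry point:)

```python
import pytest

def run_matmul(M, N, K):  # placeholder for the real harness entry point
    assert M % 128 == 0 and N % 128 == 0 and K % 64 == 0

@pytest.mark.parametrize("M,N,K", [(128, 128, 64), (256, 128, 128)])
def test_matmul(M, N, K):
    run_matmul(M, N, K)
```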
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 80 at r1](https://reviewable.io/reviews/llvm/llvm-project/81478#gh-1495101386:-Ns3n7k18TWYJUpUF4L4:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/aa3e3486ef011cd4bc9bb784f81b73981e913151/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L80)):*
<details><summary><i>Previously, manishucsd (Manish Gupta) wrote…</i></summary><blockquote>

OK. I see that even though these come in as arguments to the function `generate_matmul_ws`, we are generating it for a fixed datatype.
</blockquote></details>

Done.
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 83 at r1](https://reviewable.io/reviews/llvm/llvm-project/81478#gh-1495104150:-Ns3n5qWArAXabRJLXra:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/aa3e3486ef011cd4bc9bb784f81b73981e913151/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L83)):*
<details><summary><i>Previously, manishucsd (Manish Gupta) wrote…</i></summary><blockquote>

Sounds good! 

For this PR, should we add asserts on (BLOCK_M, BLOCK_N, BLOCK_K) == (128,128,64)?
</blockquote></details>

Done.
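
For reference, the guard looks roughly like this (`BLOCK_M`/`BLOCK_N`/`BLOCK_K` being the builder's tile parameters):

```python
BLOCK_M, BLOCK_N, BLOCK_K = 128, 128, 64  # the tile shape from the review thread

# reject tile shapes the current kernel schedule does not support
assert (BLOCK_M, BLOCK_N, BLOCK_K) == (128, 128, 64), (
    f"unsupported tile size {(BLOCK_M, BLOCK_N, BLOCK_K)}"
)
```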
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 25 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3WZE41bAAf4japw1y:-Ns3zdv6CQPm6N6eiYgP:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L25)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Document ? (also could you use the the constant from C/C++ side)
</blockquote></details>

Done.
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 61 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3X2_rBmnPWg3vg4aY:-Ns3zf6r63YmBbezB9HW:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L61)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

There has to be something already for this. What do you get if you str on the type?
</blockquote></details>

Done.
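
For reference, `str()` on an MLIR Python type already yields the textual IR form, so no hand-rolled mapping is needed:

```python
from mlir.ir import Context, F16Type, IndexType

with Context():
    # str() renders the type exactly as it appears in textual IR
    assert str(F16Type.get()) == "f16"
    assert str(IndexType.get()) == "index"
```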
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 76 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3XAoR2TBryuxuaGvE:-Ns3zfXEB3srymWhJMxj:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L76)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

The FloatType recently got added to query the width, so you could collapse the fp ones
</blockquote></details>

Done.
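
That is, instead of branching per float type, something like this (a sketch, assuming the recently added `FloatType` binding with its `width` property):

```python
from mlir.ir import Context, FloatType, F16Type, F32Type

with Context():
    for ty in (F16Type.get(), F32Type.get()):
        # one code path for all float types instead of an if/elif chain
        print(str(ty), FloatType(ty).width)  # -> "f16 16", "f32 32"
```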
___
*[`mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py` line 694 at r3](https://reviewable.io/reviews/llvm/llvm-project/81478#-Ns3Z1rD66RXZialj0YK:-Ns3zQQz9OuCYFHfzzkw:b-896fix) ([raw file](https://github.com/llvm/llvm-project/blob/6da82f445e66be30e31674aaf9aa7f075ac02735/mlir/test/Integration/GPU/CUDA/sm90/python/tools/matmulBuilder.py#L694)):*
<details><summary><i>Previously, jpienaar (Jacques Pienaar) wrote…</i></summary><blockquote>

Same for these (well most parse instances)
</blockquote></details>

Done.
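
That is, replacing `Type.parse` with the typed getters where they exist; a sketch:

```python
from mlir.ir import Context, Type, F32Type, IntegerType

with Context():
    # typed getters avoid re-parsing textual IR on every call
    f32 = F32Type.get()                  # instead of Type.parse("f32")
    i64 = IntegerType.get_signless(64)   # instead of Type.parse("i64")
    assert f32 == Type.parse("f32") and i64 == Type.parse("i64")
```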




https://github.com/llvm/llvm-project/pull/81478

