[PATCH] D155851: [llvm][nvptx] Add sm_90a

guray ozen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 20 13:18:27 PDT 2023


guraypp added a comment.

Thanks for the review! Having `sm_90a` as a synonym is a big help for generating `.target sm_90a` in PTX code: `ptxas` throws an error for `wgmma` instructions under `.target sm_90`.

By the way, the `wgmma` instructions are new tensor core instructions for Hopper, and they're only supported on `sm_90a`. MLIR generates them as inline assembly, and one can also emit them with `asm` in CUDA (as CUTLASS does).
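To illustrate the inline-assembly route, here is a minimal CUDA C++ sketch rather than code from this patch. The helper name `warpgroup_fence` and the constant `kHasSm90aFeatures` are hypothetical. The sketch assumes the compiler defines `__CUDA_ARCH_FEAT_SM90_ALL` when targeting `sm_90a`, as nvcc does, and it only uses the operand-free `wgmma.fence` instruction.

```cpp
// CUDA C++ sketch; this fallback only lets a host-only compiler parse the file.
#if !defined(__CUDACC__)
#define __device__
#endif

// True only in a device compilation pass that targets sm_90a
// (assumption: the compiler defines __CUDA_ARCH_FEAT_SM90_ALL there, as nvcc does).
#if defined(__CUDA_ARCH_FEAT_SM90_ALL)
constexpr bool kHasSm90aFeatures = true;
#else
constexpr bool kHasSm90aFeatures = false;
#endif

// Hypothetical helper: emits a warpgroup fence through inline PTX.
// The instruction is guarded so that other targets compile it as a no-op
// instead of handing ptxas a wgmma instruction under `.target sm_90`.
__device__ inline void warpgroup_fence() {
#if defined(__CUDA_ARCH_FEAT_SM90_ALL)
  asm volatile("wgmma.fence.sync.aligned;\n" ::: "memory");
#endif
}
```

With nvcc, building with `-arch=sm_90a` should take the guarded path; with plain `sm_90` the helper compiles to nothing.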

It would be really great if we could land this workaround until we find a proper solution.

I saw that `sm_90a` requires PTX 8.0; see the release history below:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes-ptx-release-history


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155851/new/

https://reviews.llvm.org/D155851


