[PATCH] D155851: [llvm][nvptx] Add sm_90a
guray ozen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 20 13:18:27 PDT 2023
guraypp added a comment.
Thanks for the review! Having `sm_90a` as a synonym is actually a big help for generating `.target sm_90a` in PTX code. `ptxas` reports an error for `wgmma` instructions when the module says `.target sm_90`.
By the way, the `wgmma` instructions are new tensor core instructions for Hopper, and they are only supported on `sm_90a`. MLIR generates them as inline assembly, and in CUDA one can emit them using `asm` (as CUTLASS does).
It would be really great if we could land this workaround until we find a proper solution.
I saw that `sm_90a` requires PTX 8.0; see the release history:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes-ptx-release-history
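For illustration, here is a minimal hand-written sketch of the kind of PTX module header this is about. It is not output from this patch; the kernel name `wgmma_fence_only` is hypothetical, and `wgmma.fence` is used only because it is the simplest instruction in the `wgmma` family.

```ptx
// Hand-written sketch, not generated by LLVM or MLIR.
// sm_90a requires PTX ISA version 8.0 or newer.
.version 8.0
// With ".target sm_90" here, ptxas is expected to reject the wgmma
// instruction below, as described in the comment above.
.target sm_90a
.address_size 64

// Hypothetical kernel name, chosen for this example only.
.visible .entry wgmma_fence_only()
{
    // Simplest member of the wgmma family; takes no operands.
    wgmma.fence.sync.aligned;
    ret;
}
```

Assuming a CUDA 12 toolkit, something like `ptxas -arch=sm_90a` on this file should be one way to check it.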
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155851/new/
https://reviews.llvm.org/D155851