[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)
Princeton Ferro via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 13 04:01:25 PDT 2025
Prince781 wrote:
Hi,
I've updated the patch to legalize `v2f32` as `i64` and plumbed through the various areas. I've also introduced DAGCombine rules to ensure we still codegen `ld.vN` / `st.vN` when it makes sense to do so (when all accesses are of elements of `v2f32`). Along with some other rules for bitwise operations, this results in PTX codegen looking roughly the same if you're not using `.f32x2` instructions.
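As a rough illustration of what treating `v2f32` as `i64` means at the bit level (this is not code from the patch), the host-side sketch below packs two floats into one 64-bit integer and extracts them again. The helper names `bitsOf`, `floatOf`, `packV2F32`, and `extractElt` are hypothetical, and the lane order (element 0 in the low 32 bits) is an assumption based on a little-endian layout.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these names exist in LLVM.

// Reinterpret a float as its 32-bit pattern (memcpy avoids aliasing issues).
static std::uint32_t bitsOf(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Reinterpret a 32-bit pattern as a float.
static float floatOf(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Pack two f32 lanes into one 64-bit value.
// Assumption: element 0 occupies the low 32 bits, element 1 the high 32 bits.
static std::uint64_t packV2F32(float e0, float e1) {
  return std::uint64_t(bitsOf(e0)) | (std::uint64_t(bitsOf(e1)) << 32);
}

// Extract lane idx (0 or 1) from the packed 64-bit value.
static float extractElt(std::uint64_t v, unsigned idx) {
  return floatOf(std::uint32_t(v >> (32 * idx)));
}
```

With this layout, a bitwise operation on the packed value acts on both lanes at once. For example, `packV2F32(-1.5f, -2.0f) & 0x7FFFFFFF7FFFFFFFull` clears both sign bits, which is the kind of pattern the bitwise rules would need to see through to keep per-element codegen.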
The patch probably needs more coverage, and I'd also appreciate suggestions in that direction.
https://github.com/llvm/llvm-project/pull/126337