[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)

Princeton Ferro via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 13 04:01:25 PDT 2025


Prince781 wrote:

Hi,

I've updated the patch to legalize `v2f32` as `i64` and plumbed that change through the various areas it touches. I've also introduced DAGCombine rules to ensure we still codegen `ld.vN` / `st.vN` when it makes sense to do so (that is, when every access is to an element of a `v2f32`). Together with some other rules for bitwise operations, this keeps the PTX codegen roughly the same for code that doesn't use `.f32x2` instructions.
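To make "legalize `v2f32` as `i64`" concrete, here is a minimal host-side C++ sketch of the bit-level idea: two 32-bit floats share one 64-bit integer, and an element access becomes a shift plus a reinterpretation. This is not code from the patch. The helper names `pack_v2f32` and `extract_v2f32` are hypothetical, and putting element 0 in the low 32 bits is an assumption made for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack two f32 lanes into one 64-bit integer.
// Assumption: element 0 occupies the low 32 bits, element 1 the high 32 bits.
inline std::uint64_t pack_v2f32(float elt0, float elt1) {
    std::uint32_t lo_bits, hi_bits;
    std::memcpy(&lo_bits, &elt0, sizeof lo_bits);  // bit-exact reinterpretation
    std::memcpy(&hi_bits, &elt1, sizeof hi_bits);
    return (static_cast<std::uint64_t>(hi_bits) << 32) | lo_bits;
}

// Hypothetical helper: read lane `idx` (0 or 1) back out of the packed value.
inline float extract_v2f32(std::uint64_t packed, unsigned idx) {
    // Shift the requested lane into the low 32 bits, then truncate.
    std::uint32_t bits = static_cast<std::uint32_t>(packed >> (32u * (idx & 1u)));
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under this layout, accesses that only ever touch individual elements reduce to shifts and truncations of the 64-bit value, which is the kind of pattern a combine rule can recognize and turn back into per-element vector loads and stores.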

The patch probably needs more coverage, and I'd appreciate suggestions in that direction.

https://github.com/llvm/llvm-project/pull/126337

