[llvm] [NVPTX] Vectorize and lower 256-bit global loads/stores for sm_100+/ptx88+ (PR #139292)

Fri May 9 16:19:40 PDT 2025

AlexMaclean wrote:

> I think I addressed all current feedback, although the removal of the VecType operand is causing some tests to fail, I'll need to investigate what I'm missing on Monday but saving my progress for now.

Unfortunately, we have lots of brittle code which uses hard-coded operand indices for various machine instructions. Here is one example I personally am guilty of introducing:

https://github.com/llvm/llvm-project/blob/2da57f8105f0faff5cb7d671307f7cfc7ff2dce4/llvm/lib/Target/NVPTX/NVPTXForwardParams.cpp#L108-L109

There are likely a few other places where this occurs (not all of which are as nicely called out), which you'll need to update now that the load and store instructions have one fewer operand. It may help to look at what changed the last time an operand was added.
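
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the real `MachineInstr` API: `ToyInstr`, the operand names, and both operand layouts are hypothetical stand-ins, not the actual NVPTX load/store layout. It only shows how a hard-coded index silently starts reading the wrong operand once an earlier operand is removed, while a lookup keyed on the operand's name keeps working.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine operand: a name plus an integer value.
struct ToyOperand {
  std::string Name;
  int Value;
};

// Hypothetical stand-in for a machine instruction: an ordered operand list.
struct ToyInstr {
  std::vector<ToyOperand> Operands;
};

// Assumed "old" layout, which still carries a vectype operand.
ToyInstr makeOldLoad() {
  ToyInstr I;
  I.Operands = {{"dst", 10},   {"addrspace", 1}, {"vectype", 2},
                {"width", 32}, {"addr", 100},    {"offset", 8}};
  return I;
}

// Assumed "new" layout: same instruction with the vectype operand removed,
// so every later operand shifts down by one position.
ToyInstr makeNewLoad() {
  ToyInstr I;
  I.Operands = {{"dst", 10},   {"addrspace", 1}, {"width", 32},
                {"addr", 100}, {"offset", 8}};
  return I;
}

// Brittle: index 4 is only the address operand in the old layout.
constexpr std::size_t HardCodedAddrIdx = 4;

int addrByHardCodedIndex(const ToyInstr &I) {
  return I.Operands.at(HardCodedAddrIdx).Value;
}

// More robust: find the operand by name, independent of its position.
int addrByName(const ToyInstr &I) {
  auto It = std::find_if(
      I.Operands.begin(), I.Operands.end(),
      [](const ToyOperand &Op) { return Op.Name == "addr"; });
  assert(It != I.Operands.end() && "instruction has no addr operand");
  return It->Value;
}
```

With these toy layouts, the hard-coded index returns the address for the old instruction but the offset for the new one, with no crash to flag the mistake. That is why each such use site needs to be found and adjusted by hand.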

> Also, @AlexMaclean do you know if NVPTXForwardParams.cpp need to be updated to include the v8 opcodes?

I don't think that would make sense. This pass is about replacing `ld.local` with `ld.param`, so I think that with the current address-space restrictions it is impossible to ever encounter the v8 opcodes. If you can construct a test that demonstrates I'm incorrect, adding v8 sounds good to me.

https://github.com/llvm/llvm-project/pull/139292