[PATCH] D54042: [AMDGPU] Extend the SI Load/Store optimizer to combine more things.

Marek Olšák via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 2 13:39:51 PDT 2018


mareko added a comment.

I'm concerned that x8 and x16 loads will significantly increase SGPR usage, and therefore SGPR spilling. We have a shader database with over 70 games and benchmarks, and I expect the results will not be good after this is committed.

There is another case that can be optimized: loading {f32, f32, skip, f32} and {f32, skip, f32, f32}. Both can be done with a single x4 load, for scalar as well as vector instructions. The cost is one extra VGPR or SGPR for the skipped element. Also, register allocation may reuse that unused register immediately, which will cause an unnecessary s_waitcnt after the load and may hurt us.
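To make the pattern concrete, here is a rough sketch of the check I have in mind. It is not code from the patch or from SILoadStoreOptimizer; the names (X4Candidate, classifyX4Candidate) and the bitmask encoding are made up for illustration. Bit i of the mask means "dword i of the 4-dword window is read". A window qualifies when both end dwords are used and exactly one interior dword is skipped, which gives the two patterns above.

```cpp
// Hypothetical result type for this sketch; not an LLVM data structure.
struct X4Candidate {
    bool candidate;  // true if the window matches one of the one-gap patterns
    int extraRegs;   // registers loaded by the x4 load but never read
};

// neededMask: bit i set => dword i of a 4-dword window is actually used.
//   {f32, f32, skip, f32} -> 0b1011 (0xB)
//   {f32, skip, f32, f32} -> 0b1101 (0xD)
X4Candidate classifyX4Candidate(unsigned neededMask) {
    const unsigned kWindowMask = 0xF;

    // Anything outside the 4-dword window is out of scope for this sketch.
    if (neededMask & ~kWindowMask)
        return {false, 0};

    // If either end dword is unused, a narrower load already covers the
    // range, so widening to x4 buys nothing.
    const bool endsUsed = (neededMask & 0x1) && (neededMask & 0x8);

    int used = 0;
    for (int i = 0; i < 4; ++i)
        used += (neededMask >> i) & 1;

    // Exactly one skipped interior dword: one x4 load, one wasted register.
    if (endsUsed && used == 3)
        return {true, 1};

    return {false, 0};
}
```

With this encoding, classifyX4Candidate(0xB) and classifyX4Candidate(0xD) are the only inputs reported as candidates. The sketch deliberately says nothing about whether the widening is profitable: that depends on register pressure and on whether the wasted register gets reused before the load returns.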


Repository:
  rL LLVM

https://reviews.llvm.org/D54042




