[PATCH] D143731: [AMDGPU] Break-up large PHIs for DAGISel

Mon Feb 13 17:04:02 PST 2023

arsenm added a comment.

In D143731#4122195 <https://reviews.llvm.org/D143731#4122195>, @Pierre-vh wrote:

> I kept the threshold at 256 for now because lower values make the transform kick more often and it's not always clear to me whether it's a positive or negative thing. Some tests have more instructions, some have a bit less, etc.
> I think 256 is good enough to get started, we can always tweak it in a separate commit if benchmarking proves that it's useful. What do you think?
>
> Also the following tests are currently broken:
>
> - `test_mfma_loop_32xfloat` in `acc-ldst.ll`: We get accvgpr read/writes instead of a load to agpr directly.

Ideally this is the kind of case RegBankSelect is supposed to solve

> I think we need an additional heuristic but I'm not sure what. Maybe we should not transform when one of the value comes from a load (we know there won't be an undef so it should be good) or comes from the same block (loop)?
> I'm also not sure if extract/insert subvector with a constant operand (in the case of zeroinit) gets folded out; do I need to special-case it?

Really this is a register bank select problem. Fundamentally we always want 32-bit phis. Any regression from not using 32-bit phis is theoretically a failure with some other optimization

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143731/new/

https://reviews.llvm.org/D143731