[llvm] Add a threshold to RegStackify to avoid register spills at runtime (PR #97283)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 3 06:40:36 PDT 2024
yolanda15 wrote:
I tried enabling memory operands for the fused madd in V8. That partially reduces the register pressure, but it does not eliminate it for this specific case. One remaining issue seems hard for the runtime to resolve.
The original function from XNNPACK contains two stores in the loop, and both can generate deep stacks. The loads used by the 2nd store cannot be moved after the 1st store, but the fused madd instructions that consume them can. The 1st store then builds its own deep stack right before itself, between those loads and their madd users. This greatly extends the live ranges of the earlier loads, which now span the whole stack of the 1st store. I attached the IR dumped after RegisterStackify for reference: [reg-stackify-dwconv.txt](https://github.com/user-attachments/files/16085588/reg-stackify-dwconv.txt).
At runtime, we cannot move those loads across the 1st store to reduce the register pressure, because of the store's potential side effects.
Using a stackify threshold can resolve this issue, or at least shorten the live ranges of the distant loads. See the dump produced with a threshold: [reg-stackify-dwconv-threshold.txt](https://github.com/user-attachments/files/16085686/reg-stackify-dwconv-threshold.txt).
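To make the live-range effect concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not the code in this PR or anything V8 does: the two schedules, the value names, and the `max_live` helper are all hypothetical, and register pressure is approximated as the number of values that are live after each instruction.

```python
# Toy model: each instruction is (dest, operands); dest is None for stores.
# A value is live from its definition until its last use.

def max_live(schedule):
    """Return the peak number of live values, sampled after each instruction."""
    last_use = {}
    for i, (_, ops) in enumerate(schedule):
        for op in ops:
            last_use[op] = i

    live, peak = set(), 0
    for i, (dest, ops) in enumerate(schedule):
        for op in ops:
            if last_use[op] == i:   # operand dies at its last use
                live.discard(op)
        if dest is not None:
            live.add(dest)
        peak = max(peak, len(live))
    return peak

# Hypothetical tree feeding the 1st store.
store1_tree = [
    ("x0", []), ("x1", []),          # loads
    ("m", ["x0", "x1"]),             # madd
    (None, ["m"]),                   # 1st store
]

# Madds sunk next to the 2nd store; their loads stay before the 1st store.
sunk_schedule = [
    ("a0", []), ("b0", []), ("a1", []), ("b1", []),   # loads for 2nd store
    *store1_tree,
    ("t0", ["a0", "b0"]),
    ("t1", ["a1", "b1", "t0"]),
    (None, ["t1"]),                                    # 2nd store
]

# Madds kept near their loads (the effect a threshold is meant to have).
capped_schedule = [
    ("a0", []), ("b0", []), ("t0", ["a0", "b0"]),
    ("a1", []), ("b1", []), ("t1", ["a1", "b1", "t0"]),
    *store1_tree,
    (None, ["t1"]),                                    # 2nd store
]

print(max_live(sunk_schedule))    # 6
print(max_live(capped_schedule))  # 3
```

In the first schedule all four loads stay live across the 1st store's tree. In the second, only the single madd result `t1` does, which is the kind of shortening the threshold is aiming for.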
https://github.com/llvm/llvm-project/pull/97283