[llvm] [AArch64] Fix pairing different types of registers when computing CSRs. (PR #66642)

Zhaoxuan Jiang via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 27 01:21:14 PDT 2023


nocchijiang wrote:

@kyulee-com 

> Thanks for finding and fixing this issue! Can you share some numbers on why this specialization is worth rather than bailing out since it's actually trying to handle the case where CSRs are not pairable (although the helper function names/checks are named after a pair)? I also have a few questions on this approach:

First of all let me thank you for the great work on the `AArch64LowerHomogeneousPrologEpilog` pass! I tried to enable the pass for some of the top apps from our company and noticed that it works better with Swift code. I found that Swift compiler tends to generate more functions for the same binary size compared to Clang, thus more frame setup/destroy instructions are outlined. 3.4% binary size benefit was observed in one of the tested apps which adopts Swift most heavily (roughly half of the codebase is consisted of Swift code); the respective number for other apps with much less Swift code is about 1.4%.

However, the app instantly crashed when I opened it. The crash log indicated that there were illegal instructions in the outlined helper functions. After some debugging around `AArch64FrameLowering` and the pass itself, I found that the pass was always assuming the input registers paired, while the functions with a parameter pointing to `swifterror` have a chance to create the unpaired cases. Actually these cases are not rare: by bailing out the problematic cases I observed that 0.14% binary size benefit was lost from the heavily-adopt-Swift app. This number is decent enough that it worth a proper fix in my opinion.

Moreover, I think it is fair to call it a bug, which should be addressed, for `determineCalleeSaves` trying to pair registers from different classes.

> * How is CFI unwind info generated? Is it still valid? I'm not sure how it works with the compact unwind where CSRs are expected to be paired in order (without skipping one).

As long as the compact unwind info is not generated for the unpaired cases I think we are safe. This patch is not changing the behavior of unwind info generation. Correct me if i am wrong.

> * I saw `xzr` is handled at last after skipping a missing CSR. However, saving/restoring `xzr` is unnecessary, strictly speaking. Should we align the helper name that refelects `xzr` (although it's no-op) or should we use a single load/store as opposed to a pair load/store for the last element?

I wanted to put minimized effort to make the pass work for the unpaired cases, so I decided to keep the load/store pair form. If this introduces harmful side effect that I am not aware of, please let me know. I'm OK to add `XZR` into the name if you prefer the verbosity & accurate naming.

https://github.com/llvm/llvm-project/pull/66642


More information about the llvm-commits mailing list