[PATCH] D19005: CodeGen, AArch64, ARM, X86: Simplify SplitCSR

Tue Apr 12 12:53:45 PDT 2016

MatzeB added a comment.

In http://reviews.llvm.org/D19005#398802, @manmanren wrote:

> Some general comments:
>  Is this NFC (no functionality change)? I noticed a small testing case change.

It changes the order in which the physregs are copied to vregs in the entry block. We are unlucky in the testcase: the register allocator chooses registers in a way that forms fewer stmia instructions, so the code gets slightly bigger and we need a wide jump.
I can make this a true NFC, but that would mean changing the code to add some registers in reverse order without any good explanation of why...

> I think the original code tries to make it easy for a target to start supporting splitCSR, or for a calling convention to start using splitCSR.

>  With this updated approach, is it still easy to do so? I didn't really look through the patch to figure out how :]

The main motivation for the change is that the decision on whether to use SplitCSR had to be made before any other SelectionDAG code ran. This meant that no CCState object was available, and you cannot create one because ISD::InputArgs is not available at that point. So you had to make that decision purely on the IR function. The commit depending on this wants to use SplitCSR in all cases where a parameter is passed in a callee-saved register. Without CCState available you can only match on calling convention IDs, and you would need to hardcode knowledge about how registers will be chosen later based on that; by doing it later you can just use the information that is already computed.

To use SplitCSR with this change you basically decide whether you need it in LowerFormals() and, if so, call addLiveIn() for every register handled this way; you also alter LowerReturn() to copy the register back. I found this easier to understand, as it closely follows the logic that you copy the register to a vreg in the entry block and copy it back in the return block. LowerFormals() and LowerReturn() should already be familiar to a person implementing the target. The problem I had with a series of target callbacks is that to understand the program flow you need to read the generic SelectionDAG code to find out when those callbacks are called. The fact that the new code is 169 lines shorter is also an indication that it is simpler.
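
To make the flow concrete, here is a rough, self-contained toy model of the idea. It is not code from the patch and does not use LLVM classes: ToyFunction, Copy, lowerFormals() and lowerReturn() are hypothetical stand-ins, the register names are made up, and the rule "split every callee-saved register as soon as one argument arrives in one" is an assumption chosen only to keep the sketch short. It just shows the shape: decide during formal-argument lowering using the already-assigned argument registers, copy each physreg to a vreg in the entry block, and copy it back in the return block.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM classes.
struct Copy {
  std::string dst;
  std::string src;
};

struct ToyFunction {
  std::vector<std::string> liveIns;  // physregs marked live-in
  std::vector<Copy> entryCopies;     // vreg = COPY physreg (entry block)
  std::vector<Copy> returnCopies;    // physreg = COPY vreg (return block)
  std::vector<std::pair<std::string, std::string>> split;  // (physreg, vreg)

  // Hypothetical helper: mark physReg live-in and copy it to a fresh vreg.
  std::string addLiveIn(const std::string &physReg) {
    std::string vreg = "%v" + std::to_string(split.size());
    liveIns.push_back(physReg);
    entryCopies.push_back({vreg, physReg});
    split.emplace_back(physReg, vreg);
    return vreg;
  }
};

// Decide here, where the assigned argument registers are already known,
// instead of guessing from the calling convention ID up front.
bool lowerFormals(ToyFunction &F, const std::vector<std::string> &argRegs,
                  const std::vector<std::string> &calleeSaved) {
  bool needSplit = std::any_of(
      argRegs.begin(), argRegs.end(), [&](const std::string &reg) {
        return std::find(calleeSaved.begin(), calleeSaved.end(), reg) !=
               calleeSaved.end();
      });
  if (!needSplit)
    return false;
  // Assumed policy for this sketch: handle every callee-saved register.
  for (const std::string &reg : calleeSaved)
    F.addLiveIn(reg);
  return true;
}

// Mirror of the entry block: copy each vreg back into its physreg.
void lowerReturn(ToyFunction &F) {
  assert(F.entryCopies.size() == F.split.size());
  for (const auto &entry : F.split)
    F.returnCopies.push_back({entry.first, entry.second});
}
```

Calling lowerFormals() with argument registers {"r0", "r4"} and callee-saved registers {"r4", "r5"} records two entry copies, and lowerReturn() then records the two matching copies back. With only {"r0"} as an argument register nothing is recorded at all.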

Repository:
  rL LLVM

http://reviews.llvm.org/D19005