[PATCH] Added a new register class for Thumb PC-rel loads
Jim Grosbach
grosbach at apple.com
Tue Aug 27 12:12:37 PDT 2013
Nit:
+def jtGPR : RegisterClass<"ARM", [i32], 32, (add (sequence "R%u", 0, 12))>;
You can define this like rGPR does, in terms of GPR (or rGPR) and a “sub” operator to get a bit more clarity about what’s going on. Something like:
def jtGPR : RegisterClass<"ARM", [i32], 32, (sub rGPR, LR)>;
Also please add a comment on the register class describing its purpose. The mispredict issue, as opposed to a correctness issue, is a bit subtle and if it’s not explicitly documented, it’ll be quickly forgotten why this register class is there in the first place.
-Jim
On Aug 27, 2013, at 3:17 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Hi Daniel,
>
> The patch looks good to me, though I'd make sure Evan agrees before committing.
>
> Would also be nice to get a test, if possible.
>
> cheers,
> --renato
>
>
> On 14 August 2013 16:26, Daniel Stewart <stewartd at codeaurora.org> wrote:
> Just wanted to bump this request again to see if I can get this patch reviewed. The explanation of the patch is below.
>
>
>
> Thank you,
>
> Daniel Stewart
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by The Linux Foundation
>
>
>
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Daniel Stewart
> Sent: Friday, August 09, 2013 10:24 AM
> To: 'Evan Cheng'
> Cc: 'Commits'
> Subject: RE: [PATCH] Added a new register class for Thumb PC-rel loads
>
>
>
> Yes, let me expand a bit on the patch. LLVM currently allows Thumb2 code to generate jump tables which use all rGPRs, including the link register. So it is possible to get code that looks like the following (taken from an actual compilation). This code is employing a jump table, where the jump is performed by the mov pc, lr (highlighted in red below).
>
>
>
> MLA r1,r4,r0,r9
>
> ADDS r2,r4,#1
>
> CMP r0,#8
>
> LDRSH lr,[r3,r1,LSL #1]
>
> LDRSH r6,[r11],#2
>
> SMLABB r8,lr,r6,r8
>
> BCC.W 0x9e82 ; matrix_mul_matrix + 442
>
> B 0x9dfe ; matrix_mul_matrix + 310
>
> ADR r2,{pc}+0x28 ; 0x9d7a
>
> ADD r1,r2,r10,LSL #2
>
> LDR r11,[sp,#0x20]
>
> STR r1,[sp,#0x18]
>
> MOVS r1,#0
>
> LDR lr,[sp,#0x18]
>
> STR r1,[sp,#0x24]
>
> MOV r8,#0
>
> MOVS r7,#0
>
> MOVS r5,#0
>
> MOVS r6,#0
>
> MOVS r1,#0
>
> MOVS r4,#0
>
> MOVS r2,#0
>
> MOV pc,lr
>
> B.W 0x9dfe ; matrix_mul_matrix + 310
>
> B.W 0x9dfc ; matrix_mul_matrix + 308
>
> B.W 0x9de8 ; matrix_mul_matrix + 288
>
> B.W 0x9dd6 ; matrix_mul_matrix + 270
>
> B.W 0x9dc4 ; matrix_mul_matrix + 252
>
> B.W 0x9db2 ; matrix_mul_matrix + 234
>
> B.W 0x9d96 ; matrix_mul_matrix + 206
>
> LDR r4,[sp,#0x24]
>
> RSB r10,r4,#0
>
> AND r5,r10,r0
>
> ADD r1,r5,r9
>
> LDRSH r2,[r3,r1,LSL #1]
>
>
>
> The mov pc, lr in this code causes the processor to believe it will be returning from the function it is in, rather than simply jumping within the function. The processor then mispredicts, causing a flush to occur. This patch seeks to remedy this situation be not allowing the lr register to be used as the index in a jump table for Thumb2 code. Therefore a mov pc, lr will not be generated for this case.
>
>
>
> The other part of this patch is fixing a hard-coded reference to a dynamically-generated variable. I discussed this issue earlier on the LLVMDev thread (topic line - [LLVMdev] Adding a new ARM RegisterClass). This fix is necessary for this particular patch because adding a new register class causes the dynamically generated classes used for GPRPairs to change. LLVM tries to find the correct intersections it needs to cover the space, and if a register class already covers the space, it won’t create a new class. In this example jtGPR covers the same space as rGPR from the standpoint of what is needed for the GPRPairs, as since jtGPR comes before rGPR alphabetically, the hard-coded reference to GPRPair with gsub_1 in rGPR is no longer created. In general, it seems preferable not to use hard-coded references to dynamically generated variables.
>
>
>
> Daniel
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by The Linux Foundation
>
>
>
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Thursday, August 08, 2013 7:06 PM
> To: Daniel Stewart
> Cc: llvm-commits at cs.uiuc.edu; Commits
> Subject: Re: [PATCH] Added a new register class for Thumb PC-rel loads
>
>
>
> It would make it easier to review if you can illustrate the change with some assembly code sequence.
>
>
>
> Evan
>
>
>
> On Aug 8, 2013, at 7:30 AM, Daniel Stewart <stewartd at codeaurora.org> wrote:
>
>
>
> This patch is for allowing a jump table (in Thumb mode) to not use the LR register, which can branch mispredicts during a mov lr, pc. If it could be reviewed, I’d appreciate it.
>
>
>
> This patch does the following:
>
>
>
> - Query for the GPRPair with gsub_1 in rGPR class.
>
> This replaces the hard-coded reference
>
>
>
> - Add jtGPR register class to be used by load PC relative
>
> address for jump table that avoids the use of LR register.
>
> This prevents the codegen from generating mov pc, lr
>
> which causes the target to think it is returning
>
> from a function and predict that the next address will
>
> be in the calling function.
>
>
>
> Daniel Stewart
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by The Linux Foundation
>
> <0001-Added-register-class-with-no-LR-used-by-PC-rel-load.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130827/cf91c442/attachment.html>
More information about the llvm-commits
mailing list