[PATCH] Added a new register class for Thumb PC-rel loads

Eric Christopher echristo at gmail.com
Fri Aug 30 13:27:09 PDT 2013


No testcase?

-eric

On Fri, Aug 30, 2013 at 1:14 PM, Daniel Stewart <stewartd at codeaurora.org> wrote:
> I updated the patch per Jim’s suggestions.
>
>
>
> If this is acceptable, could someone commit it for me, as I do not have
> permissions.
>
>
>
> Thanks!
>
>
>
> Daniel
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
> The Linux Foundation
>
>
>
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Tuesday, August 27, 2013 4:21 PM
> To: Jim Grosbach
> Cc: Renato Golin; Daniel Stewart; Commits
>
>
> Subject: Re: [PATCH] Added a new register class for Thumb PC-rel loads
>
>
>
> LGTM after fix Jim's concerns.
>
>
>
> Evan
>
> On Aug 27, 2013, at 12:12 PM, Jim Grosbach <grosbach at apple.com> wrote:
>
>
>
> Nit:
>
>
>
> +def jtGPR : RegisterClass<"ARM", [i32], 32, (add (sequence "R%u", 0, 12))>;
>
>
>
> You can define this like rGPR does, in terms of GPR (or rGPR) and a “sub”
> operator to get a bit more clarity about what’s going on. Something like:
>
>
>
> def jtGPR : RegisterClass<"ARM", [i32], 32, (sub rGPR, LR)>;
>
>
>
> Also please add a comment on the register class describing its purpose. The
> mispredict issue, as opposed to a correctness issue, is a bit subtle and if
> it’s not explicitly documented, it’ll be quickly forgotten why this register
> class is there in the first place.
>
>
>
> -Jim
>
>
>
> On Aug 27, 2013, at 3:17 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>
>
> Hi Daniel,
>
>
>
> The patch looks good to me, though I'd make sure Evan agrees before
> committing.
>
>
>
> Would also be nice to get a test, if possible.
>
>
>
> cheers,
>
> --renato
>
>
>
> On 14 August 2013 16:26, Daniel Stewart <stewartd at codeaurora.org> wrote:
>
> Just wanted to bump this request again to see if I can get this patch
> reviewed. The explanation of the patch is below.
>
>
>
> Thank you,
>
> Daniel Stewart
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
> The Linux Foundation
>
>
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Daniel Stewart
> Sent: Friday, August 09, 2013 10:24 AM
> To: 'Evan Cheng'
> Cc: 'Commits'
> Subject: RE: [PATCH] Added a new register class for Thumb PC-rel loads
>
>
>
> Yes, let me expand a bit on the patch. LLVM currently allows Thumb2 code to
> generate jump tables which use all rGPRs, including the link register. So it
> is possible to get code that looks like the following (taken from an actual
> compilation). This code is employing a jump table, where the jump is
> performed by the mov pc, lr (highlighted in red below).
>
>
>
> MLA      r1,r4,r0,r9
>
> ADDS     r2,r4,#1
>
> CMP      r0,#8
>
> LDRSH    lr,[r3,r1,LSL #1]
>
> LDRSH    r6,[r11],#2
>
> SMLABB   r8,lr,r6,r8
>
> BCC.W    0x9e82 ; matrix_mul_matrix + 442
>
> B        0x9dfe ; matrix_mul_matrix + 310
>
> ADR      r2,{pc}+0x28 ; 0x9d7a
>
> ADD      r1,r2,r10,LSL #2
>
> LDR      r11,[sp,#0x20]
>
> STR      r1,[sp,#0x18]
>
> MOVS     r1,#0
>
> LDR      lr,[sp,#0x18]
>
> STR      r1,[sp,#0x24]
>
> MOV      r8,#0
>
> MOVS     r7,#0
>
> MOVS     r5,#0
>
> MOVS     r6,#0
>
> MOVS     r1,#0
>
> MOVS     r4,#0
>
> MOVS     r2,#0
>
> MOV      pc,lr
>
> B.W      0x9dfe ; matrix_mul_matrix + 310
>
> B.W      0x9dfc ; matrix_mul_matrix + 308
>
> B.W      0x9de8 ; matrix_mul_matrix + 288
>
> B.W      0x9dd6 ; matrix_mul_matrix + 270
>
> B.W      0x9dc4 ; matrix_mul_matrix + 252
>
> B.W      0x9db2 ; matrix_mul_matrix + 234
>
> B.W      0x9d96 ; matrix_mul_matrix + 206
>
> LDR      r4,[sp,#0x24]
>
> RSB      r10,r4,#0
>
> AND      r5,r10,r0
>
> ADD      r1,r5,r9
>
> LDRSH    r2,[r3,r1,LSL #1]
>
>
>
> The mov pc, lr in this code causes the processor to believe it will be
> returning from the function it is in, rather than simply jumping within the
> function. The processor then mispredicts, causing a flush to occur. This
> patch seeks to remedy this situation be not allowing the lr register to be
> used as the index in a jump table for Thumb2 code. Therefore a mov pc, lr
> will not be generated for this case.
>
>
>
> The other part of this patch is fixing a hard-coded reference to a
> dynamically-generated variable. I discussed this issue earlier on the
> LLVMDev thread (topic line - [LLVMdev] Adding a new ARM RegisterClass). This
> fix is necessary for this particular patch because adding a new register
> class causes the dynamically generated classes used for GPRPairs to change.
> LLVM tries to find the correct intersections it needs to cover the space,
> and if a register class already covers the space, it won’t create a new
> class. In this example jtGPR covers the same space as rGPR from the
> standpoint of what is needed for the GPRPairs, as since jtGPR comes before
> rGPR alphabetically, the hard-coded reference to GPRPair with gsub_1 in rGPR
> is no longer created. In general, it seems preferable not to use hard-coded
> references to dynamically generated variables.
>
>
>
> Daniel
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
> The Linux Foundation
>
>
>
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Thursday, August 08, 2013 7:06 PM
> To: Daniel Stewart
> Cc: llvm-commits at cs.uiuc.edu; Commits
> Subject: Re: [PATCH] Added a new register class for Thumb PC-rel loads
>
>
>
> It would make it easier to review if you can illustrate the change with some
> assembly code sequence.
>
>
>
> Evan
>
>
>
> On Aug 8, 2013, at 7:30 AM, Daniel Stewart <stewartd at codeaurora.org> wrote:
>
>
>
> This patch is for allowing a jump table (in Thumb mode) to not use the LR
> register, which can branch mispredicts during a mov lr, pc. If it could be
> reviewed, I’d appreciate it.
>
>
>
> This patch does the following:
>
>
>
> - Query for the GPRPair with gsub_1 in rGPR class.
>
> This replaces the hard-coded reference
>
>
>
> - Add jtGPR register class to be used by load PC relative
>
> address for jump table that avoids the use of LR register.
>
> This prevents the codegen from generating mov pc, lr
>
> which causes the target to think it is returning
>
> from a function and predict that the next address will
>
> be in the calling function.
>
>
>
> Daniel Stewart
>
> --
>
> Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
> The Linux Foundation
>
> <0001-Added-register-class-with-no-LR-used-by-PC-rel-load.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>




More information about the llvm-commits mailing list