[PATCH] Added a new register class for Thumb PC-rel loads

Daniel Stewart stewartd at codeaurora.org
Wed Aug 14 08:26:57 PDT 2013


Just wanted to bump this request again to see if I can get this patch
reviewed. The explanation of the patch is below.

 

Thank you,

Daniel Stewart

--

Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
The Linux Foundation

 

From: llvm-commits-bounces at cs.uiuc.edu
[mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Daniel Stewart
Sent: Friday, August 09, 2013 10:24 AM
To: 'Evan Cheng'
Cc: 'Commits'
Subject: RE: [PATCH] Added a new register class for Thumb PC-rel loads

 

Yes, let me expand a bit on the patch. LLVM currently allows Thumb2 code to
generate jump tables which use all rGPRs, including the link register. So it
is possible to get code that looks like the following (taken from an actual
compilation). This code is employing a jump table, where the jump is
performed by the mov pc, lr (highlighted in red below).

 

MLA      r1,r4,r0,r9

ADDS     r2,r4,#1

CMP      r0,#8

LDRSH    lr,[r3,r1,LSL #1]

LDRSH    r6,[r11],#2

SMLABB   r8,lr,r6,r8

BCC.W    0x9e82 ; matrix_mul_matrix + 442

B        0x9dfe ; matrix_mul_matrix + 310

ADR      r2,{pc}+0x28 ; 0x9d7a

ADD      r1,r2,r10,LSL #2

LDR      r11,[sp,#0x20]

STR      r1,[sp,#0x18]

MOVS     r1,#0

LDR      lr,[sp,#0x18]

STR      r1,[sp,#0x24]

MOV      r8,#0

MOVS     r7,#0

MOVS     r5,#0

MOVS     r6,#0

MOVS     r1,#0

MOVS     r4,#0

MOVS     r2,#0

MOV      pc,lr

B.W      0x9dfe ; matrix_mul_matrix + 310

B.W      0x9dfc ; matrix_mul_matrix + 308

B.W      0x9de8 ; matrix_mul_matrix + 288

B.W      0x9dd6 ; matrix_mul_matrix + 270

B.W      0x9dc4 ; matrix_mul_matrix + 252

B.W      0x9db2 ; matrix_mul_matrix + 234

B.W      0x9d96 ; matrix_mul_matrix + 206

LDR      r4,[sp,#0x24]

RSB      r10,r4,#0

AND      r5,r10,r0

ADD      r1,r5,r9

LDRSH    r2,[r3,r1,LSL #1]

 

The mov pc, lr in this code causes the processor to believe it will be
returning from the function it is in, rather than simply jumping within the
function. The processor then mispredicts, causing a flush to occur. This
patch seeks to remedy this situation be not allowing the lr register to be
used as the index in a jump table for Thumb2 code. Therefore a mov pc, lr
will not be generated for this case.

 

The other part of this patch is fixing a hard-coded reference to a
dynamically-generated variable. I discussed this issue earlier on the
LLVMDev thread (topic line - [LLVMdev] Adding a new ARM RegisterClass). This
fix is necessary for this particular patch because adding a new register
class causes the dynamically generated classes used for GPRPairs to change.
LLVM tries to find the correct intersections it needs to cover the space,
and if a register class already covers the space, it won't create a new
class. In this example jtGPR covers the same space as rGPR from the
standpoint of what is needed for the GPRPairs, as since jtGPR comes before
rGPR alphabetically, the hard-coded reference to GPRPair with gsub_1 in rGPR
is no longer created. In general, it seems preferable not to use hard-coded
references to dynamically generated variables. 

 

Daniel

--

Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
The Linux Foundation

 

 

From: Evan Cheng [mailto:evan.cheng at apple.com] 
Sent: Thursday, August 08, 2013 7:06 PM
To: Daniel Stewart
Cc: llvm-commits at cs.uiuc.edu; Commits
Subject: Re: [PATCH] Added a new register class for Thumb PC-rel loads

 

It would make it easier to review if you can illustrate the change with some
assembly code sequence.

 

Evan

 

On Aug 8, 2013, at 7:30 AM, Daniel Stewart <stewartd at codeaurora.org> wrote:

 

This patch is for allowing a jump table (in Thumb mode) to not use the LR
register, which can branch mispredicts during a mov lr, pc. If it could be
reviewed, I'd appreciate it.

 

This patch does the following:

 

- Query for the GPRPair with gsub_1 in rGPR class.

This replaces the hard-coded reference

 

- Add jtGPR register class to be used by load PC relative

address for jump table that avoids the use of LR register.

This prevents the codegen from generating mov pc, lr

which causes the target to think it is returning

from a function and predict that the next address will

be in the calling function.

 

Daniel Stewart

--

Qualcomm Innovation Center, Inc is a member of Code Aurora Forum, hosted by
The Linux Foundation

<0001-Added-register-class-with-no-LR-used-by-PC-rel-load.patch>____________
___________________________________
llvm-commits mailing list
 <mailto:llvm-commits at cs.uiuc.edu> llvm-commits at cs.uiuc.edu
 <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130814/e1c80d7c/attachment.html>


More information about the llvm-commits mailing list