<div dir="ltr"><div>This cuts the revert down to just the 3 affected files, if that is more acceptable.</div><div><br></div><div>Thanks,</div><div>Steve</div><div><br></div><div>From 0c86082f6c3010da95176d5ed1be8921952fff04 Mon Sep 17 00:00:00 2001<br>
</div><div><div>From: Stephen Hines <<a href="mailto:srhines@google.com">srhines@google.com</a>></div><div>Date: Thu, 17 Jul 2014 00:48:53 -0700</div><div>Subject: [PATCH] Revert Thumb-2 conversion of some ARM builtins.</div>
<div><br></div><div>The udivmodsi4/modsi3/umodsi3 code computes jump targets based on ARM encodings</div><div>(if CLZ is present and IDIV is not present).</div><div><br></div><div>Reverts parts of r211032 and r211035.</div>
<div>---</div><div> lib/builtins/arm/udivmodsi4.S | 16 +++-------------</div><div> lib/builtins/arm/udivsi3.S    | 21 +++------------------</div><div> lib/builtins/arm/umodsi3.S    | 18 ++----------------</div><div> 3 files changed, 8 insertions(+), 47 deletions(-)</div>
<div><br></div><div>diff --git a/lib/builtins/arm/udivmodsi4.S b/lib/builtins/arm/udivmodsi4.S</div><div>index 5b4e1bc..ddc8752 100644</div><div>--- a/lib/builtins/arm/udivmodsi4.S</div><div>+++ b/lib/builtins/arm/udivmodsi4.S</div>
<div>@@ -16,9 +16,6 @@</div><div> </div><div> <span class="" style="white-space:pre"> </span>.syntax unified</div><div> <span class="" style="white-space:pre">  </span>.text</div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>
-<span class="" style="white-space:pre">      </span>.thumb</div><div>-#endif</div><div> </div><div> <span class="" style="white-space:pre">    </span>.p2align 2</div><div> DEFINE_COMPILERRT_FUNCTION(__udivmodsi4)</div><div>@@ -99,16 +96,9 @@ DEFINE_COMPILERRT_FUNCTION(__udivmodsi4)</div>
<div> </div><div> #define<span class="" style="white-space:pre">  </span>IMM<span class="" style="white-space:pre">       </span>#</div><div> </div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>-#define ITT itt</div><div>-#else</div>
<div>-#define ITT @</div><div>-#endif</div><div>-</div><div>-#define block(shift)                                                           \</div><div>-<span class="" style="white-space:pre">   </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift;                                         \</div>
<div>-<span class="" style="white-space:pre">   </span>ITT hs;                                                                \</div><div>-<span class="" style="white-space:pre">  </span>addhs<span class="" style="white-space:pre">     </span>r3, r3, IMM (1 << shift);                                      \</div>
<div>+#define block(shift) \</div><div>+<span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift; \</div><div>+<span class="" style="white-space:pre">  </span>addhs<span class="" style="white-space:pre">     </span>r3, r3, IMM (1 << shift); \</div>
<div> <span class="" style="white-space:pre">  </span>subhs<span class="" style="white-space:pre">     </span>r0, r0, r1, lsl IMM shift</div><div> </div><div> <span class="" style="white-space:pre">       </span>block(31)</div><div>
diff --git a/lib/builtins/arm/udivsi3.S b/lib/builtins/arm/udivsi3.S</div><div>index 8695257..8fb1dca 100644</div><div>--- a/lib/builtins/arm/udivsi3.S</div><div>+++ b/lib/builtins/arm/udivsi3.S</div><div>@@ -16,17 +16,6 @@</div>
<div> </div><div> <span class="" style="white-space:pre"> </span>.syntax unified</div><div> <span class="" style="white-space:pre">  </span>.text</div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>-<span class="" style="white-space:pre">        </span>.thumb</div>
<div>-#endif</div><div>-</div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>-#define IT  it</div><div>-#define ITT itt</div><div>-#else</div><div>-#define IT  @</div><div>-#define ITT @</div><div>-#endif</div><div> </div>
<div> <span class="" style="white-space:pre">  </span>.p2align 2</div><div> DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_uidiv, __udivsi3)</div><div>@@ -40,12 +29,9 @@ DEFINE_COMPILERRT_FUNCTION(__udivsi3)</div><div> <span class="" style="white-space:pre">  </span>bx  <span class="" style="white-space:pre">     </span>lr</div>
<div> #else</div><div> <span class="" style="white-space:pre">    </span>cmp<span class="" style="white-space:pre">       </span>r1, #1</div><div>-<span class="" style="white-space:pre">    </span>IT cc</div><div> <span class="" style="white-space:pre">    </span>bcc<span class="" style="white-space:pre">       </span>LOCAL_LABEL(divby0)</div>
<div>-<span class="" style="white-space:pre">   </span>IT eq</div><div> <span class="" style="white-space:pre">    </span>JMPc(lr, eq)</div><div> <span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1</div>
<div>-<span class="" style="white-space:pre">   </span>ITT cc</div><div> <span class="" style="white-space:pre">   </span>movcc<span class="" style="white-space:pre">     </span>r0, #0</div><div> <span class="" style="white-space:pre">   </span>JMPc(lr, cc)</div>
<div> <span class="" style="white-space:pre">  </span>/*</div><div>@@ -108,10 +94,9 @@ DEFINE_COMPILERRT_FUNCTION(__udivsi3)</div><div> </div><div> #define<span class="" style="white-space:pre">       </span>IMM<span class="" style="white-space:pre">       </span>#</div>
<div> </div><div>-#define block(shift)                                                           \</div><div>-<span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift;                                         \</div>
<div>-<span class="" style="white-space:pre">   </span>ITT hs;                                                                \</div><div>-<span class="" style="white-space:pre">  </span>addhs<span class="" style="white-space:pre">     </span>r3, r3, IMM (1 << shift);                                      \</div>
<div>+#define block(shift) \</div><div>+<span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift; \</div><div>+<span class="" style="white-space:pre">  </span>addhs<span class="" style="white-space:pre">     </span>r3, r3, IMM (1 << shift); \</div>
<div> <span class="" style="white-space:pre">  </span>subhs<span class="" style="white-space:pre">     </span>r0, r0, r1, lsl IMM shift</div><div> </div><div> <span class="" style="white-space:pre">       </span>block(31)</div><div>
diff --git a/lib/builtins/arm/umodsi3.S b/lib/builtins/arm/umodsi3.S</div><div>index c690df3..164646b 100644</div><div>--- a/lib/builtins/arm/umodsi3.S</div><div>+++ b/lib/builtins/arm/umodsi3.S</div><div>@@ -16,17 +16,6 @@</div>
<div> </div><div> <span class="" style="white-space:pre"> </span>.syntax unified</div><div> <span class="" style="white-space:pre">  </span>.text</div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>-<span class="" style="white-space:pre">        </span>.thumb</div>
<div>-#endif</div><div>-</div><div>-#if __ARM_ARCH_ISA_THUMB == 2</div><div>-#define IT  it</div><div>-#define ITT itt</div><div>-#else</div><div>-#define IT  @</div><div>-#define ITT @</div><div>-#endif</div><div> </div>
<div> <span class="" style="white-space:pre">  </span>.p2align 2</div><div> DEFINE_COMPILERRT_FUNCTION(__umodsi3)</div><div>@@ -41,11 +30,9 @@ DEFINE_COMPILERRT_FUNCTION(__umodsi3)</div><div> #else</div><div> <span class="" style="white-space:pre">    </span>cmp<span class="" style="white-space:pre">       </span>r1, #1</div>
<div> <span class="" style="white-space:pre">  </span>bcc<span class="" style="white-space:pre">       </span>LOCAL_LABEL(divby0)</div><div>-<span class="" style="white-space:pre">       </span>ITT eq</div><div> <span class="" style="white-space:pre">   </span>moveq<span class="" style="white-space:pre">     </span>r0, #0</div>
<div> <span class="" style="white-space:pre">  </span>JMPc(lr, eq)</div><div> <span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1</div><div>-<span class="" style="white-space:pre">    </span>IT cc</div>
<div> <span class="" style="white-space:pre">  </span>JMPc(lr, cc)</div><div> <span class="" style="white-space:pre">     </span>/*</div><div> <span class="" style="white-space:pre">       </span> * Implement division using binary long division algorithm.</div>
<div>@@ -103,9 +90,8 @@ DEFINE_COMPILERRT_FUNCTION(__umodsi3)</div><div> </div><div> #define<span class="" style="white-space:pre">   </span>IMM<span class="" style="white-space:pre">       </span>#</div><div> </div><div>-#define block(shift)                                                           \</div>
<div>-<span class="" style="white-space:pre">   </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift;                                         \</div><div>-<span class="" style="white-space:pre">      </span>IT hs;                                                                 \</div>
<div>+#define block(shift) \</div><div>+<span class="" style="white-space:pre">     </span>cmp<span class="" style="white-space:pre">       </span>r0, r1, lsl IMM shift; \</div><div> <span class="" style="white-space:pre"> </span>subhs<span class="" style="white-space:pre">     </span>r0, r0, r1, lsl IMM shift</div>
<div> </div><div> <span class="" style="white-space:pre"> </span>block(31)</div><div>-- </div><div>2.0.0.526.g5318336</div></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Thu, Jul 17, 2014 at 1:12 AM, Stephen Hines <span dir="ltr"><<a href="mailto:srhines@google.com" target="_blank">srhines@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Many of the files in this change use computed jumps and it seems like there were no tests checked in to verify the new Thumb-2 behavior. The original change also claims that the "code remains compatible with older CPUs with no adverse effects". I dispute that there are no adverse effects here, since the Thumb-2 predicated instructions certainly have different timings/behavior than their equivalent ARM counterparts.<div>

<br></div><div>Steve</div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jul 17, 2014 at 1:02 AM, Tim Northover <span dir="ltr"><<a href="mailto:t.p.northover@gmail.com" target="_blank">t.p.northover@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Stephen,<br>
<div><br>
<br>
> The umodsi3 code computes jump targets based on ARM encodings (if CLZ is<br>
> present and IDIV is not present).<br>
<br>
</div>Reverting everything seems a little extreme if only a few functions<br>
are computing jumps like this. Can't we be a bit more targeted?<br>
<br>
Cheers.<br>
<span><font color="#888888"><br>
Tim.<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>