[PATCH] D23933: [XRay] ARM 32-bit no-Thumb support in compiler-rt

Renato Golin via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 30 06:38:06 PDT 2016

rengolin requested changes to this revision.
rengolin added a comment.
This revision now requires changes to proceed.


I have a number of comments and requests. One general remark, also, is to comment on the top of the assembly functions what's the function signature in C, so that I know how to review the function's code. Otherwise, it's very hard to understand all possibilities.


Comment at: lib/sanitizer_common/scripts/gen_dynamic_list.py:54
@@ -52,3 +53,3 @@
   # On PowerPC, nm prints function descriptors from .data section.
-  if os.uname()[4] in ["powerpc", "ppc64"]:
+  if platform.uname()[4] in ["powerpc", "ppc64"]:
     func_symbols += ['D']
Have you tested this on Linux and Mac? To make sure it also work there?

Comment at: lib/xray/xray_arm.cc:22
@@ +21,3 @@
+// basis. See xray_trampoline_*.s files for implementations.
+extern void __xray_FunctionEntry();
+extern void __xray_FunctionExit();
Can't you define this on a top-level header and implement on an arch-specific cpp file?

I don't think these things should be changing between arches.

Comment at: lib/xray/xray_arm.cc:38
@@ +37,3 @@
+// 0xUUUUWXYZ -> 0x000W0XYZ
+inline static uint32_t GetMovwMask(const uint32_t Value) {
+  return (Value & 0xfff) | ((Value & 0xf000) << 4);
dberris wrote:
> Shouldn't these functions be camelCase? As in `getMoveMask(...)` according to the guide? I understand you're just following the conventions of the files around, and those mistakes are mine -- but do you mind changing them before landing?
Yes, please. Let's not add different styles if we don't have to. Camel case, caps for variables, no caps for functions.

Enum values have format INI_Name, with "INI" the initials of the enum's name, all caps, and the Name a unique identifier within the enum.

Comment at: lib/xray/xray_arm.cc:50
@@ +49,3 @@
+//   MOVT r0, #<higher 16 bits of the |Value|>
+inline static uint32_t *Write32bitLoadR0(uint32_t *Address,
+                                         const uint32_t Value) {
Why haven't use used inline assembly, here? This is really unreadable and error prone.

Comment at: lib/xray/xray_arm.cc:110
@@ +109,3 @@
+    CurAddress++;
+    *CurAddress = uint32_t(PatchOpcodes::PO_Pop_r0_lr);
+    std::atomic_store_explicit(
All modern ARM CPUs are multi-issue, out-of-order, so you cannot guarantee ordering without a data/memory barrier. ARMv8 has better atomic support.

Comment at: lib/xray/xray_inmemory_log.cc:30
@@ +29,3 @@
+#elif defined(__arm__)
+  static const int64_t cNanosecondsPerSecond = 1000LL*1000*1000;
dberris wrote:
> `static constexpr` instead?
> Also, please follow the naming conventions for this one too.
> Another thing -- couldn't you just use std::chrono for the constant here?
> http://en.cppreference.com/w/cpp/chrono/duration
Better still, get the proper value?

I know it can be a tad different from hardware to hardware, but not even trying isn't really helpful.

Comment at: lib/xray/xray_inmemory_log.cc:70
@@ -62,2 +69,3 @@
+#if !defined(__arm__)
 static std::pair<ssize_t, bool> retryingReadSome(int Fd, char *Begin,
I'd use #defined __x86_64__ instead, and replicate to all arches it's supposed to work later.

We don't want broken fall-back logic, as it's really hard to find bugs later.

Comment at: lib/xray/xray_inmemory_log.cc:188
@@ +187,3 @@
+#elif defined(__arm__)
+    CPUFrequency = cNanosecondsPerSecond;
I don't understand this... Is this just hard-coding to 1GHz?


Also works on most ARM and AArch64 boards.

Comment at: lib/xray/xray_trampoline_arm.S:1
@@ +1,2 @@
+    .syntax unified
+	.code 32
You're going to need more than that.

Assemblers are very picky on what's valid, and Clang is specially so.

You'll need to put the minimum requirements on the header (cpu, fpu, arch, thumb-interop, etc).

You'll also need to put ".arch" to extend support on functions that use new instructions where the header flags don't support.


    .syntax unifixed
    .arch   armv6t2
    .fpu    vfpv2
    .fpu vfpv3
    VMOV ...

This will mean you can use this on v6T2 onward, and that "v7_only_func" can only be used by arches with vfpv3, guaranteed by the dynamic dispatch. See libunwind and other compiler-RT ARM functions for this behaviour.

Comment at: lib/xray/xray_trampoline_arm.S:6
@@ +5,3 @@
+    // It preserves all registers except r0, r12(ip), r14(lr) and r15(pc)
+    // Assume that "q" part of the floating-point registers is not used in C/C++
+	.global __xray_FunctionEntry // let C/C++ see the symbol
Why not? Vectorizers can and do use Q regs...

Comment at: lib/xray/xray_trampoline_arm.S:7
@@ +6,3 @@
+    // Assume that "q" part of the floating-point registers is not used in C/C++
+	.global __xray_FunctionEntry // let C/C++ see the symbol
+	.type __xray_FunctionEntry, %function
The comment character is "@" not "//". This will only work if compiled by a C++ compiler, not an assembler.


More information about the llvm-commits mailing list