[PATCH] D47208: [profile] Support profiling runtime on Fuchsia

Roland McGrath via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jul 14 02:30:53 PDT 2018


mcgrathr added a comment.

LGTM with these few substantive nits and then a whole lot more comments (mostly a more thorough overview covering linkage and output model).



================
Comment at: clang/lib/Driver/ToolChains/Fuchsia.cpp:106
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
+  ToolChain.addProfileRTLibs(Args, CmdArgs);
 
----------------
mcgrathr wrote:
> addSanitizerRuntimes should also be after AddLinkerInputs.
still relevant, though could be an unrelated change


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:33
+/* VMO that contains the coverage data shared across all modules. */
+zx_handle_t __llvm_profile_vmo;
+/* Current offset within the VMO where data should written to. */
----------------
Each symbol defined in this file without static or COMPILER_RT_VISIBILITY (hidden) needs to have an explicit comment clearly saying that this symbol is exported in each dynamic executable and DSO that statically links in the profiling runtime, and why that is desireable and what the protocol is for sharing a single set of definitions from one of those at runtime.

I think the VMO and offset are the only things we actually want to share.
Each instrumented module can have its own runtime version's implementation of data_write() as long as they cooperate in the simple append-to-VMO protocol using just those two globals.
I think the data_write() function here should be static.

I think it is exactly just those two symbols that should be exported, so they can have a short comment here and a longer explanatory comment about the whole inter-module sharing model can be part of the overview comment at the top of the file.


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:34
+zx_handle_t __llvm_profile_vmo;
+/* Current offset within the VMO where data should written to. */
+uint64_t __llvm_profile_offset;
----------------
s/should written to/should be written next/


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:70
+  }
+
+  /* Compute the total length of data to be written. */
----------------
So, this needs some overall comments on the methodology of this function and when/how it's called.
I'll now guess what's going on and the comments here should be both clearer than my guess and also actually accurate. :-)

The first time through, it will allocate the VMO and use __llvm_profile_get_size_for_buffer() as the initial size.
What size does this represent?  I think maybe it's the lower bound on the data this module (i.e. what statically links in this runtime) will need to write to the VMO, perhaps the exact total of the primary data not including the value-profiling data or something like that.  Please be explicit about what why it's a useful number to use as the initial VMO size given that you always resize it to cover the data in this writer call.
It's presumed that when __llvm_profile_vmo is set, __llvm_profile_offset remains 0.

After that, and also on any subsequent calls, it will resize the VMO to cover the incoming __llvm_profile_offset value plus the total being written in this call.  So, each call from the first onwards serially appends an arbitrary amount to the VMO, which has been published on creation.

But actually "first call" and "subsequent calls" are misnomers.  Each incarnation of this function will only ever be called once.  Each module (executable or DSO) statically links in the whole profile runtime to satisfy the calls from its instrumented code.  Several modules in the same program might be separately compiled and even use different versions of the instrumentation ABI and data format.  All they share in common is the VMO and the offset, which live in exported globals so that exactly one definition will be shared across all modules.  (Ideally they'd be exported as STB_GNU_UNIQUE even, so that multiple isolated modules loaded by dlopen into a program with no global-scope modules exporting these two globals would still share a VMO.  But that case is unlikely, and Fuchsia's dynamic linker doesn't usefully support STB_GNU_UNIQUE so far.)  Each module has its own independent runtime that registers its own atexit hook to write its own data.  The data format is sufficiently chunked and self-describing that simply concatenating the data streams from several unrelated incarnations of the runtime (that might even be from different versions of the compiler instrumentation) is sufficient to sort the data out easily offline later.

The alternative to the shared globals approach is that each module's atexit hook is totally unaware of the others and each creates its own VMO and does its own __sanitizer_publish_data call.  That wouldn't be the worst thing, and that's the "failure mode" of the pathological dlopen case mentioned in the aside above.  It just means that the data sink receiver has to be ready to handle multiple VMOs for the same sink name that originate from the same process.  As things stand, that means multiple VMOs with the same name.  I think that should actually be fine and there's no need for this code to try to mitigate it (e.g. by also putting the VMO KOID into the name).  Rather, every implementation of the data sink service ought to be prepared for arbitrary (or no) names in the VMOs, so e.g. ones that write files might always choose the file name to be "${sink_name}.${vmo_koid}.${vmo_name}" so as to be guaranteed unique (by dint of the VMO KOID) as well as informative.

Most of this background belongs in the overview comment at the top of the file.  Comments within this function should be clear about how multiple incarnations of this function are interacting in the "first time" if branch and later.



================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:86
+    if (IOVecs[I].Data) {
+      Status = zx_vmo_write(__llvm_profile_vmo, IOVecs[I].Data,
+                            __llvm_profile_offset, Length);
----------------
Make sure to always use the _zx_* names.


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:109
+  if (lprofProfileDumped()) {
+    PROF_NOTE("Profile data not published: %s.\n",
+              "already written");
----------------
PROF_NOTE and PROF_ERR and PROF_WARN are fortunately so far confinsed to InstrProfilingFile.c, which is code not used for Fuchsia.
I think the macros (and a few more related to file stuff) should be moved from InstrProfilingPort.h to inside InstrProfilingFile.c if possible.
There should be a big comment at the top of this file saying that on Fuchsia we never use stdio (fprintf to stderr is what PROT_NOTE/PROF_ERR  are).  Instead always write whole lines (or multiple lines) with `__sanitizer_log_write`.  You can make a local helper that uses vsnprintf into a temporary buffer and calls `__sanitizer_log_write`, if your messages have need of formatting.
(Actually we might well make libc provide `__sanitizer_{v,}printf` or even just `FILE*__sanitizer_fopenlog(void)`, but you shouldn't wait for that.)



================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c:149
+int __llvm_profile_register_write_file_atexit(void) {
+  static int HasBeenRegistered = 0;
+
----------------
I'd use <stdbool.h> bool here (and in general C11, or at least C99, freely and modern style preferentially), unless there's a C portability constraint I don't know about for lib/profile.


Repository:
  rCRT Compiler Runtime

https://reviews.llvm.org/D47208





More information about the llvm-commits mailing list