[PATCH] D34381: [XRay] [compiler-rt] unify all TSC API calls

Dean Michael Berris via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 21 00:24:03 PDT 2017


dberris added a subscriber: eizan.
dberris added a comment.

What are we trying to achieve here? It looks like it's sinking the decision on which implementation to invoke from the caller, down to the callee (`readTSC`) which gets less efficient.

While we don't have benchmarks yet upstream that measure the cost of these things, I'd really want to keep code in the readTSC(...) implementation to not have branches, and move branches out to the callers. It's unclear whether the benefit of reducing the arguments in `processFunctionHook` is worth it, if we're going to add more later anyway.

@eizan is working on getting some benchmarks upstream, but it would be good to make sure we're not unnecessarily being pessimistic with the implementation of key parts of the runtime.



================
Comment at: lib/xray/xray_x86_64.inc:22
 
+bool probeRequiredCPUFeatures();
+
----------------
Why do we need this forward declaration?


================
Comment at: lib/xray/xray_x86_64.inc:26-28
+  if (UNLIKELY(!TSCSupported)) {
+    return emulateReadTSC(CPU);
+  }
----------------
I think we're going to have issues with this code in terms of performance impact. This is introducing a function call and synchronization on the first call, then in subsequent calls will introduce an unavoidable branch.


https://reviews.llvm.org/D34381





More information about the llvm-commits mailing list