[PATCH] D33343: Add some tips on how to benchmark

Rafael Ávila de Espíndola via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 19 09:44:06 PDT 2017


rafael updated this revision to Diff 99581.

https://reviews.llvm.org/D33343

Files:
  docs/Benchmarking.rst
  docs/index.rst


Index: docs/index.rst
===================================================================
--- docs/index.rst
+++ docs/index.rst
@@ -90,6 +90,7 @@
    CodeOfConduct
    CompileCudaWithLLVM
    ReportingGuide
+   Benchmarking
 
 :doc:`GettingStarted`
    Discusses how to get up and running quickly with the LLVM infrastructure.
Index: docs/Benchmarking.rst
===================================================================
--- /dev/null
+++ docs/Benchmarking.rst
@@ -0,0 +1,66 @@
+==================================
+Benchmarking tips
+==================================
+
+
+Introduction
+============
+
+To benchmark a patch reliably we want to control every possible source of noise. How to do that depends heavily on the OS.
+
+Linux
+================================
+
+* Static link. That avoids any variation that might be introduced by
+  loading dynamic libraries. This can be done by passing
+  ``-DLLVM_BUILD_STATIC=ON`` to cmake.
+
+* Use tmpfs for everything. Putting the program, inputs and outputs
+  on tmpfs avoids touching a real storage system, which can have
+  high variability.
+
+  To mount it::
+
+    mount -t tmpfs -o size=<XX>g none dir_to_mount
+
+* Disable address space randomization::
+
+    echo 0 > /proc/sys/kernel/randomize_va_space
+
+* Disable turbo mode::
+
+    echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+
+* Set scaling_governor to performance::
+
+   for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+   do
+     echo performance > "$i"
+   done
+
+* Use https://github.com/lpechacek/cpuset to reserve cpus for just the
+  program you are benchmarking. If using perf, reserve at least 2 cores
+  so that perf runs on one and your program on another::
+
+    cset shield -c N1,N2 -k on
+
+  This will move all threads out of N1 and N2. The ``-k on`` means
+  that even kernel threads are moved out.
+
+* Disable the SMT sibling of each cpu you will use for the benchmark. The
+  sibling X of cpu N is listed in
+  ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`` and
+  can be disabled with::
+
+    echo 0 > /sys/devices/system/cpu/cpuX/online
+
+
+* Run the program with::
+
+    cset shield --exec -- perf stat -r 10 <cmd>
+
+  This will run the command after ``--`` on the isolated cpus. This
+  particular perf command runs ``<cmd>`` 10 times and reports
+  statistics.
+
+With these in place you can expect perf variations of less than 0.1%.
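
The sysfs and procfs writes listed in the patch could be collected into a small helper script. The following is only a sketch, not part of the patch: the function names and the ``SYSFS_CPU`` and ``APPLY`` variables are hypothetical, and it assumes the ``thread_siblings_list`` file uses the kernel's cpu-list format (for example ``0,4`` or ``2-3``). By default it only prints what it would write; actual writes happen only with ``APPLY=1`` and need root.

```shell
#!/bin/sh
# Hypothetical helper; paths are the ones used in the patch above.
SYSFS_CPU=${SYSFS_CPU:-/sys/devices/system/cpu}
APPLY=${APPLY:-0}   # 0 = only print the writes, 1 = perform them (needs root)

# write_setting VALUE FILE: print the write, perform it only when APPLY=1.
write_setting() {
    echo "echo $1 > $2"
    if [ "$APPLY" = 1 ]; then
        echo "$1" > "$2"
    fi
}

# Disable address space randomization.
disable_aslr() {
    write_setting 0 /proc/sys/kernel/randomize_va_space
}

# Disable turbo mode (intel_pstate driver only).
disable_turbo() {
    write_setting 1 "$SYSFS_CPU/intel_pstate/no_turbo"
}

# Set every cpu's scaling governor to performance.
set_governor_performance() {
    for gov in "$SYSFS_CPU"/cpu*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when no cpufreq files exist.
        [ -e "$gov" ] || continue
        write_setting performance "$gov"
    done
}

# smt_siblings N: print the other hardware threads sharing a core with cpu N.
smt_siblings() {
    list=$(cat "$SYSFS_CPU/cpu$1/topology/thread_siblings_list") || return 1
    for part in $(echo "$list" | tr ',' ' '); do
        first=${part%-*}    # "2-3" -> 2, "4" -> 4
        last=${part#*-}     # "2-3" -> 3, "4" -> 4
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            [ "$cpu" -eq "$1" ] || echo "$cpu"
            cpu=$((cpu + 1))
        done
    done
}
```

After sourcing the file, ``smt_siblings 2`` lists the cpus to take offline before shielding cpu 2, and calling ``disable_aslr``, ``disable_turbo`` and ``set_governor_performance`` without ``APPLY=1`` prints the writes so they can be reviewed first.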

