author Rafael Espindola <rafael.espindola@gmail.com> 2017-05-24 16:39:12 +0000
committer Rafael Espindola <rafael.espindola@gmail.com> 2017-05-24 16:39:12 +0000
commit 01c176bc5995f774ce9a546ba3aef15240d8b68f
tree 846a5d383b5e88c87ea60576076e490aebc5f6d4 /docs
parent 772effdbda010869a8b330f53d343929372b7ac7
Add some tips on benchmarking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303769 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- docs/Benchmarking.rst 87
-rw-r--r-- docs/index.rst 1
2 files changed, 88 insertions, 0 deletions
diff --git a/docs/Benchmarking.rst b/docs/Benchmarking.rst
new file mode 100644
index 000000000000..0f88db745a68
--- /dev/null
+++ b/docs/Benchmarking.rst
@@ -0,0 +1,87 @@
+==================================
+Benchmarking tips
+==================================
+
+
+Introduction
+============
+
+When benchmarking a patch, we want to reduce every source of noise as
+much as possible. How to do that is very OS dependent.
+
+Note that low noise is necessary but not sufficient: it does not
+exclude measurement bias. See
+https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf for
+an example.
+
+General
+================================
+
+* Use a high-resolution timer, e.g. perf under Linux.
+
+* Run the benchmark multiple times to be able to recognize noise.
+
+* Disable as many processes or services as possible on the target system.
+
+* Disable frequency scaling, turbo boost and address space
+ randomization (see OS specific section).
+
+* Static link if the OS supports it. That avoids any variation that
+ might be introduced by loading dynamic libraries. This can be done
+ by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.
+
+* Try to avoid storage. On some systems you can use tmpfs. Putting the
+ program, inputs and outputs on tmpfs avoids touching a real storage
+ system, which can have high variability.
+
+ To mount it (on Linux and FreeBSD at least)::
+
+ mount -t tmpfs -o size=<XX>g none dir_to_mount
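The advice above to run the benchmark multiple times can be sketched as a small shell helper. This is only an illustration: ``time_runs`` is a hypothetical name, and it measures wall-clock time with GNU ``date +%s%N`` as a stand-in for a proper high-resolution timer such as perf.

```shell
# Run a command several times and summarize wall-clock time so that
# run-to-run noise becomes visible.
# Assumptions: GNU date (for %N, nanoseconds) and awk are available.
# Usage: time_runs <count> <cmd> [args...]   (count must be >= 1)
time_runs() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s%N)
        # Emit one elapsed time per run, in nanoseconds.
        echo $((end - start))
        i=$((i + 1))
    done | awk '
        NR == 1 { min = $1; max = $1 }
        {
            sum += $1
            if ($1 < min) min = $1
            if ($1 > max) max = $1
        }
        END {
            mean = sum / NR
            # spread_pct is (max - min) relative to the mean.
            printf "runs=%d min_ms=%.3f mean_ms=%.3f max_ms=%.3f spread_pct=%.2f\n", NR, min / 1e6, mean / 1e6, max / 1e6, 100 * (max - min) / mean
        }'
}
```

For example, ``time_runs 10 ./my_benchmark input`` (a hypothetical binary) prints one summary line. A large ``spread_pct`` suggests the environment is still too noisy to compare patches.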
+
+Linux
+=====
+
+* Disable address space randomization::
+
+ echo 0 > /proc/sys/kernel/randomize_va_space
+
+* Set scaling_governor to performance::
+
+ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+ do
+   echo performance > "$i"
+ done
+
+* Use https://github.com/lpechacek/cpuset to reserve cpus for just the
+ program you are benchmarking. If using perf, reserve at least 2 cores
+ so that perf runs on one and your program on another::
+
+ cset shield -c N1,N2 -k on
+
+ This will move all threads out of N1 and N2. The ``-k on`` means
+ that even kernel threads are moved out.
+
+* Disable the SMT pair of the cpus you will use for the benchmark. The
+ SMT siblings of cpu N are listed in
+ ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list``; each
+ sibling X other than N can be disabled with::
+
+ echo 0 > /sys/devices/system/cpu/cpuX/online
+
+
+* Run the program with::
+
+ cset shield --exec -- perf stat -r 10 <cmd>
+
+ This will run the command after ``--`` on the isolated cpus. This
+ particular perf command runs ``<cmd>`` 10 times and reports
+ statistics.
+
+With these in place you can expect perf variations of less than 0.1%.
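Finding the SMT siblings by hand is easy to get wrong, so the lookup described above can be sketched as a helper. ``smt_siblings`` and the ``CPU_SYSFS`` override are hypothetical names introduced for illustration. The sketch assumes the file holds a comma-separated list of cpu numbers or ranges (for example ``2,6`` or ``2-3``).

```shell
# Print the SMT siblings of cpu N, one per line, excluding N itself.
# Assumption: thread_siblings_list is a comma-separated list whose items
# are single cpu numbers or ranges such as "2-3".
# CPU_SYSFS is a hypothetical override of the sysfs directory, useful
# for trying the function against a fake directory tree.
smt_siblings() {
    cpu=$1
    root=${CPU_SYSFS:-/sys/devices/system/cpu}
    tr ',' '\n' < "$root/cpu$cpu/topology/thread_siblings_list" |
    while IFS=- read -r first last; do
        # A single number has no "-"; treat it as a one-element range.
        for sib in $(seq "$first" "${last:-$first}"); do
            if [ "$sib" -ne "$cpu" ]; then
                echo "$sib"
            fi
        done
    done
}

# Possible use, as root, to take the siblings of cpu 2 offline:
#   for s in $(smt_siblings 2); do
#       echo 0 > "/sys/devices/system/cpu/cpu$s/online"
#   done
```

Writing ``1`` to the same ``online`` file brings a cpu back after the benchmark.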
+
+Linux Intel
+-----------
+
+* Disable turbo mode::
+
+ echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
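Before starting a run it can help to confirm that the knobs described above are actually set, since some of them reset on reboot. The following is a minimal read-only sketch. ``check_knob``, ``check_benchmark_env`` and the ``SYS_ROOT`` prefix are hypothetical names, not part of any existing tool, and the expected values are the ones given in this document.

```shell
# Report whether one sysfs/procfs knob holds the expected value.
check_knob() {
    path=$1
    want=$2
    if [ ! -r "$path" ]; then
        echo "SKIP $path (not present)"
    elif [ "$(cat "$path")" = "$want" ]; then
        echo "OK   $path = $want"
    else
        echo "WARN $path = $(cat "$path"), expected $want"
    fi
}

# Check address space randomization, turbo mode and the cpufreq governor.
# SYS_ROOT is a hypothetical path prefix (empty on a real system) that
# allows the checks to be tried against a fake directory tree.
check_benchmark_env() {
    root=${SYS_ROOT:-}
    check_knob "$root/proc/sys/kernel/randomize_va_space" 0
    check_knob "$root/sys/devices/system/cpu/intel_pstate/no_turbo" 1
    for gov in "$root"/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        check_knob "$gov" performance
    done
}
```

Running ``check_benchmark_env`` needs no special privileges, because it only reads the files. Any ``WARN`` line points at a setting that still needs the corresponding command from the sections above.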
diff --git a/docs/index.rst b/docs/index.rst
index fe47eb1bcb7f..becbe48e7ec7 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -90,6 +90,7 @@ representation.
CodeOfConduct
CompileCudaWithLLVM
ReportingGuide
+ Benchmarking
:doc:`GettingStarted`
Discusses how to get up and running quickly with the LLVM infrastructure.