Updated online help documentation.

author: jasplin <qt-info@nokia.com> 2011-01-19 11:31:46 +0100
committer: jasplin <qt-info@nokia.com> 2011-01-19 11:31:46 +0100
commit: 2582b5ad4c3360ec63eb82df4253c93a0de19757 (patch)
tree: 5e604d78ab96974df9407d0543ef293ee57f6ee1 /web/getstats/help_tsbm.html
parent: 7af7ec23e722e315420c36268a85b45a2de63f13 (diff)
1 files changed, 365 insertions, 0 deletions
diff --git a/web/getstats/help_tsbm.html b/web/getstats/help_tsbm.html
new file mode 100644
index 0000000..81fd50d
--- /dev/null
+++ b/web/getstats/help_tsbm.html
@@ -0,0 +1,365 @@
+<html>
+
+<head>
+
+<style type="text/css">
+table { border-collapse:collapse; }
+table, th, td { border: 1px solid #aaaaaa; }
+th {vertical-align:top; padding:10px}
+td {vertical-align:top; padding:10px}
+li {padding:5px}
+</style>
+
+</head>
+
+<body>
+
+<div style="text-align:center">
+<h1>BM2 Documentation</h1>
+<h2><em>Benchmark Time Series Page</em></h2>
+</div>
+
+
+<!-- ///////////////////////////////////////////////////////////////// -->
+<h2>Overview</h2>
+&nbsp;<img src="images/help_overview_anno.png" /><br /><br />
+
+<h3>Main context</h3>
+
+<b>Note:</b> Some of the terms and concepts in the table below are described
+in greater detail elsewhere. The additional documentation can often be
+accessed through a link.
+<br /><br />
+
+<table>
+  <tr>
+    <td style="text-align:right"><b>Database</b></td>
+    <td>
+      The database from which results for the time series were extracted.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Report date</b></td>
+    <td>
+      The date at which the web page was generated.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Host</b></td>
+    <td>
+      The physical computer on which the benchmark producing this time series
+      was executed. <b>Note:</b> It is assumed that the HW/SW specifications of
+      the host do not change significantly during the time span of the time
+      series. (The principle is that significant changes in the time series
+      should be caused only by changes in the product, i.e. Qt.)
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Platform</b></td>
+    <td>
+      The general environment used for building and executing the product
+      being measured (typically an OS/compiler combination).
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Branch</b></td>
+    <td>
+      Essentially the version of the product being measured. The branch is
+      normally made up of two components: the <i>git repository</i> and the
+      <i>git branch</i>.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Target snapshots</b></td>
+    <td>
+      The requested subsequence of snapshots for which results for this
+      host/platform/branch/benchmark/metric combination potentially exist in
+      the database.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Difference tolerance</b></td>
+    <td>
+      A real value &ge; 1 that decides whether a <a href="#changes">change</a>
+      between two median observations in the time series is considered
+      significant or not.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Minimum durability tolerance</b></td>
+    <td>
+      The minimum length a contiguous sequence of significantly equal median
+      observations must have for it to achieve a durability score greater
+      than zero. Once the sequence is at least this long, the durability
+      score grows linearly to 1 at a rate that depends on the maximum
+      durability tolerance.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>Maximum durability tolerance</b></td>
+    <td>
+      The length of a contiguous sequence of significantly equal median
+      observations that is sufficient to achieve the maximum durability score
+      of 1. The durability score for shorter sequences falls linearly to 0 at
+      a rate that depends on the minimum durability tolerance.
+    </td>
+  </tr>
+</table>
+
+
+<h3>Time series statistics</h3>
+
+<b>Note:</b> Some of the terms and concepts in the table below are described
+in greater detail elsewhere. The additional documentation can often be
+accessed through a link.
+<br /><br />
+
+<table>
+  <tr>
+    <td style="text-align:right"><b>MS</b></td>
+    <td>
+      Missing snapshots, i.e. the number of target snapshots for which no
+      results exist.
+      <br /><br />A high value might indicate unstable execution.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LSD</b></td>
+    <td>
+      Last Snapshot Distance, i.e. the distance between the last target
+      snapshot and the last snapshot in the time series.
+      <br /><br />If the last target snapshot is the last one available in the
+      database, a high value might indicate that the benchmark currently fails
+      to produce results.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>NI</b></td>
+    <td>
+      Total number of observations explicitly flagged as invalid.
+      <br /><br />An invalid observation is typically caused by a failed
+      check such as QVERIFY().
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>NZ</b></td>
+    <td>
+      Total number of non-positive observations.
+      <br /><br />Normally an observation must be positive to be valid.
+      <br /><b>Note:</b> A non-positive observation is not necessarily flagged
+      as <i>invalid</i> (see <b>NI</b>).
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>NC</b></td>
+    <td>
+      Number of <a href="#changes">significant changes</a>.
+      <br /><br />A high value might indicate unstable or fluctuating results.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>MDRSE</b></td>
+    <td>
+      Median of the
+      valid <a href="http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Relative_standard_error">relative
+      standard errors</a> of all snapshots.
+      <br /><br />
+      &nbsp;&nbsp;&nbsp;&nbsp;
+      <img src="images/rse.png" />
+      <br /><br />A high value might indicate unstable or fluctuating results.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>RSEMD</b></td>
+    <td>
+      Relative standard error (see above) of the valid median observations of
+      all snapshots.
+      <br /><br />A high value might indicate either 1) unstable or
+      fluctuating results or 2) stable changes of a high magnitude.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LC</b></td>
+    <td>
+      Last <a href="#changes">significant change</a>.
+      <br /><br />The higher the value is above 1, the more strongly it
+      represents an improvement.
+      <br />The lower the value is below 1, the more strongly it represents a
+      regression.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCDA</b></td>
+    <td>
+      Number of days (relative to the report date) since the first
+      observation for the last <a href="#changes">significant change</a>
+      snapshot was uploaded to the database.
+      <br />The distance (in terms of number of target snapshots) between the
+      last significant change snapshot and the last target snapshot is shown
+      in parentheses.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCMS</b></td>
+    <td>
+      Magnitude score of the last <a href="#changes">significant
+      change</a>. This score indicates the strength of the last significant
+      change as a value ranging from 0 (weak) to 1 (strong):
+      <br /><br />&nbsp;&nbsp;&nbsp;&nbsp;
+      <img src="images/lcms.png" />
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCGSS</b></td>
+    <td>
+      Global separation score for the last <a href="#changes">significant
+      change</a>. This score indicates how well the median observation at the
+      last significant change snapshot is separated from the median
+      observations at <u>all preceding</u> snapshots in the time series. The
+      median observation at the last <a href="#changes">base snapshot</a> is
+      used as the maximum separation reference.
+
+      <br /><br />The score ranges from 0 (weak separation) to 1 (strong
+      separation).
+
+      <br /><br />This score roughly measures how close the median observation
+      at the last significant change is to representing an "all-time high" (or
+      low) up to this point in the history.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCLSS</b></td>
+    <td>
+      Local separation score for the last <a href="#changes">significant
+      change</a>. This score indicates how well the median observations on
+      each side of the last significant change snapshot are separated from
+      each other. Snapshots before the last <a href="#changes">base
+      snapshot</a> are not considered. The median observation at the base
+      snapshot is used as the maximum separation reference.
+
+      <br /><br />The score ranges from 0 (weak separation) to 1 (strong
+      separation).
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCDS1</b></td>
+    <td>
+      Durability score 1 for the last <a href="#changes">significant
+      change</a>. This score indicates the distance (in terms of number of
+      snapshots) from the last significant change to
+      its <a href="#changes">base snapshot</a>.
+
+      <br /><br />The score ranges from 0 (weak durability) to 1 (strong
+      durability) and is scaled against the min/max durability tolerances.
+
+      <br /><br />This score measures how long the median observation
+      stayed near the base value until the last significant change occurred.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCDS2</b></td>
+    <td>
+      Durability score 2 for the last <a href="#changes">significant
+      change</a>. This score indicates the distance (in terms of number of
+      snapshots) from the last significant change to the end of the time
+      series.
+
+      <br /><br />The score ranges from 0 (weak durability) to 1 (strong
+      durability) and is scaled against the min/max durability tolerances.
+
+      <br /><br />This score measures how long the median observation at
+      the last significant change has stayed essentially the same.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCSS</b></td>
+    <td>
+      Stability score for the last <a href="#changes">significant change</a>:
+      <br /><br />
+      &nbsp;&nbsp;&nbsp;&nbsp;
+      <b>LCMS</b> * <b>LCGSS</b> * <b>LCLSS</b> * <b>LCDS1</b> * <b>LCDS2</b>
+      <br /><br />
+      The higher this score, the higher the likelihood that the last
+      significant change is or will become permanent.
+    </td>
+  </tr>
+  <tr>
+    <td style="text-align:right"><b>LCSS1</b></td>
+    <td>
+      Stability score for the last <a href="#changes">significant change</a>
+      that does not consider the history after that change:
+      <br /><br />
+      &nbsp;&nbsp;&nbsp;&nbsp;
+      <b>LCMS</b> * <b>LCGSS</b> * <b>LCLSS</b> * <b>LCDS1</b>
+      <br /><br />
+      The higher this score, the higher the likelihood that the last
+      significant change is or will become permanent, but since <b>LCDS2</b>
+      is omitted from the product, a high <b>LCSS1</b> is more likely to be
+      caused by an outlier than a high <b>LCSS</b>!
+    </td>
+  </tr>
+</table>
+
+
+
+<h3>Benchmark and metric</h3>
+The benchmark name consists of three subnames and is formatted like this:
+<br /><br />
+&lt;subname 1&gt;<b>:</b>&lt;subname 2&gt;<b>(</b>&lt;subname 3&gt;<b>)</b>
+<br /><br />
+
+The subnames can essentially be anything not containing the characters
+'<b>:</b>', '<b>(</b>', and '<b>)</b>'. Only subname 3 may contain whitespace.
+
+For benchmark results generated by QTestLib, the subnames always correspond to
+<i>test case</i>, <i>test function</i>, and <i>data tag</i> respectively.
+
+<br /><br />
+The metric name is one of a set of predefined metric names, each of which is
+classified as either "lower is better" (like walltime) or "higher is better"
+(like fps).
+
+<!-- ///////////////////////////////////////////////////////////////// -->
+<h2>Time Series Plot</h2>
+
+<h3>Snapshots and main graph</h3>
+&nbsp;<img src="images/help_plot_overview_anno.png" /><br /><br />
+
+
+<a name="changes"></a>
+<h3>Significant changes</h3>
+&nbsp;<img src="images/help_plot_changes_anno.png" /><br /><br />
+
+
+<h3>Sample size</h3>
+&nbsp;<img src="images/help_plot_samplesize_anno.png" /><br /><br />
+
+
+<h3>Non-positive observations</h3>
+&nbsp;<img src="images/help_plot_nonposobs_anno.png" /><br /><br />
+
+
+<h3>Invalid observations</h3>
+&nbsp;<img src="images/help_plot_invalidobs_anno.png" /><br /><br />
+
+
+<h3>Statistical dispersion</h3>
+Statistical dispersion in a time series is measured in terms of
+<a href="http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Relative_standard_error">relative standard error</a> (RSE).
+&nbsp;<img src="images/help_plot_rse_anno.png" /><br /><br />
+
+
+<h3>Missing data</h3>
+&nbsp;<img src="images/help_plot_missing_anno.png" /><br /><br />
+
+
+<h2>Snapshot details</h2>
+When a snapshot in the plot is clicked, the two tables below the plot are
+filled with details about the selected snapshot.
+<br />
+<br />
+&nbsp;<img src="images/help_plot_selected_anno.png" /><br /><br />
+
+</body>
+
+</html>
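
Note on the durability and stability scores documented in help_tsbm.html: the page describes the durability score only in words (zero below the minimum durability tolerance, then a linear rise to 1 at the maximum durability tolerance) and gives LCSS as a product of five scores. The sketch below is one possible reading of that description, not the BM2 implementation. The function names, the argument names, and the exact placement of the ramp endpoints are assumptions; in particular, the ramp is anchored so that a sequence of exactly the minimum length already scores slightly above zero, which is how the text reads but is not confirmed by the source.

```python
def durability_score(length, min_tol, max_tol):
    """Hypothetical durability score for a run of `length` snapshots.

    Assumed shape (not taken from the BM2 code):
      - 0 for runs shorter than the minimum durability tolerance,
      - 1 for runs at least as long as the maximum durability tolerance,
      - a linear ramp in between.
    """
    if max_tol < min_tol:
        raise ValueError("max_tol must be >= min_tol")
    if length < min_tol:
        return 0.0
    if length >= max_tol:
        return 1.0
    # Anchor the ramp one step below min_tol so that a run of exactly
    # min_tol snapshots already scores greater than zero (an assumption).
    return (length - min_tol + 1) / (max_tol - min_tol + 1)


def stability_score(lcms, lcgss, lclss, lcds1, lcds2=1.0):
    """Product of the component scores, as in the LCSS row of the table.

    Leaving lcds2 at its default of 1.0 gives the LCSS1 variant, which
    omits LCDS2 from the product.
    """
    return lcms * lcgss * lclss * lcds1 * lcds2


if __name__ == "__main__":
    # Illustrative tolerances only; real values come from the page's
    # "Minimum/Maximum durability tolerance" settings.
    for n in (1, 2, 3, 5, 9):
        print(n, durability_score(n, min_tol=2, max_tol=5))
    print(stability_score(1.0, 0.5, 0.5, 1.0, 0.25))
```

Because LCSS is a plain product, a single component near zero pulls the whole score near zero. That fits the table's remark that LCSS1, which leaves out LCDS2, is more easily inflated by an outlier than LCSS.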