author: jasplin <qt-info@nokia.com> 2011-01-19 11:31:46 +0100
committer: jasplin <qt-info@nokia.com> 2011-01-19 11:31:46 +0100
commit: 2582b5ad4c3360ec63eb82df4253c93a0de19757
tree: 5e604d78ab96974df9407d0543ef293ee57f6ee1 /web/getstats/help_tsbm.html
parent: 7af7ec23e722e315420c36268a85b45a2de63f13
Updated online help documentation.
Diffstat (limited to 'web/getstats/help_tsbm.html')
-rw-r--r-- web/getstats/help_tsbm.html 365
1 files changed, 365 insertions, 0 deletions
diff --git a/web/getstats/help_tsbm.html b/web/getstats/help_tsbm.html
new file mode 100644
index 0000000..81fd50d
--- /dev/null
+++ b/web/getstats/help_tsbm.html
@@ -0,0 +1,365 @@
+<html>
+
+<head>
+
+<style type="text/css">
+table { border-collapse:collapse; }
+table, th, td { border: 1px solid #aaaaaa; }
+th {vertical-align:top; padding:10px}
+td {vertical-align:top; padding:10px}
+li {padding:5px}
+</style>
+
+</head>
+
+<body>
+
+<span style="text-align:center">
+<h1>BM2 Documentation</h1>
+<h2><em>Benchmark Time Series Page</em></h2>
+</span>
+
+
+<!-- ///////////////////////////////////////////////////////////////// -->
+<h2>Overview</h2>
+&nbsp;<img src="images/help_overview_anno.png" /><br /><br />
+
+<h3>Main context</h3>
+
+<b>Note:</b> Some of the terms and concepts in the table below are described
+in greater detail elsewhere. The additional documentation can often be
+accessed through a link.
+<br /><br />
+
+<table>
+ <tr>
+ <td style="text-align:right"><b>Database</b></td>
+ <td>
+ The database from which results for the time series were extracted.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Report date</b></td>
+ <td>
+ The date at which the web page was generated.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Host</b></td>
+ <td>
+ The physical computer on which the benchmark producing this time series
+ was executed. <b>Note:</b> It is assumed that the HW/SW specifications of
+ the host do not change significantly during the time span of the time
+ series. (The principle is that significant changes in the time
+ series should be caused only by changes in the product, i.e. Qt.)
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Platform</b></td>
+ <td>
+ The general environment used for building and executing the product
+ being measured (typically an OS/compiler combination).
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Branch</b></td>
+ <td>
+ Essentially the version of the product being measured. The branch is
+ normally made up of two components: The <i>git repository</i> and the
+ <i>git branch</i>.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Target snapshots</b></td>
+ <td>
+ The requested subsequence of snapshots for which results for this
+ host/platform/branch/benchmark/metric combination potentially exist in
+ the database.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Difference tolerance</b></td>
+ <td>
+ A real value &ge; 1 that decides whether a <a href="#changes">change</a>
+ between two median observations in the time series is considered
+ significant or not.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Minimum durability tolerance</b></td>
+ <td>
+ The minimum length a contiguous sequence of significantly equal median
+ observations must have for it to achieve a durability score greater
+ than zero. Once the sequence is at least this long, the durability
+ score grows linearly to 1 at a rate that depends on the maximum
+ durability tolerance.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>Maximum durability tolerance</b></td>
+ <td>
+ The length of a contiguous sequence of significantly equal median
+ observations that is sufficient to achieve the maximum durability score
+ of 1. The durability score for shorter sequences falls linearly to 0 at
+ a rate that depends on the minimum durability tolerance.
+ </td>
+ </tr>
+</table>
+
+
+<h3>Time series statistics</h3>
+
+<b>Note:</b> Some of the terms and concepts in the table below are described
+in greater detail elsewhere. The additional documentation can often be
+accessed through a link.
+<br /><br />
+
+<table>
+ <tr>
+ <td style="text-align:right"><b>MS</b></td>
+ <td>
+ Missing snapshots, i.e. the number of target snapshots for which no
+ results exist.
+ <br /><br />A high value might indicate unstable execution.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LSD</b></td>
+ <td>
+ Last Snapshot Distance, i.e. the distance between the last target snapshot
+ and the last snapshot in the time series.
+ <br /><br />If the last target snapshot is the last one available in the
+ database, a high value might indicate that the benchmark currently fails
+ to produce results.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>NI</b></td>
+ <td>
+ Total number of observations explicitly flagged as invalid.
+ <br /><br />An invalid observation is typically caused by a failed
+ QVERIFY() etc.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>NZ</b></td>
+ <td>
+ Total number of non-positive observations.
+ <br /><br />Normally an observation must be positive to be valid.
+ <br /><b>Note:</b> A non-positive observation is not necessarily flagged
+ as <i>invalid</i> (see <b>NI</b>).
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>NC</b></td>
+ <td>
+ Number of <a href="#changes">significant changes</a>.
+ <br /><br />A high value might indicate unstable or fluctuating results.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>MDRSE</b></td>
+ <td>
+ Median of the
+ valid <a href="http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Relative_standard_error">relative
+ standard errors</a> of all snapshots.
+ <br /><br />
+ &nbsp;&nbsp;&nbsp;&nbsp;
+ <img src="images/rse.png" />
+ <br /><br />A high value might indicate unstable or fluctuating results.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>RSEMD</b></td>
+ <td>
+ Relative standard error (see above) of the valid median observations of
+ all snapshots.
+ <br /><br />A high value might indicate either 1) unstable or
+ fluctuating results or 2) stable changes of a high magnitude.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LC</b></td>
+ <td>
+ Last <a href="#changes">significant change</a>.
+ <br /><br />The higher the value is above 1, the more strongly it
+ represents an improvement.
+ <br />The lower the value is below 1, the more strongly it represents a
+ regression.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCDA</b></td>
+ <td>
+ Number of days (relative to the report date) since the first observation for
+ the last <a href="#changes">significant change</a> snapshot was uploaded
+ to the database.
+ <br />The distance (in terms of number of target snapshots) between the
+ last significant change snapshot and the last target snapshot is shown
+ in parentheses.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCMS</b></td>
+ <td>
+ Magnitude score of the last <a href="#changes">significant
+ change</a>. This score indicates the strength of the last significant
+ change as a value ranging from 0 (weak) to 1 (strong):
+ <br /><br />&nbsp;&nbsp;&nbsp;&nbsp;
+ <img src="images/lcms.png" />
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCGSS</b></td>
+ <td>
+ Global separation score for the last <a href="#changes">significant
+ change</a>. This score indicates how well the median observation at the
+ last significant change snapshot is separated from the median
+ observations at <u>all preceding</u> snapshots in the time series. The
+ median observation at the last <a href="#changes">base snapshot</a> is
+ used as the maximum separation reference.
+
+ <br /><br />The score ranges from 0 (weak separation) to 1 (strong
+ separation).
+
+ <br /><br />This score roughly measures how close the median observation
+ at the last significant change is to representing an "all-time high (low)"
+ up to this point in the history.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCLSS</b></td>
+ <td>
+ Local separation score for the last <a href="#changes">significant
+ change</a>. This score indicates how well the median observations on
+ each side of the last significant change snapshot are separated from
+ each other. Snapshots before the last <a href="#changes">base
+ snapshot</a> are not considered. The median observation at the base
+ snapshot is used as the maximum separation reference.
+
+ <br /><br />The score ranges from 0 (weak separation) to 1 (strong
+ separation).
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCDS1</b></td>
+ <td>
+ Durability score 1 for the last <a href="#changes">significant
+ change</a>. This score indicates the distance (in terms of number of
+ snapshots) from the last significant change to
+ its <a href="#changes">base snapshot</a>.
+
+ <br /><br />The score ranges from 0 (weak durability) to 1 (strong
+ durability) and is scaled against the min/max durability tolerances.
+
+ <br /><br />This score measures how long the median observation
+ stayed near the base value until the last significant change occurred.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCDS2</b></td>
+ <td>
+ Durability score 2 for the last <a href="#changes">significant
+ change</a>. This score indicates the distance (in terms of number of
+ snapshots) from the last significant change to the end of the time
+ series.
+
+ <br /><br />The score ranges from 0 (weak durability) to 1 (strong
+ durability) and is scaled against the min/max durability tolerances.
+
+ <br /><br />This score measures how long the median observation at
+ the last significant change has stayed essentially the same.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCSS</b></td>
+ <td>
+ Stability score for the last <a href="#changes">significant change</a>:
+ <br /><br />
+ &nbsp;&nbsp;&nbsp;&nbsp;
+ <b>LCMS</b> * <b>LCGSS</b> * <b>LCLSS</b> * <b>LCDS1</b> * <b>LCDS2</b>
+ <br /><br />
+ The higher this score, the higher the likelihood that the last
+ significant change is or will become permanent.
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:right"><b>LCSS1</b></td>
+ <td>
+ Stability score for the last <a href="#changes">significant change</a>
+ that does not consider the history after that change:
+ <br /><br />
+ &nbsp;&nbsp;&nbsp;&nbsp;
+ <b>LCMS</b> * <b>LCGSS</b> * <b>LCLSS</b> * <b>LCDS1</b>
+ <br /><br />
+ The higher this score, the higher the likelihood that the last
+ significant change is or will become permanent, but since <b>LCDS2</b>
+ is omitted from the product, a high <b>LCSS1</b> is more likely to be
+ caused by an outlier than a high <b>LCSS</b>!
+ </td>
+ </tr>
+</table>
+
+
+
+<h3>Benchmark and metric</h3>
+The benchmark name consists of three subnames and is formatted like this:
+<br /><br />
+&lt;subname 1&gt;<b>:</b>&lt;subname 2&gt;<b>(</b>&lt;subname 3&gt;<b>)</b>
+<br /><br />
+
+The subnames can essentially be anything not containing the characters
+'<b>:</b>', '<b>(</b>', and '<b>)</b>'. Only subname 3 may contain whitespace.
+
+For benchmark results generated by QTestLib, the subnames always correspond to
+<i>test case</i>, <i>test function</i>, and <i>data tag</i> respectively.
+
+<br /><br />
+The metric name is one of a set of predefined metric names, each of which is
+classified as either "lower is better" (like walltime) or "higher is better"
+(like fps).
+
+<!-- ///////////////////////////////////////////////////////////////// -->
+<h2>Time Series Plot</h2>
+
+<h3>Snapshots and main graph</h3>
+&nbsp;<img src="images/help_plot_overview_anno.png" /><br /><br />
+
+
+<a name="changes"></a>
+<h3>Significant changes</h3>
+&nbsp;<img src="images/help_plot_changes_anno.png" /><br /><br />
+
+
+<h3>Sample size</h3>
+&nbsp;<img src="images/help_plot_samplesize_anno.png" /><br /><br />
+
+
+<h3>Non-positive observations</h3>
+&nbsp;<img src="images/help_plot_nonposobs_anno.png" /><br /><br />
+
+
+<h3>Invalid observations</h3>
+&nbsp;<img src="images/help_plot_invalidobs_anno.png" /><br /><br />
+
+
+<h3>Statistical dispersion</h3>
+Statistical dispersion in a time series is measured in terms of
+<a href="http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Relative_standard_error">relative standard error</a> (RSE).
+&nbsp;<img src="images/help_plot_rse_anno.png" /><br /><br />
+
+
+<h3>Missing data</h3>
+&nbsp;<img src="images/help_plot_missing_anno.png" /><br /><br />
+
+
+<h2>Snapshot details</h2>
+When a snapshot in the plot is clicked, the two tables below the plot are
+filled with various details about the selected snapshot.
+<br />
+<br />
+&nbsp;<img src="images/help_plot_selected_anno.png" /><br /><br />
+
+</body>
+
+</html>
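
Reviewer note: the help page added in this commit describes three rules in prose only: the durability score scaled against the min/max durability tolerances, the LCSS/LCSS1 products, and the benchmark name format. The Python sketch below is one possible reading of those rules, not the BM2 implementation. All function names are hypothetical. The exact interpolation used for the durability score is an assumption (the page only says the score is 0 below the minimum tolerance, grows linearly, and reaches 1 at the maximum tolerance), and so is allowing an empty subname 3.

```python
import math
import re


def durability_score(length, min_tol, max_tol):
    """Durability score in [0, 1] for a run of `length` snapshots.

    ASSUMPTION: linear interpolation anchored at (min_tol - 1), so that a
    run of exactly min_tol snapshots already scores above zero and a run
    of max_tol snapshots scores 1. The real BM2 formula may differ.
    """
    if min_tol < 1 or max_tol < min_tol:
        raise ValueError("expected 1 <= min_tol <= max_tol")
    if length < min_tol:
        return 0.0
    if length >= max_tol:
        return 1.0
    return (length - (min_tol - 1)) / (max_tol - (min_tol - 1))


def stability_score(lcms, lcgss, lclss, lcds1, lcds2=None):
    """LCSS as the product of the component scores.

    Leaving out lcds2 yields LCSS1, which ignores the history after
    the last significant change.
    """
    factors = [lcms, lcgss, lclss, lcds1]
    if lcds2 is not None:
        factors.append(lcds2)
    return math.prod(factors)


# <subname 1>:<subname 2>(<subname 3>); no subname may contain ':', '('
# or ')', and only subname 3 may contain whitespace.
# ASSUMPTION: subname 3 is allowed to be empty.
_NAME_RE = re.compile(r"([^:()\s]+):([^:()\s]+)\(([^:()]*)\)")


def parse_benchmark_name(name):
    """Return (subname1, subname2, subname3), or None if malformed."""
    match = _NAME_RE.fullmatch(name)
    return match.groups() if match else None
```

Because LCSS is a plain product, a single component score of 0 (for example a run shorter than the minimum durability tolerance) forces the whole stability score to 0, which matches the page's warning that a high LCSS1 is more likely to be caused by an outlier than a high LCSS.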