Add support for negative result values.

This patch adds support for negative result values to BM (thus allowing all real values). A typical use case for negative values is a benchmark that measures the difference between two implementations of the same problem/task.

In order to support negative values, the formula for computing the relative difference between two values a and b has been changed from

    (a - b) / (b + 1)    (or (a - b) / (a + 1))

to

    (a == b) ? 0 : ((a - b) / max(|a|, |b|))

Note that adding 1 in the first formula was necessary to allow for zero values. This is no longer needed, since the a == b case is handled separately and max(|a|, |b|) is nonzero whenever a and b differ. Note also that the new formula doesn't allow direct selection of the normalization base (i.e. the denominator); instead, it automatically uses the value with the largest magnitude. This is assumed not to be a problem, and it has the interesting property that the absolute value of the expression is bounded by 2.

This patch also renames the term "current change" to "last difference".
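
For illustration only, the new formula can be sketched in Python as follows. The function name normalized_difference is hypothetical and is not taken from the BM sources; the body simply transcribes the expression given in this commit message.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / max(|a|, |b|), or 0 when a == b.

    Hypothetical helper mirroring the formula in the commit message;
    it is not the actual BM implementation.
    """
    if a == b:
        return 0.0
    # a != b implies that at least one of them is nonzero,
    # so the denominator cannot be zero here.
    return (a - b) / max(abs(a), abs(b))


if __name__ == "__main__":
    # Same sign: magnitude stays below 1.
    print(normalized_difference(-2, -4))
    # Exactly one value is zero: magnitude is 1.
    print(normalized_difference(5, 0))
    # Opposite signs with equal magnitude: magnitude reaches the bound 2.
    print(normalized_difference(3, -3))
```

Unlike the old (a - b) / (b + 1) form, this sketch has no input for which the denominator vanishes, which is why negative values (including b = -1) no longer need special treatment.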
author: jasplin <qt-info@nokia.com> 2010-01-13 08:21:25 +0100
committer: jasplin <qt-info@nokia.com> 2010-01-13 08:21:25 +0100
commit: 6133f24bcd9dfeaff42826c6e370519cef299b24 (patch)
tree: 6e6d89c96132e0701ae510f7f6777860f4db67c5 /doc
parent: 97cddd2c7d8b1e7ee3d489378f6ab3073052ea39 (diff)
2 files changed, 115 insertions, 120 deletions
diff --git a/doc/bmclient.html b/doc/bmclient.html
index 66163d4..ac491bb 100644
--- a/doc/bmclient.html
+++ b/doc/bmclient.html
@@ -426,7 +426,7 @@ specified value.
 snapshot with this exact value to exist for the command to succeed.
 
 <br /><br />
-For the sake of the following description, denote the input time series as:
+For the sake of the following description, denote the input time series as
 <br /><br />
 &nbsp;&nbsp;&nbsp;&nbsp;
 (t<sub>1</sub>,&nbsp;v<sub>1</sub>),&nbsp;(t<sub>2</sub>,&nbsp;v<sub>2</sub>),&nbsp;
@@ -436,63 +436,92 @@ where t<sub><i>i</i></sub> is the timestamp component of snapshot <i>i</i> and
 v<sub><i>i</i></sub> the corresponding result value.
 
 <br /><br />
-<b>Difference</b>
+<b>Normalized Difference</b>
 <br /><br />
 
-For two values v<sub><i>i</i></sub> and v<sub><i>j</i></sub> in the time
-series, the <em>difference tolerance</em> with respect to v<sub><i>j</i></sub>
-(specified as <span class="command">&lt;diff&nbsp;tolerance&gt;</span>) is the
-the maximum percentage by which v<sub><i>i</i></sub>&nbsp;+&nbsp;1 is allowed
-to differ from v<sub><i>j</i></sub>&nbsp;+&nbsp;1 to be considered equal to
-v<sub><i>j</i></sub>&nbsp;+&nbsp;1. Values that are equal/different by this
-definition are said to be <i>significantly</i> equal/different. <b>Note:</b>
-the 1's are added to allow for v<sub><i>j</i></sub> to be 0.
+For our purpose we define the <i>normalized difference</i> between two real
+numbers a and b, nd(a,&nbsp;b), as
+<br /><br />
+&nbsp;&nbsp;&nbsp;&nbsp;
+(a - b) / max(|a|, |b|)
+<br /><br />
+if a &ne; b, and
+<br /><br />
+&nbsp;&nbsp;&nbsp;&nbsp;
+0
+<br /><br />
+otherwise.
 
 <br /><br />
+Note the following properties of the normalized difference:
 
-<span class="command">&lt;diff&nbsp;tolerance&gt;</span> must be a
-non-negative real number.
+<ul>
+<li>0 &le; |nd(a, b)| &le; 2</li>
+<li>nd(a, b) &gt; 0 &rArr; a > b</li>
+<li>nd(a, b) &lt; 0 &rArr; a < b</li>
+<li>nd(a, b) = 0 &rArr; a = b</li>
+<li>0 &lt; |nd(a, b)| &lt; 1 &rArr; a and b are either both positive or both negative</li>
+<li>|nd(a, b)| = 1 &rArr; exactly one of a and b is zero</li>
+<li>1 &lt; |nd(a, b)| &lt; 2 &rArr; a and b have opposite signs (and neither is zero)</li>
+<li>|nd(a, b)| = 2 &rArr; a = -b</li>
+</ul>
 
+<br />
+<b>Difference Tolerance</b>
 <br /><br />
-<b>Value Stability</b>
+
+The <i>difference tolerance</i> (specified
+as <span class="command">&lt;diff&nbsp;tolerance&gt;</span>) defines the
+maximum normalized difference that is accepted between two values for them to
+be considered equal.
+
+<br />
+Values that are equal/different by this definition are said to
+be <i>significantly</i> equal/different.
+
 <br /><br />
 
-The value at index <i>i</i> is said to be <i>stable</i> if it is preceded
-consecutively by at least <span
-class="command">&lt;stab&nbsp;tolerance&gt;</span> values that
-don't differ significantly from v<sub><i>i</i></sub>.
+<b>Note</b>: The difference tolerance must be a non-negative real number.
 
 <br /><br />
+<b>Stability Tolerance</b>
+<br /><br />
 
-<span class="command">&lt;stab&nbsp;tolerance&gt;</span> must be a
-non-negative integer.
+The <i>stability tolerance</i> (specified
+as <span class="command">&lt;stab&nbsp;tolerance&gt;</span>) is used together
+with the difference tolerance (defined above) to decide whether a value in the
+time series is stable ("reproducible"):
 
 <br /><br />
-<b>Current Change</b>
+The value v is said to be <i>stable</i> if it is preceded contiguously by at
+least <span class="command">&lt;stab&nbsp;tolerance&gt;</span> values that
+<i>don't differ significantly</i> from v.
+
 <br /><br />
 
-This is the position of the most recent value (if any) that differs
-significantly from the last value. More precisely, this is the largest
-<i>c</i> such that <br /><br /> &nbsp;&nbsp;&nbsp;&nbsp;
-|((v<sub><i>n</i></sub>&nbsp;-&nbsp;v<sub><i>c</i></sub>)&nbsp;/(&nbsp;v<sub><i>n</i></sub>
-&nbsp;+&nbsp;1))&nbsp;*&nbsp;100|&nbsp;>&nbsp;<span
-class="command">&lt;diff&nbsp;tolerance&gt;</span>.
+The stability tolerance must be a non-negative integer.
 
 <br /><br />
-<b>Note:</b> the 1 is added to allow for v<sub><i>n</i></sub> to be 0.
+<b>Last Difference</b>
+<br /><br />
+
+The last difference of a time series refers to the last value in the
+subset of values (if any) that <i>differ significantly</i> from the last
+value in the series.
 
 <br /><br />
-Observe how the above formula implies that a <i>positive</i> current change
-difference suggests a (possible) performance <i>regression</i> since the
-latest execution of the benchmark spent more "resources" than the execution
-that produced the current change.
+<b>Note</b>: Whether the last difference represents a performance regression
+can be determined from the sign of the normalized difference between the last
+value and the last difference (a positive sign typically indicating a
+regression).
 
 <br /><br />
-<b>Current Change Stability</b>
+<b>Strong Stability of the Last Difference</b>
 <br /><br />
 
-The current change <i>c</i> is said to be <i>stable</i> if and only if both
-v<sub><i>n</i></sub> and v<sub><i>c</i></sub> are stable.
+The last difference of the time series is said to be <i>stable in the strong
+sense</i> if and only if the last value and the last difference itself are
+both stable (in the normal sense).
 
 <br /><br />
 
@@ -501,7 +530,7 @@ v<sub><i>n</i></sub> and v<sub><i>c</i></sub> are stable.
 
 The trend is estimated as the <i>&beta;</i> value of the <a
 href="http://en.wikipedia.org/wiki/Simple_linear_regression">simple linear
-regression</a> of the normalized time series:
+regression</a> of the time series:
 <br /><br />
 &nbsp;&nbsp;&nbsp;&nbsp;
 <i>y</i>&nbsp;=&nbsp;<i>&alpha;</i>&nbsp;+&nbsp;<i>&beta;x</i>
@@ -513,34 +542,6 @@ the value at time <i>t</i>&nbsp;+&nbsp;<i>dt</i> is expected to be
 <i>v</i>(<i>t</i>)&nbsp;+&nbsp;<i>&beta;</i>*<i>dt</i>.
 
 <br /><br />
-
-Observe that positive and negative <i>&beta;</i> values indicate a decrease
-and increase in performance respectively.
-
-<br /><br />
-
-The normalized time series is defined like this:
-<ol>
-<li>
-Scale the timestamps linearly into the [0,&nbsp;1] range.
-</li>
-<li>
-
-Replace each value v<sub><i>i</i></sub> by
-(v<sub><i>i</i></sub>&nbsp;+&nbsp;1)&nbsp;/&nbsp;(v<sub><i>n</i></sub>&nbsp;+&nbsp;1).
-
-<br /><br />
-<b>Note:</b> the 1's are added to allow for v<sub><i>n</i></sub> to be 0.
-<br /><br />
-
-The normalized value 1 thus indicates no change, while values 0.5 and 1.5 indicate
-around 50% better and worse than the last value respectively. In the latter
-case the ratio converges to 50% as the magnitude of the pre-normalized value
-increases, since the effect of adding 1 then becomes gradually less significant).
-</li>
-</ol>
-
-<br />
 <b>Limiting the output size</b>
 <br /><br />
 
@@ -572,15 +573,15 @@ the <span class="command"><b>get detailspage</b></span> command:
 <td>
 <pre>
 &lt;index (0 = first) \
-    of the current \
-    change in the \
+    of the last \
+    difference in the \
     results list \
     below, or -1&gt;
-&lt;current change \
+&lt;last difference \
     stability (true, \
     false, or -1)&gt;
 &lt;timestamp of the \
-    current change, \
+    last difference, \
     or -1&gt;
 &lt;sha1 (ditto)&gt;
 &lt;value (ditto)&gt;
@@ -605,7 +606,7 @@ the <span class="command"><b>get detailspage</b></span> command:
 Content-type: text/json
 
 {
-    "currChange": {
+    "lastDiff": {
         "index": &lt;index in results
                  list, or -1&gt;,
         "stable": &lt;true or false&gt;,
@@ -614,7 +615,7 @@ Content-type: text/json
 	    "&lt;value&gt;"]
     },
     // Property omitted if no
-    // current change exists
+    // last difference exists
 
     "lastValue": {
         "stable": &lt;true or false&gt;
@@ -725,8 +726,8 @@ Content-type: text/html
 <td class="commandDescr">
 
 For each benchmark within a specific context, this command computes the
-<i>current change</i>. The output includes the subset of benchmarks whose
-current changes are considered most important by some user-defined criteria.
+<i>last difference</i>. The output includes the subset of benchmarks whose
+last differences are considered most important by some user-defined criteria.
 
 <br /><br />
 
@@ -750,21 +751,21 @@ one. The size and contents of this list is controlled by
 <br /><br />
 The <span class="command">&lt;ranking&gt;</span> defines which of two
 benchmarks is considered the most important one.  The value
-<code><b>worst</b></code> indicates that a positive <i>current change</i>
+<code><b>worst</b></code> indicates that a positive <i>last difference</i>
 value is more important than a negative one (in other words: a performance
 <i>decrease</i> is more important than a performance <i>increase</i>). The
 value <code><b>best</b></code> indicates the exact opposite: performance
 increases rank over decreases. Finally, the value
-<code><b>absolute</b></code> indicates that a large current change (regardless
+<code><b>absolute</b></code> indicates that a large last difference (regardless
 of the sign) is more important than a small one (in other words: it is the
-<i>magnitude</i> of the current change that counts).
+<i>magnitude</i> of the last difference that counts).
 
 <br /><br />
 
-A stable current change always ranks over a non-stable one (regardless of the
+A stable last difference always ranks over a non-stable one (regardless of the
 value of <span class="command">&lt;ranking&gt;</span>), while an existing
-current change always ranks over a non-existing one. Two benchmarks without
-any current change are ranked in an arbitrary order.
+last difference always ranks over a non-existing one. Two benchmarks without
+any last difference are ranked in an arbitrary order.
 
 <br /><br />
 
@@ -791,8 +792,8 @@ Observe that unless <span class="command">&lt;max&nbsp;size&gt;</span> is 0,
 the <code><b>testFunction</b></code> scope guarantees that all test functions
 appear in the output. Conversely, the <code><b>global</b></code> scope may
 result in one or more test functions being dropped from the output (i.e. a
-small number of test functions may contain current change values that are more
-important than other test functions).
+small number of test functions may contain last difference values that are
+more important than those of other test functions).
 
 </td>
 <td>
@@ -819,14 +820,14 @@ Content-type: text/json
             // Property omitted if no
             // last value exists
 
-            "currChange": [
+            "lastDiff": [
                 &lt;timestamp&gt;,
                 "&lt;sha1&gt;",
                 &lt;value&gt;,
                 &lt;stable&gt;
             ],
             // Property omitted if no
-            // current change exists
+            // last difference exists
 
             "regression": [&lt;a&gt;, &lt;b&gt;]
             // Property omitted if the
@@ -857,9 +858,9 @@ Content-type: text/json
 <td class="commandDescr">
 
 For each benchmark in two specific branches within a specific context, this
-command computes the difference between the last result in each branch. The
-output includes the subset of benchmarks whose differences are considered most
-important by some user-defined criteria.
+command computes the normalized difference between the last result in each
+branch. The output includes the subset of benchmarks whose differences are
+considered most important by some user-defined criteria.
 
 <br /><br />
 <b>Controlling the output list</b>
@@ -869,17 +870,11 @@ The output consists of a list of benchmarks along with basic statistics for
 the last value in each branch. The size and contents of this list is
 controlled in exactly the same way as in the <span class="command"><b>get
 rankedbenchmarks</b></span> command except that the subject of ranking is now
-the difference between the last result of each branch (rather than the current
-change). The difference is defined like this:
-
-<br /><br /> &nbsp;&nbsp;&nbsp;&nbsp;
-(v1&nbsp;-&nbsp;v2)&nbsp;/&nbsp;(v2&nbsp;+&nbsp;1)
+the <i>normalized difference</i> between the last result of each branch,
+i.e. nd(v1,&nbsp;v2) where v1 and v2 denote the last value of Branch 1 and
+Branch 2 respectively. (See earlier definition of normalized difference.)
 <br /><br />
 
-where v1 and v2 denote the last value of Branch 1 and Branch 2 respectively. A
-positive difference thus indicates that Branch 1 currently performs worse than
-Branch 2 (since Branch 1 spends more "resources" running the benchmark).
-
 </td>
 <td>
 <span style="color:red"><i>not supported yet</i></span>
@@ -942,8 +937,8 @@ Computes basic statistics for the benchmarks within a certain context.
 <br /><br />
 
 See the <span class="command"><b>get history</b></span> command for a
-definition of <i>snapshot range</i>, <i>current change</i>, <i>current change
-stability</i>, <i>difference tolerance</i>, and <i>stability tolerance</i>.
+definition of <i>snapshot range</i>, <i>last difference</i> (and its
+stability), <i>difference tolerance</i>, and <i>stability tolerance</i>.
 
 <td>
 <span style="color:red"><i>not supported yet</i></span>
@@ -954,17 +949,17 @@ Content-type: text/json
 {
     "nBenchmarks":
         &lt;number of benchmarks found&gt;,
-    "nCurrChanges":
-        &lt;number of current changes found&gt;,
-    "nStableCurrChanges":
-         &lt;number of stable current
-          changes found&gt;,
-    "minStableCurrChange":
-         &lt;minimum stable current change&gt;,
-    "avgStableCurrChange":
-         &lt;average stable current change&gt;,
-    "maxStableCurrChange":
-         &lt;maximum stable current change&gt;
+    "nLastDiffs":
+        &lt;number of last differences found&gt;,
+    "nStableLastDiffs":
+        &lt;number of stable last
+        differences found&gt;,
+    "minStableLastDiff":
+        &lt;minimum stable last difference&gt;,
+    "avgStableLastDiff":
+        &lt;average stable last difference&gt;,
+    "maxStableLastDiff":
+        &lt;maximum stable last difference&gt;
 }
 </pre>
 </td>
diff --git a/doc/bmproto.html b/doc/bmproto.html
index ad1de5d..70178b5 100644
--- a/doc/bmproto.html
+++ b/doc/bmproto.html
@@ -278,12 +278,12 @@ the specified value.
 <br /><br />
 <b>Notes:</b>
 <ul>
-<li>The &lt;currChange&gt; tag is omitted altogether if no current change
+<li>The &lt;lastDiff&gt; tag is omitted altogether if no last difference
     exists (within the specified range).</li>
 <li>A non-negative value for the <i>index</i> attribute in the
-    &lt;currChange&gt; tag refers to the index (0 = first) of the current change
+    &lt;lastDiff&gt; tag refers to the index (0 = first) of the last difference
     in the list of &lt;result&gt; tags. (</b>A negative value means
-    that the list of &lt;result&gt; tags doesn't include the current change.)
+    that the list of &lt;result&gt; tags doesn't include the last difference.)
 <li>The &lt;lastValue&gt; tag is omitted altogether if no last value
     exists (i.e. the specified range is empty).</li>
 <li>The &lt;regression&gt; tag is omitted altogether if the regression
@@ -310,7 +310,7 @@ the specified value.
 <td>
 <pre>
 &lt;reply type="GetHistory"&gt;
-    &lt;currChange index="..." stable="..."
+    &lt;lastDiff index="..." stable="..."
         timestamp="..." sha1="..."
         value="..." /&gt;
     &lt;lastValue stable="..." /&gt;
@@ -356,7 +356,7 @@ Lists statistics and history for a specific benchmark in two branches.
 <td>
 <pre>
 &lt;reply type="GetHistory2"&gt;
-    &lt;currChange1 index="..." stable="..."
+    &lt;lastDiff1 index="..." stable="..."
         timestamp="..." sha1="..."
         value="..." /&gt;
     &lt;lastValue1 stable="..." /&gt;
@@ -366,7 +366,7 @@ Lists statistics and history for a specific benchmark in two branches.
     &lt;result1 timestamp="..." sha1="..."
         value="..." /&gt;
     ...
-    &lt;currChange2 index="..." stable="..."
+    &lt;lastDiff2 index="..." stable="..."
         timestamp="..." sha1="..."
         value="..." /&gt;
     &lt;lastValue2 stable="..." /&gt;
@@ -386,7 +386,7 @@ Lists statistics and history for a specific benchmark in two branches.
 <tr>
 <td class="requestName">GetRankedBenchmarks</td>
 <td class="requestDescr">
-Lists the benchmarks with the most important <i>current change</i> values.
+Lists the benchmarks with the most important <i>last difference</i> values.
 
 <br /><br />
 The snapshot range (timestamp{1..2} / sha1{1..2}) is defined as for the <span
@@ -394,7 +394,7 @@ class="requestName">GetHistory</span> request (see this).
 
 <br /><br />
 For each benchmark in the reply message, each of the &lt;lastValue&gt;,
-&lt;currChange&gt;, and &lt;regression&gt; is included only if the value
+&lt;lastDiff&gt;, and &lt;regression&gt; is included only if the value
 exists (or can be computed in the case of the regression).
 
 </td>
@@ -425,7 +425,7 @@ exists (or can be computed in the case of the regression).
         &lt;lastValue timestamp="..."
             sha1="..." value="..."
             stable="..." /&gt;
-        &lt;currChange timestamp="..."
+        &lt;lastDiff timestamp="..."
             sha1="..." value="..."
             stable="..." /&gt;
         &lt;regression a="..." b="..." /&gt;
@@ -506,11 +506,11 @@ Computes basic statistics for the benchmarks within a certain context.
 <pre>
 &lt;reply type="GetStats"&gt;
     &lt;stats nBenchmarks="..."
-        nCurrChanges="..."
-        nStableCurrChanges="..."
-        minStableCurrChange="..."
-        avgStableCurrChange="..."
-        maxStableCurrChange="..." /&gt;
+        nLastDiffs="..."
+        nStableLastDiffs="..."
+        minStableLastDiff="..."
+        avgStableLastDiff="..."
+        maxStableLastDiff="..." /&gt;
 &lt;/reply&gt;
 </pre>
 </td>
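
The definitions of difference tolerance, stability tolerance, last difference, and strong stability introduced in doc/bmclient.html above could be sketched as below. This is a minimal sketch under stated assumptions, not the BM implementation: all function names are hypothetical, and the difference tolerance is assumed to be compared directly against |nd(a, b)| (the patched documentation does not state whether it is a fraction or a percentage).

```python
from typing import Optional, Sequence


def nd(a: float, b: float) -> float:
    """Normalized difference as defined in the patched documentation."""
    return 0.0 if a == b else (a - b) / max(abs(a), abs(b))


def differs_significantly(a: float, b: float, diff_tol: float) -> bool:
    # Assumption: the tolerance is compared against |nd| as a plain fraction.
    return abs(nd(a, b)) > diff_tol


def is_stable(values: Sequence[float], i: int,
              diff_tol: float, stab_tol: int) -> bool:
    """True if values[i] is preceded contiguously by at least stab_tol
    values that do not differ significantly from it."""
    if i < stab_tol:
        return False  # not enough preceding values
    return all(not differs_significantly(values[j], values[i], diff_tol)
               for j in range(i - stab_tol, i))


def last_difference_index(values: Sequence[float],
                          diff_tol: float) -> Optional[int]:
    """Index of the most recent value that differs significantly from the
    last value, or None if no such value exists."""
    if not values:
        return None
    last = values[-1]
    for c in range(len(values) - 2, -1, -1):
        if differs_significantly(values[c], last, diff_tol):
            return c
    return None


def last_difference_strongly_stable(values: Sequence[float],
                                    diff_tol: float, stab_tol: int) -> bool:
    """True if both the last value and the last difference are stable."""
    c = last_difference_index(values, diff_tol)
    if c is None:
        return False
    return (is_stable(values, len(values) - 1, diff_tol, stab_tol)
            and is_stable(values, c, diff_tol, stab_tol))
```

For example, with values [100, 101, 100, 120, 121, 120], a difference tolerance of 0.05 and a stability tolerance of 2, the last difference in this sketch is the value at index 2, and nd(120, 100) is positive, which the documentation describes as typically indicating a regression.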