from dbaccess import execQuery
from statlib import stats
from string import hexdigits
import sys
import os
import re
import urllib

# Returns the text value stored for the given ID in the given lookup table,
# or an empty string if the ID is negative. Results are memoized in a global
# per-table cache.
def idToText(table, id_):
    global idToTextCache
    if not 'idToTextCache' in globals():
        idToTextCache = {}
    if not table in idToTextCache:
        idToTextCache[table] = {}
    if id_ in idToTextCache[table]:
        return idToTextCache[table][id_]
    text = (execQuery(
        "SELECT value FROM " + table + " WHERE id = %s",
        (id_,))[0][0] if (id_ >= 0) else "")
    idToTextCache[table][id_] = text
    return text

# Returns the ID of the given text value in the given lookup table, or -1 if
# the value does not exist. Results are memoized in a global per-table cache.
def textToId(table, text):
    global textToIdCache
    if not 'textToIdCache' in globals():
        textToIdCache = {}
    if not table in textToIdCache:
        textToIdCache[table] = {}
    if text in textToIdCache[table]:
        return textToIdCache[table][text]
    res = execQuery("SELECT id FROM " + table + " WHERE value = %s", (text,))
    id_ = res[0][0] if (len(res) > 0) else -1
    textToIdCache[table][text] = id_
    return id_

# Returns the lowerIsBetter property of the given metric (cached).
# Maybe also rename to lowerIsBetter() ? (but note that a global function with
# that name already exists in uploadresults.py)
def metricIdToLowerIsBetter(metric_id):
    global metricIdToLIBCache
    if not 'metricIdToLIBCache' in globals():
        metricIdToLIBCache = {}
    if metric_id in metricIdToLIBCache:
        return metricIdToLIBCache[metric_id]
    lib = execQuery(
        "SELECT lowerIsBetter FROM metric WHERE id = %s",
        (metric_id,))[0][0]
    metricIdToLIBCache[metric_id] = lib
    return lib

# Returns the non-negative ID of the given context, or -1 if not found.
def getContext(host_id, platform_id, branch_id, sha1_id):
    result = execQuery(
        "SELECT id FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND sha1Id = %s"
        " LIMIT 1", (host_id, platform_id, branch_id, sha1_id))
    if len(result):
        return result[0][0]
    return -1
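
# Example (illustrative only; assumes value tables named "host", "platform",
# "branch" and "sha1", matching the column names of the context table):
#
#   host_id     = textToId("host", "linux-host-1")
#   platform_id = textToId("platform", "linux-g++")
#   branch_id   = textToId("branch", "master")
#   sha1_id     = textToId("sha1", "0123456789012345678901234567890123456789")
#   context_id  = getContext(host_id, platform_id, branch_id, sha1_id)
#   # context_id is -1 if no results were uploaded for this combination.
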
# Returns the test case, test function, and data tag components of
# a benchmark of the form <test case>:<test function>(<data tag>).
# The test case and test function components may not contain whitespace,
# or a ':', '(', or ')' character.
# The data tag component will be stripped on return, but may be empty.
def benchmarkToComponents(benchmark):
    p = re.compile(r"^([^:\s\(\)]+):([^:\s\(\)]+)\((.*)\)$")
    m = p.match(benchmark)
    if m:
        return m.group(1), m.group(2), m.group(3).strip()
    else:
        raise BaseException("invalid benchmark syntax: >" + benchmark + "<")

# Returns the timestamp associated with a particular context ID. This will
# be the UTC timestamp of the earliest upload for this context.
def getTimestampFromContext(context_id):
    return execQuery(
        "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context WHERE id = %s",
        (context_id,))[0][0]

# Finds snapshots that match a host/platform/branch combination and that
# lie within the range
#   [sha11, sha12] if sha12_id >= 0, or
#   [sha11, +inf)  if sha12_id < 0.
# Returns a chronologically ordered n-tuple of 2-tuples:
# (sha1, first upload timestamp).
def getSnapshots(host_id, platform_id, branch_id, sha11_id, sha12_id):

    timestamp1 = execQuery(
        "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND sha1Id = %s",
        (host_id, platform_id, branch_id, sha11_id))[0][0]

    if sha12_id >= 0:
        timestamp2 = execQuery(
            "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context"
            " WHERE hostId = %s"
            " AND platformId = %s"
            " AND branchId = %s"
            " AND sha1Id = %s",
            (host_id, platform_id, branch_id, sha12_id))[0][0]
        # Ensure chronological order:
        if timestamp1 > timestamp2:
            timestamp1, timestamp2 = timestamp2, timestamp1
        range_expr = "BETWEEN %d AND %d" % (timestamp1, timestamp2)
    else:
        range_expr = ">= %d" % timestamp1

    # Each distinct SHA-1 that occurs for this host/platform/branch
    # combination may occur multiple times with different upload times.
    # Get the list of distinct SHA-1 IDs along with the earliest upload
    # time for each one (note: for simplicity, we assume that the
    # order of the uploadId attributes is consistent with the order
    # of their corresponding startTime attributes):
    snapshots = execQuery(
        "SELECT sha1Id, EXTRACT(EPOCH FROM timestamp)::INT AS firstUploadTime"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND EXTRACT(EPOCH FROM timestamp)::INT " + range_expr +
        " ORDER BY timestamp ASC",
        (host_id, platform_id, branch_id))

    return tuple(snapshots)

# Finds all snapshots matching a host/platform/branch combination.
# Returns a chronologically (except when reverse is True) ordered n-tuple
# of 2-tuples: (sha1, first upload timestamp).
def getAllSnapshots(host_id, platform_id, branch_id, reverse = False):
    # Each distinct SHA-1 that occurs for this host/platform/branch
    # combination may occur multiple times with different upload times.
    # Get the list of distinct SHA-1 IDs along with the earliest upload
    # time for each one (note: for simplicity, we assume that the
    # order of the uploadId attributes is consistent with the order
    # of their corresponding startTime attributes):
    snapshots = execQuery(
        "SELECT sha1Id, EXTRACT(EPOCH FROM timestamp)::INT AS firstUploadTime"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " ORDER BY timestamp " + ("DESC" if reverse else "ASC"),
        (host_id, platform_id, branch_id))
    return tuple(snapshots)
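
# Example (illustrative): each snapshot returned by getSnapshots() /
# getAllSnapshots() is a (sha1Id, firstUploadTime) pair in upload order,
# so a typical consumer looks like:
#
#   snapshots = getAllSnapshots(host_id, platform_id, branch_id)
#   for sha1_id, timestamp in snapshots:
#       context_id = getContext(host_id, platform_id, branch_id, sha1_id)
#       ...
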
# Returns the (SHA-1 ID, timestamp) pair associated with the most recent
# rankings computed for the given host/platform/branch combination, or
# (-1, -1) if no match is found.
def getLastRankingSnapshot(host_id, platform_id, branch_id):
    result = execQuery(
        "SELECT matchingcontext.sha1id, EXTRACT(EPOCH FROM timestamp)::INT"
        " FROM ranking,"
        " (SELECT id, sha1Id, timestamp"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s) AS matchingContext"
        " WHERE context2Id = matchingContext.id"
        " ORDER BY timestamp DESC LIMIT 1",
        (host_id, platform_id, branch_id))
    if len(result):
        return result[0]
    return -1, -1

# For the given host/platform/branch combination, this function returns
# all contexts for which rankings exist. The return value is a list of
# (context ID, timestamp) pairs sorted in descending order on timestamp
# (latest timestamp first).
def getRankingContexts(host_id, platform_id, branch_id):
    result = execQuery(
        "SELECT DISTINCT matchingcontext.id,"
        " EXTRACT(EPOCH FROM timestamp)::INT AS etimestamp"
        " FROM ranking,"
        " (SELECT id, sha1Id, timestamp"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s) AS matchingContext"
        " WHERE context2Id = matchingContext.id"
        " ORDER BY etimestamp DESC",
        (host_id, platform_id, branch_id))
    return result
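
# Example (illustrative): a caller that only needs to know whether rankings
# exist for a combination, and for which snapshot, can do:
#
#   sha1_id, timestamp = getLastRankingSnapshot(host_id, platform_id, branch_id)
#   if sha1_id < 0:
#       ...  # no rankings have been computed for this combination yet
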
# Retrieves the time series of valid median results for the given
# benchmark/metric combination. Only the part of the time series that
# is within the selected snapshot interval is considered.
#
# Returns a 7-tuple:
#
# Component 1: An n-tuple of 6-tuples:
#
#   (
#     <snapshot index>,
#     <median of the valid, positive observations, or -1 if there are none>,
#     <sample size, i.e. the total number of observations in the snapshot>,
#     (<valid and positive sample size> > 1) and (mean > 0)
#       ? <relative standard error of the mean> ( in [0, 1> )
#       : -1,
#     <number of invalid observations>,
#     <number of non-positive observations>
#   )
#
# Component 2: Total number of invalid observations.
# Component 3: Total number of non-positive observations.
# Component 4: Median of all valid (i.e. non-negative) relative
#              standard errors (note: there is one RSE (valid or not)
#              per snapshot).
# Component 5: Relative standard error of all valid observation medians
#              (note: there is zero or one such median per snapshot).
# Component 6: Missing snapshots, i.e. the number of candidate snapshots
#              that are missing in the time series.
# Component 7: Last snapshot distance, i.e. the gap between the last snapshot
#              in the time series and the last candidate snapshot.
#
def getTimeSeries(
        host_id, platform_id, branch_id, snapshots, benchmark_id, metric_id):

    contexts = []
    for sha1_id, timestamp in snapshots:
        contexts.append(getContext(host_id, platform_id, branch_id, sha1_id))

    # Fetch raw values:
    assert len(contexts) > 0
    raw_values = (execQuery(
        "SELECT value, valid, contextId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")" +
        " AND benchmarkId = %s"
        " AND metricId = %s"
        " ORDER BY contextId",
        tuple(contexts) + (benchmark_id, metric_id))
        + [(-1, -1, -1)]) # Note sentinel item

    # Compute per-sample stats:
    curr_context_id = -1
    sample = []
    valid_and_positive_sample = []
    valid_median_obs = []
    ninvalid = 0
    tot_ninvalid = 0
    nzeros = 0
    tot_nzeros = 0
    rses = []
    tsitem_map = {}

    # Loop over all observations (which are grouped on sample;
    # note the 1-1 correspondence between samples and contexts):
    for obs, valid, context_id in raw_values:

        if context_id != curr_context_id:
            # A new sample has been collected, so register it and
            # prepare for the next one:
            sample_size = len(sample)
            valid_and_positive_sample_size = len(valid_and_positive_sample)
            median_obs = -1
            nrse = -1
            if valid_and_positive_sample_size > 0:
                median_obs = stats.medianscore(valid_and_positive_sample)
                valid_median_obs.append(median_obs)
                if valid_and_positive_sample_size > 1:
                    try:
                        nrse = (
                            stats.sem(valid_and_positive_sample)
                            / float(stats.mean(valid_and_positive_sample)))
                        rses.append(100 * nrse)
                    except ZeroDivisionError:
                        pass
            tsitem_map[curr_context_id] = (
                median_obs, sample_size, nrse, ninvalid, nzeros)
            sample = []
            valid_and_positive_sample = []
            tot_ninvalid = tot_ninvalid + ninvalid
            ninvalid = 0
            tot_nzeros = tot_nzeros + nzeros
            nzeros = 0
            curr_context_id = context_id

        # Append observation to current sample:
        sample.append(obs)
        if valid:
            if obs > 0:
                valid_and_positive_sample.append(obs)
        else:
            ninvalid = ninvalid + 1
        if obs <= 0:
            nzeros = nzeros + 1

    # Order chronologically:
    ts = []
    index = 0
    for context in contexts:
        if context in tsitem_map:
            tsitem = tsitem_map[context]
            ts.append(
                (index, tsitem[0], tsitem[1], tsitem[2], tsitem[3],
                 tsitem[4]))
        index = index + 1

    # Compute median of RSEs:
    if len(rses) > 0:
        median_of_rses = stats.medianscore(rses)
    else:
        median_of_rses = -1

    # Compute RSE of valid median observations:
    if len(valid_median_obs) > 1:
        try:
            rse_of_medians = 100 * (
                stats.sem(valid_median_obs)
                / float(stats.mean(valid_median_obs)))
        except ZeroDivisionError:
            rse_of_medians = -1
    else:
        rse_of_medians = -1

    ms = len(contexts) - len(ts)
    if len(ts) > 0:
        lsd = (len(contexts) - 1) - ts[-1][0]
    else:
        lsd = -1

    return (
        tuple(ts), tot_ninvalid, tot_nzeros, median_of_rses, rse_of_medians,
        ms, lsd)

# Returns the factor by which val improves over base_val by taking the
# lower_is_better property into consideration.
# Example: base_val = 10 and val = 20 results in 0.5 if lower_is_better is true,
# and 2 if lower_is_better is false.
def metricAdjustedRatio(base_val, val, lower_is_better):
    #assert val1 > 0
    #assert val2 > 0
    return (float(base_val) / val) if lower_is_better else (float(val) / base_val)
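
# Worked example (illustrative numbers): with lower_is_better == True,
# metricAdjustedRatio(100.0, 80.0, True) == 1.25, i.e. dropping from 100 to 80
# is a 1.25x improvement; with lower_is_better == False the same pair yields
# 0.8, i.e. a regression. getChanges() below uses this ratio to decide whether
# a change is significant.
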
# Locates (significant) changes in a time series.
# Whether a change is significant depends on the difftol argument.
# Only positive values are considered.
#
# The output is an n-tuple of 7-tuples, one per change:
#
# 1: Base index, i.e. the index in the time series that contains the base
#    value used to compute the change.
# 2: Change index, i.e. the index in the time series at which the change
#    occurs. The value at this index is called the change value.
#    NOTE: The change index of change i becomes the base index of
#    change i + 1.
# 3: Metric-adjusted change ratio, i.e. the factor by which the change
#    value improves over the base value.
# 4: Global separation score. This measures how well all values before
#    the change index are separated from the change value with respect to
#    the base value.
#    The score ranges from 0 (poor separation) to 1 (good separation).
#    If all target values lie on the far side of the base value
#    (as seen from the change value), the score is 1.
#    If at least one target value lies on the far side of the change value
#    (as seen from the base value), the score is 0.
#    Otherwise, the score depends on the value that is furthest away
#    from the base value in the direction of the change value.
#
# DEFINITIONS:
# - Segment S1: The values in range [base index, change index>.
# - Segment S2: The values in range [change index, next change index>.
#   (Or the end of the time series if there is no next change index.)
#
# 5: Local separation score. A variant of the global separation score that
#    measures how well the values in segment S1 and segment S2 are separated
#    from each other.
#    The score ranges from 0 (low separation) to 1 (high separation).
#
# 6: Duration score for S1. This measures the number of values in S1
#    with respect to durtolmin and durtolmax.
#    The score ranges from 0 (few values; low duration) to
#    1 (many values; high duration).
#    The score is 0 if S1 contains fewer than durtolmin values.
#    The score is 1 if S1 contains at least durtolmax values.
#    Otherwise, the score depends on the number of values in S1.
#
# 7: Duration score for S2. (Ditto)
#
def getChanges(time_series, lower_is_better, difftol, durtolmin, durtolmax):

    if len(time_series) == 0:
        return ()

    # Define the difference tolerance range.
    # Ratios within [lo, hi] are considered insignificant.
    assert difftol >= 1
    hi = difftol
    lo = 1.0 / difftol

    values = zip(*time_series)[1]

    # Use the first positive value as the first base:
    base_pos = -1
    base_val = -1
    for i in range(0, len(values)):
        base_val = values[i]
        if base_val > 0:
            base_pos = i
            break
    if base_pos == -1:
        return () # No positive values found!

    # Initialize global extrema:
    gmin, gmax = values[base_pos], values[base_pos]

    segments = []
    base_ratio = -1
    prev_gmin = -1
    prev_gmax = -1

    # Compute the segments (Note: The segments are divided by the changes,
    # so if no changes exist, there will be only one segment)
    while True:
        # Initialize stats for current segment:
        lmin, lmax = values[base_pos], values[base_pos] # Local extrema
        pvals = 1 # Number of positive values

        # Scan to next change if any:
        change_found = False
        for pos in range(base_pos + 1, len(values)):
            val = values[pos]
            if val > 0:
                # The value is positive, so this is a potential change.
                # Compute the local change as the metric-adjusted ratio:
                ratio = metricAdjustedRatio(base_val, val, lower_is_better)

                # Check if change is significant:
                if (ratio > hi) or (ratio < lo):
                    # Finalize current segment and prepare for next change:
                    segments.append(
                        (base_pos, base_ratio, prev_gmin, prev_gmax, lmin,
                         lmax, pvals))
                    base_pos = pos
                    base_val = val
                    base_ratio = ratio
                    prev_gmin = gmin
                    prev_gmax = gmax
                    change_found = True
                else:
                    # Update stats for current segment:
                    if val < lmin:
                        lmin = val
                    elif val > lmax:
                        lmax = val
                    pvals = pvals + 1

                # Update global extrema:
                if val < gmin:
                    gmin = val
                elif val > gmax:
                    gmax = val

            if change_found:
                break
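
        # At this point the scan has either found a significant change
        # (change_found == True, and base_pos/base_val/base_ratio now refer
        # to the change value), or it has reached the end of the series.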
        if not change_found:
            # No next change was found, so finalize the current segment
            # as the last segment ...
            segments.append(
                (base_pos, base_ratio, prev_gmin, prev_gmax, lmin, lmax,
                 pvals))
            break

    # Compute the changes ...
    changes = []
    for i in range(1, len(segments)):
        s1 = segments[i - 1]
        s2 = segments[i]
        base_pos = s1[0]
        change_pos = s2[0]
        ratio = s2[1]

        # Compute global and local separation scores:
        base_val = values[base_pos]
        change_val = values[change_pos]
        gmin = s2[2]
        gmax = s2[3]
        lmin1 = s1[4]
        lmax1 = s1[5]
        lmin2 = s2[4]
        lmax2 = s2[5]
        if change_val < base_val:
            gsep_score = (gmin - change_val) / float(base_val - change_val)
            lsep_score = (lmin1 - lmax2) / float(base_val - change_val)
        else:
            gsep_score = (change_val - gmax) / float(change_val - base_val)
            lsep_score = (lmin2 - lmax1) / float(change_val - base_val)
        gsep_score = min(max(gsep_score, 0), 1)
        lsep_score = min(max(lsep_score, 0), 1)

        # Compute duration scores:
        pvals1 = s1[6]
        pvals2 = s2[6]
        assert durtolmin > 0
        assert durtolmin <= durtolmax
        sfact = 1.0 / (durtolmax - (durtolmin - 1))
        dur_score1 = (pvals1 - (durtolmin - 1)) * sfact
        dur_score2 = (pvals2 - (durtolmin - 1)) * sfact
        dur_score1 = min(max(dur_score1, 0), 1)
        dur_score2 = min(max(dur_score2, 0), 1)

        changes.append(
            (base_pos, change_pos, ratio, gsep_score, lsep_score, dur_score1,
             dur_score2))

    return tuple(changes)
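
# Worked example (illustrative numbers): with durtolmin = 3 and durtolmax = 7,
# sfact is 1.0 / (7 - 2) = 0.2, so a segment containing 5 positive values gets
# a duration score of (5 - 2) * 0.2 = 0.6; 2 or fewer values score 0, and 7 or
# more score 1. The separation scores are clamped to [0, 1] in the same way.
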
# Records statistics about the last (i.e. most recent) change in the given
# time series in the stats dictionary (all values are -1 if there are no
# changes).
def getTimeSeriesMiscStats(time_series, changes, snapshots, stats):
    if len(changes) > 0:
        stats["lc"] = changes[-1][2]
        lc_ts_pos = changes[-1][1]
        lc_ss_pos = time_series[lc_ts_pos][0]
        stats["lc_timestamp"] = snapshots[lc_ss_pos][1]
        stats["lc_distance"] = (len(snapshots) - 1) - lc_ss_pos
        stats["lc_gsep_score"] = changes[-1][3]
        stats["lc_lsep_score"] = changes[-1][4]
        stats["lc_dur1_score"] = changes[-1][5]
        stats["lc_dur2_score"] = changes[-1][6]
    else:
        stats["lc"] = -1
        stats["lc_timestamp"] = -1
        stats["lc_distance"] = -1
        stats["lc_gsep_score"] = -1
        stats["lc_lsep_score"] = -1
        stats["lc_dur1_score"] = -1
        stats["lc_dur2_score"] = -1

# Computes per-benchmark time series statistics.
#
# The snapshots argument must be a sequence of (sha1Id, firstUploadTime)
# pairs as returned by getSnapshots() / getAllSnapshots(). test_case_filter,
# if not None, restricts the result to benchmarks whose test case occurs in
# the filter. difftol, durtolmin and durtolmax are passed on to getChanges().
# progress_func, if given, is called as progress_func(percent, progress_arg)
# to report progress.
#
# Returns a tuple of dictionaries, one per benchmark/metric combination.
#
def getBMTimeSeriesStatsList(
        host_id, platform_id, branch_id, snapshots, test_case_filter,
        difftol, durtolmin, durtolmax, progress_func = None,
        progress_arg = None):

    if progress_func != None:
        progress_func(0.0, progress_arg)

    contexts = []
    for sha1_id, timestamp in snapshots:
        contexts.append(getContext(host_id, platform_id, branch_id, sha1_id))

    # Get all distinct benchmark/metric combinations that match the
    # host/platform/branch context and are within the selected snapshot
    # interval. Each such combination corresponds to a time series.
    assert len(contexts) > 0
    bmark_metrics = execQuery(
        "SELECT DISTINCT benchmarkId, metricId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")",
        contexts)

    bmstats_list = []

    # Loop over time series:
    if progress_func != None:
        i = 0
    #for benchmark_id, metric_id in bmark_metrics[800:810]:
    for benchmark_id, metric_id in bmark_metrics:

        benchmark = idToText("benchmark", benchmark_id)
        #if benchmark != "tst_qmetaobject:indexOfMethod(_q_columnsAboutToBeRemoved(QModelIndex,int,int))":
        #    continue
        test_case, test_function, data_tag = (
            benchmarkToComponents(benchmark))

        # Skip this time series if it doesn't match the test case filter:
        if ((test_case_filter != None)
            and (not test_case in test_case_filter)):
            continue

        # Get the content and some basic stats of the time series:
        (time_series, tot_ninvalid, tot_nzeros, median_of_rses,
         rse_of_medians, ms, lsd) = getTimeSeries(
            host_id, platform_id, branch_id, snapshots, benchmark_id,
            metric_id)

        # Extract the significant changes:
        changes = getChanges(
            time_series, metricIdToLowerIsBetter(metric_id), difftol,
            durtolmin, durtolmax)

        stats = {}
        stats["benchmark"] = benchmark
        stats["benchmark_id"] = benchmark_id
        stats["metric"] = idToText("metric", metric_id)
        stats["metric_id"] = metric_id
        stats["lib"] = metricIdToLowerIsBetter(metric_id)
        stats["ms"] = ms
        stats["lsd"] = lsd
        stats["ni"] = tot_ninvalid
        stats["nz"] = tot_nzeros
        stats["nc"] = len(changes)
        stats["med_of_rses"] = median_of_rses
        stats["rse_of_meds"] = rse_of_medians
        getTimeSeriesMiscStats(time_series, changes, snapshots, stats)

        bmstats_list.append(stats)

        if progress_func != None:
            i = i + 1
            divisor = len(bmark_metrics) // 100 # report at most 100 times
            if (divisor > 0) and ((i % divisor) == 0):
                perc_done = (i / float(len(bmark_metrics))) * 100.0
                progress_func(perc_done, progress_arg)

    if progress_func != None:
        progress_func(100.0, progress_arg)

    return tuple(bmstats_list)

# Returns True iff s is a valid SHA-1 string.
def isValidSHA1(s):
    def containsOnlyHexDigits(s):
        return 0 not in [c in hexdigits for c in s]
    return (len(s) == 40) and containsOnlyHexDigits(s)

def printJSONHeader():
    print "Content-type: text/json\n"

def printErrorAsJSON(error):
    printJSONHeader()
    print "{\"error\": \"" + error + "\"}\n"

# Returns a 2-tuple consisting of:
# 1: an option dictionary, and
# 2: a flag that is true iff the QUERY_STRING environment variable is
#    present (i.e. that the script is invoked as a CGI script for an
#    HTTP GET request).
#
# The option dictionary is extracted from either the QUERY_STRING environment
# variable (first priority) or command-line arguments (second priority).
# In the latter case, the options must be of the form
# --<option 1> <value 1> ... --<option n> <value n>.
def getOptions():
    def getOptDictFromQueryString(qs):
        options = {}
        for sq in qs.split("&"):
            keyval = sq.split("=")
            options[keyval[0]] = urllib.unquote(keyval[1])
        return options

    def getOptDictFromCommandLine():
        options = {}
        p = re.compile("^--(.+)$")
        key = None
        for arg in sys.argv[1:]:
            if key != None:
                options[key] = arg
            m = p.match(arg)
            if m:
                key = m.group(1)
                # Support "--help" as the only value-less option:
                if key == "help":
                    options[key] = 1
                    key = None
            else:
                key = None
        return options

    qs = "QUERY_STRING"
    if qs in os.environ:
        return (getOptDictFromQueryString(os.environ[qs]), True)
    else:
        return (getOptDictFromCommandLine(), False)
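
# Example (illustrative; the script name and option names are hypothetical):
# the same invocation expressed as a CGI query string and as command-line
# arguments.
#
#   QUERY_STRING="host=linux-host-1&branch=master"
#       -> ({"host": "linux-host-1", "branch": "master"}, True)
#   ./script.py --host linux-host-1 --branch master
#       -> ({"host": "linux-host-1", "branch": "master"}, False)
#
#   options, from_query_string = getOptions()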