from dbaccess import execQuery
from statlib import stats
from string import hexdigits
import sys
import os
import re
import urllib


# Returns the text value associated with the given ID in the given table,
# or "" if the ID is negative. Results are memoized in a global cache.
def idToText(table, id_):
    global idToTextCache
    if not 'idToTextCache' in globals():
        idToTextCache = {}
    if not table in idToTextCache:
        idToTextCache[table] = {}
    if id_ in idToTextCache[table]:
        return idToTextCache[table][id_]
    text = (execQuery(
        "SELECT value FROM " + table + " WHERE id = %s",
        (id_,))[0][0] if (id_ >= 0) else "")
    idToTextCache[table][id_] = text
    return text


# Returns the ID associated with the given text value in the given table,
# or -1 if not found. Results are memoized in a global cache.
def textToId(table, text):
    global textToIdCache
    if not 'textToIdCache' in globals():
        textToIdCache = {}
    if not table in textToIdCache:
        textToIdCache[table] = {}
    if text in textToIdCache[table]:
        return textToIdCache[table][text]
    res = execQuery("SELECT id FROM " + table + " WHERE value = %s", (text,))
    id_ = res[0][0] if (len(res) > 0) else -1
    textToIdCache[table][text] = id_
    return id_


# Returns the ID of the row in 'table' whose value column equals 'value',
# inserting such a row first if necessary. Additional column name/value
# pairs for the insert may be passed as extra arguments.
def findOrInsertId(table, value, *args):
    query_result = execQuery(
        "SELECT id FROM " + table + " WHERE value = %s", (value,))
    if len(query_result) == 1:
        # Found, so return ID:
        return query_result[0][0]
    # Not found, so insert:
    query = "INSERT INTO " + table + " (value"
    for i in range(0, len(args), 2):
        query += ", " + args[i]
    query += ") VALUES (%s"
    values = [value]
    for i in range(0, len(args), 2):
        query += ", %s"
        values.append(args[i + 1])
    # ... and retrieve ID:
    query += ") RETURNING id"
    query_result = execQuery(query, values)
    assert len(query_result) == 1
    return query_result[0][0]


# Returns the lowerIsBetter property of the given metric, memoized in a
# global cache.
# Maybe also rename to lowerIsBetter()? (But note that a global function
# with that name already exists in uploadresults.py.)
def metricIdToLowerIsBetter(metric_id):
    global metricIdToLIBCache
    if not 'metricIdToLIBCache' in globals():
        metricIdToLIBCache = {}
    if metric_id in metricIdToLIBCache:
        return metricIdToLIBCache[metric_id]
    lib = execQuery(
        "SELECT lowerIsBetter FROM metric WHERE id = %s",
        (metric_id,))[0][0]
    metricIdToLIBCache[metric_id] = lib
    return lib


# Returns the non-negative ID of the given context, or -1 if not found.
def getContext(host_id, platform_id, branch_id, sha1_id):
    global contextIdCache
    if not 'contextIdCache' in globals():
        contextIdCache = {}
    if (host_id, platform_id, branch_id, sha1_id) in contextIdCache:
        return contextIdCache[host_id, platform_id, branch_id, sha1_id]
    result = execQuery(
        "SELECT id FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND sha1Id = %s"
        " LIMIT 1",
        (host_id, platform_id, branch_id, sha1_id))
    result = result[0][0] if len(result) else -1
    contextIdCache[host_id, platform_id, branch_id, sha1_id] = result
    return result


# Returns the test case, test function, and data tag components of a
# benchmark of the form <testCase>:<testFunction>(<dataTag>). The test case
# and test function components may not contain whitespace, or a ':', '(',
# or ')' character. The data tag component will be stripped on return, but
# may be empty.
def benchmarkToComponents(benchmark):
    p = re.compile(r"^([^:\s\(\)]+):([^:\s\(\)]+)\((.*)\)$")
    m = p.match(benchmark)
    if m:
        return m.group(1), m.group(2), m.group(3).strip()
    else:
        raise ValueError("invalid benchmark syntax: >" + benchmark + "<")


# Returns the timestamp associated with a particular context ID. This will
# be the UTC timestamp of the earliest upload for this context.
def getTimestampFromContext(context_id):
    return execQuery(
        "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context WHERE id = %s",
        (context_id,))[0][0]


# Finds snapshots that match a host/platform/branch combination and that
# lie within the range
#   [sha11, sha12] if both sha11_id and sha12_id are >= 0, or
#   ( -inf, sha12] if only sha11_id is < 0, or
#   [sha11, +inf ) if only sha12_id is < 0, or
#   ( -inf, +inf ) if both sha11_id and sha12_id are < 0.
#
# Returns a chronologically ordered n-tuple of 2-tuples:
# (sha1, first upload timestamp).
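#
# For illustration (the IDs below are hypothetical; the rows returned
# depend on the database contents):
#
#   getSnapshots(1, 2, 3, -1, -1)  # full history for host 1 / platform 2
#                                  # / branch 3
#   getSnapshots(1, 2, 3, 10, 20)  # only snapshots uploaded between the
#                                  # uploads of SHA-1 IDs 10 and 20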
def getSnapshots(host_id, platform_id, branch_id, sha11_id, sha12_id):
    timestamp1 = (execQuery(
        "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND sha1Id = %s",
        (host_id, platform_id, branch_id, sha11_id))[0][0]
        if (sha11_id >= 0) else -1)
    timestamp2 = (execQuery(
        "SELECT EXTRACT(EPOCH FROM timestamp)::INT FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " AND sha1Id = %s",
        (host_id, platform_id, branch_id, sha12_id))[0][0]
        if (sha12_id >= 0) else -1)
    prefix = "AND EXTRACT(EPOCH FROM timestamp)::INT"
    if (timestamp1 == -1) and (timestamp2 == -1):
        range_expr = ""
    elif (timestamp1 >= 0) and (timestamp2 < 0):
        range_expr = ("%s >= %d" % (prefix, timestamp1))
    elif (timestamp1 < 0) and (timestamp2 >= 0):
        range_expr = ("%s <= %d" % (prefix, timestamp2))
    else:
        # Ensure chronological order:
        if timestamp1 > timestamp2:
            timestamp1, timestamp2 = timestamp2, timestamp1
        range_expr = (
            "%s BETWEEN %d AND %d" % (prefix, timestamp1, timestamp2))
    # Each distinct SHA-1 that occurs for this host/platform/branch
    # combination may occur multiple times with different upload times.
    # Get the list of distinct SHA-1 IDs along with the earliest upload
    # time for each one:
    snapshots = execQuery(
        "SELECT sha1Id, EXTRACT(EPOCH FROM timestamp)::INT AS firstUploadTime"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s " + range_expr +
        " ORDER BY timestamp ASC",
        (host_id, platform_id, branch_id))
    return tuple(snapshots)


# Finds all snapshots matching a host/platform/branch combination.
# Returns a chronologically (except when reverse is True) ordered n-tuple
# of 2-tuples: (sha1, first upload timestamp).
def getAllSnapshots(host_id, platform_id, branch_id, reverse = False):
    # Each distinct SHA-1 that occurs for this host/platform/branch
    # combination may occur multiple times with different upload times.
    # Get the list of distinct SHA-1 IDs along with the earliest upload
    # time for each one:
    snapshots = execQuery(
        "SELECT sha1Id, EXTRACT(EPOCH FROM timestamp)::INT AS firstUploadTime"
        " FROM context"
        " WHERE hostId = %s"
        " AND platformId = %s"
        " AND branchId = %s"
        " ORDER BY timestamp " + ("DESC" if reverse else "ASC"),
        (host_id, platform_id, branch_id))
    return tuple(snapshots)


# Retrieves the time series + additional stats of valid median results for
# the given benchmark/metric combination. Only the part of the time series
# that is within the selected snapshot interval is considered.
#
# Returns a 7-tuple:
#
# Component 1: An n-tuple of 6-tuples:
#
#   ( <snapshot index>,
#     <median of the valid, positive observations, or -1 if none>,
#     <sample size>,
#     <(sample size > 1) and (mean > 0)
#        ? <relative standard error ( [0, 1> )>
#        : -1>,
#     <number of invalid observations>,
#     <number of non-positive observations>
#   )
#
# Component 2: Total number of invalid observations.
# Component 3: Total number of non-positive observations.
# Component 4: Median of all valid (i.e. non-negative) relative
#              standard errors (note: there is one RSE (valid or not)
#              per snapshot).
# Component 5: Relative standard error of all valid observation medians
#              (note: there is zero or one such median per snapshot).
# Component 6: Missing snapshots, i.e. the number of candidate snapshots
#              that are missing in the time series.
# Component 7: Last snapshot distance, i.e. the gap between the last snapshot
#              in the time series and the last candidate snapshot.
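#
# For illustration, a hypothetical return value (not taken from a real
# database) could look like:
#
#   ( ((0, 124.0, 5, 0.01, 0, 0), (1, 125.5, 5, 0.02, 0, 0)),  # component 1
#     0,    # total invalid observations
#     0,    # total non-positive observations
#     1.5,  # median of the per-snapshot RSEs (in percent)
#     0.8,  # RSE of the per-snapshot medians (in percent)
#     0,    # missing snapshots
#     0 )   # last snapshot distance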
#
def getTimeSeries(
        host_id, platform_id, branch_id, snapshots, benchmark_id, metric_id):
    contexts = []
    for sha1_id, timestamp in snapshots:
        contexts.append(getContext(host_id, platform_id, branch_id, sha1_id))

    # Fetch raw values:
    assert len(contexts) > 0
    raw_values = (execQuery(
        "SELECT value, valid, contextId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")" +
        " AND benchmarkId = %s"
        " AND metricId = %s"
        " ORDER BY contextId",
        tuple(contexts) + (benchmark_id, metric_id))
        + [(-1, -1, -1)])  # Note sentinel item

    # Compute per-sample stats:
    curr_context_id = -1
    sample = []
    valid_and_positive_sample = []
    valid_median_obs = []
    ninvalid = 0
    tot_ninvalid = 0
    nzeros = 0
    tot_nzeros = 0
    rses = []
    tsitem_map = {}
    # Loop over all observations (which are grouped on sample;
    # note the 1-1 correspondence between samples and contexts):
    for obs, valid, context_id in raw_values:
        if context_id != curr_context_id:
            # A new sample has been collected, so register it and
            # prepare for the next one:
            sample_size = len(sample)
            valid_and_positive_sample_size = len(valid_and_positive_sample)
            median_obs = -1
            nrse = -1
            if valid_and_positive_sample_size > 0:
                median_obs = stats.medianscore(valid_and_positive_sample)
                valid_median_obs.append(median_obs)
                if valid_and_positive_sample_size > 1:
                    try:
                        nrse = (
                            stats.sem(valid_and_positive_sample)
                            / float(stats.mean(valid_and_positive_sample)))
                        rses.append(100 * nrse)
                    except ZeroDivisionError:
                        pass
            tsitem_map[curr_context_id] = (
                median_obs, sample_size, nrse, ninvalid, nzeros)
            sample = []
            valid_and_positive_sample = []
            tot_ninvalid = tot_ninvalid + ninvalid
            ninvalid = 0
            tot_nzeros = tot_nzeros + nzeros
            nzeros = 0
            curr_context_id = context_id
        # Append observation to current sample:
        sample.append(obs)
        if valid:
            if obs > 0:
                valid_and_positive_sample.append(obs)
        else:
            ninvalid = ninvalid + 1
        if obs <= 0:
            nzeros = nzeros + 1

    # Order chronologically:
    ts = []
    index = 0
    for context in contexts:
        if context in tsitem_map:
            tsitem = tsitem_map[context]
            ts.append(
                (index, tsitem[0], tsitem[1], tsitem[2], tsitem[3],
                 tsitem[4]))
        index = index + 1

    # Compute median of RSEs:
    if len(rses) > 0:
        median_of_rses = stats.medianscore(rses)
    else:
        median_of_rses = -1

    # Compute RSE of valid median observations:
    if len(valid_median_obs) > 1:
        try:
            rse_of_medians = 100 * (
                stats.sem(valid_median_obs)
                / float(stats.mean(valid_median_obs)))
        except ZeroDivisionError:
            rse_of_medians = -1
    else:
        rse_of_medians = -1

    ms = len(contexts) - len(ts)
    if len(ts) > 0:
        lsd = (len(contexts) - 1) - ts[-1][0]
    else:
        lsd = -1

    return (
        tuple(ts), tot_ninvalid, tot_nzeros, median_of_rses, rse_of_medians,
        ms, lsd)


# Retrieves the time series of valid median results for the given
# benchmark/metric combination within the given contexts.
# Only the part of the time series that is within the selected snapshot
# interval is considered.
#
# Returns an n-tuple of 2-tuples:
#
#   ( <snapshot index>, <median of the valid, positive observations> )
#
def getBasicTimeSeries(contexts, benchmark_id, metric_id):
    # Fetch raw values:
    assert len(contexts) > 0
    raw_values = (execQuery(
        "SELECT value, valid, contextId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")" +
        " AND benchmarkId = %s"
        " AND metricId = %s"
        " ORDER BY contextId",
        tuple(contexts) + (benchmark_id, metric_id))
        + [(-1, -1, -1)])  # Note sentinel item

    # Compute per-sample stats:
    curr_context_id = -1
    valid_and_positive_sample = []
    median_obs_map = {}
    # Loop over all observations (which are grouped on sample;
    # note the 1-1 correspondence between samples and contexts):
    for obs, valid, context_id in raw_values:
        if context_id != curr_context_id:
            # A new sample has been collected, so register it and
            # prepare for the next one:
            median_obs = stats.medianscore(valid_and_positive_sample) if (
                len(valid_and_positive_sample) > 0) else -1
            median_obs_map[curr_context_id] = median_obs
            valid_and_positive_sample = []
            curr_context_id = context_id
        # Append a valid and positive observation to the current sample:
        if valid and (obs > 0):
            valid_and_positive_sample.append(obs)

    # Order chronologically:
    ts = []
    index = 0
    for context in contexts:
        if context in median_obs_map:
            ts.append((index, median_obs_map[context]))
        index = index + 1
    return ts


# Returns the factor by which val improves over base_val by taking the
# lower_is_better property into consideration.
# Example: base_val = 10 and val = 20 results in 0.5 if lower_is_better is
# true, and 2 if lower_is_better is false.
def metricAdjustedRatio(base_val, val, lower_is_better):
    #assert base_val > 0
    #assert val > 0
    return (base_val / val) if lower_is_better else (val / base_val)


# Locates (significant) changes in a time series.
# Whether a change is significant depends on the difftol argument.
# Only positive values are considered.
#
# The output is a list of 7-tuples, one per change:
#
# 1: Base index, i.e. the index in the time series that contains the base
#    value used to compute the change.
# 2: Change index, i.e. the index in the time series at which the change
#    occurs. The value at this index is called the change value.
#    NOTE: The change index of change i becomes the base index of
#    change i + 1.
# 3: Metric-adjusted change ratio, i.e. the factor by which the change
#    value improves over the base value.
#
# DEFINITIONS:
# - Segment S1: The values in range [base index, change index>.
# - Segment S2: The values in range [change index, next change index>.
#   (Or the end of the time series if there is no next change index.)
#
# 4: Global separation score. This measures how well all values before
#    the change index are separated from the values in segment S2 with
#    respect to the base value.
#    The score ranges from 0 (poor separation) to 1 (good separation).
#    If all target values lie on the far side of the base value
#    (as seen from the nearest value in segment S2), the score is 1.
#    If at least one target value lies on the far side of the nearest value
#    in segment S2 (as seen from the base value), the score is 0.
#    Otherwise, the score depends on the value that is furthest away
#    from the base value in the direction of the nearest value in
#    segment S2.
#
# 5: Local separation score. A variant of the global separation score that
#    measures how well the values in segment S1 and segment S2 are
#    separated from each other.
#    The score ranges from 0 (low separation) to 1 (high separation).
#
# 6: Durability score for S1. This measures the number of values in S1
#    with respect to durtolmin and durtolmax.
#    The score ranges from 0 (few values; low durability) to
#    1 (many values; high durability).
#    The score is 0 if S1 contains fewer than durtolmin values.
#    The score is 1 if S1 contains at least durtolmax values.
#    Otherwise, the score depends on the number of values in S1.
#
# 7: Durability score for S2. (Ditto)
#
def getChanges(time_series, lower_is_better, difftol, durtolmin, durtolmax):
    if len(time_series) == 0:
        return ()

    # Define the difference tolerance range.
    # Ratios within [lo, hi] are considered insignificant.
    assert difftol >= 1
    hi = difftol
    lo = 1.0 / difftol

    values = zip(*time_series)[1]

    # Use the first positive value as the first base:
    base_pos = -1
    base_val = -1
    for i in range(0, len(values)):
        base_val = values[i]
        if base_val > 0:
            base_pos = i
            break
    if base_pos == -1:
        return ()  # No positive values found!

    # Initialize global extrema:
    gmin, gmax = values[base_pos], values[base_pos]
    segments = []
    base_ratio = -1
    prev_gmin = -1
    prev_gmax = -1
    # Compute the segments (Note: The segments are divided by the changes,
    # so if no changes exist, there will be only one segment)
    while True:
        # Initialize stats for current segment:
        lmin, lmax = values[base_pos], values[base_pos]  # Local extrema
        pvals = 1  # Number of positive values
        # Scan to next change if any:
        change_found = False
        for pos in range(base_pos + 1, len(values)):
            val = values[pos]
            if val > 0:
                # The value is positive, so this is a potential change.
                # Compute the local change as the metric-adjusted ratio:
                ratio = metricAdjustedRatio(base_val, val, lower_is_better)
                # Check if change is significant:
                if (ratio > hi) or (ratio < lo):
                    # Finalize current segment and prepare for next change:
                    segments.append(
                        (base_pos, base_ratio, prev_gmin, prev_gmax, lmin,
                         lmax, pvals))
                    base_pos = pos
                    base_val = val
                    base_ratio = ratio
                    prev_gmin = gmin
                    prev_gmax = gmax
                    change_found = True
                else:
                    # Update stats for current segment:
                    if val < lmin:
                        lmin = val
                    elif val > lmax:
                        lmax = val
                    pvals = pvals + 1
                # Update global extrema:
                if val < gmin:
                    gmin = val
                elif val > gmax:
                    gmax = val
                if change_found:
                    break
        if not change_found:
            # No next change was found, so finalize the current segment
            # as the last segment ...
            segments.append(
                (base_pos, base_ratio, prev_gmin, prev_gmax, lmin, lmax,
                 pvals))
            break

    # Compute the changes ...
    changes = []
    for i in range(1, len(segments)):
        s1 = segments[i - 1]
        s2 = segments[i]
        base_pos = s1[0]
        change_pos = s2[0]
        ratio = s2[1]
        # Compute global and local separation scores:
        base_val = values[base_pos]
        change_val = values[change_pos]
        gmin = s2[2]
        gmax = s2[3]
        lmin1 = s1[4]
        lmax1 = s1[5]
        lmin2 = s2[4]
        lmax2 = s2[5]
        if change_val < base_val:
            gsep_score = (gmin - lmax2) / float(base_val - change_val)
            lsep_score = (lmin1 - lmax2) / float(base_val - change_val)
        else:
            gsep_score = (lmin2 - gmax) / float(change_val - base_val)
            lsep_score = (lmin2 - lmax1) / float(change_val - base_val)
        gsep_score = min(max(gsep_score, 0), 1)
        lsep_score = min(max(lsep_score, 0), 1)
        # Compute durability scores:
        pvals1 = s1[6]
        pvals2 = s2[6]
        assert durtolmin > 0
        assert durtolmin <= durtolmax
        sfact = 1.0 / (durtolmax - (durtolmin - 1))
        dur_score1 = (pvals1 - (durtolmin - 1)) * sfact
        dur_score2 = (pvals2 - (durtolmin - 1)) * sfact
        dur_score1 = min(max(dur_score1, 0), 1)
        dur_score2 = min(max(dur_score2, 0), 1)
        changes.append(
            (base_pos, change_pos, ratio, gsep_score, lsep_score,
             dur_score1, dur_score2))

    return tuple(changes)


# Extracts the (significant) changes of all time series in a given
# host/platform/branch combination.
#
# The output is a list of 3-tuples, one per time series having at least one
# change:
#
# 1: Benchmark ID.
# 2: Metric ID.
# 3: The changes as a list of 9-tuples. The first 7 elements correspond to
#    the output from getChanges() (documented elsewhere). Elements 8
#    and 9 are the SHA-1 ID and timestamp corresponding to the change.
#
def getAllChanges(
        host_id, platform_id, branch_id, difftol, durtolmin, durtolmax,
        progress_func = None, progress_arg = None):
    if progress_func != None:
        progress_func(0.0, progress_arg)
    # Get all snapshots matching the host/platform/branch combination:
    snapshots = getAllSnapshots(host_id, platform_id, branch_id)
    # Get the list of contexts that corresponds to these snapshots in this
    # host/platform/branch combination:
    contexts = []
    for sha1_id, timestamp in snapshots:
        contexts.append(getContext(host_id, platform_id, branch_id, sha1_id))
    if len(contexts) == 0:
        return []
    # Get all distinct benchmark/metric combinations that match the
    # host/platform/branch context and are within the selected snapshot
    # interval. Each such combination corresponds to a time series.
    tseries_list = execQuery(
        "SELECT DISTINCT benchmarkId, metricId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")",
        contexts)
    changes = []
    # Loop over time series:
    if progress_func != None:
        i = 0
    for benchmark_id, metric_id in tseries_list:
        # Get the time series (without extra stats):
        time_series = getBasicTimeSeries(contexts, benchmark_id, metric_id)
        # Extract the significant changes:
        basic_tschanges = getChanges(
            time_series, metricIdToLowerIsBetter(metric_id), difftol,
            durtolmin, durtolmax)
        # ... add the SHA-1 and timestamp to each item:
        tschanges = []
        for change in basic_tschanges:
            index = time_series[change[1]][0]
            sha1_id = snapshots[index][0]
            timestamp = snapshots[index][1]
            tschanges.append(change + (sha1_id, timestamp))
        if len(tschanges) > 0:
            changes.append((benchmark_id, metric_id, tschanges))
        if progress_func != None:
            i = i + 1
            divisor = len(tseries_list) // 100  # report at most 100 times
            if (divisor > 0) and ((i % divisor) == 0):
                perc_done = (i / float(len(tseries_list))) * 100.0
                progress_func(perc_done, progress_arg)
    if progress_func != None:
        progress_func(100.0, progress_arg)
    return changes


# Stores statistics about the most recent ("lc" = last change) change of
# the given time series into the 'stats' dictionary: change ratio,
# timestamp, distance to the last snapshot, and separation/durability
# scores. All entries are set to -1 if there are no changes.
def getTimeSeriesMiscStats(time_series, changes, snapshots, stats):
    if len(changes) > 0:
        stats["lc"] = changes[-1][2]
        lc_ts_pos = changes[-1][1]
        lc_ss_pos = time_series[lc_ts_pos][0]
        stats["lc_timestamp"] = snapshots[lc_ss_pos][1]
        stats["lc_distance"] = (len(snapshots) - 1) - lc_ss_pos
        stats["lc_gsep_score"] = changes[-1][3]
        stats["lc_lsep_score"] = changes[-1][4]
        stats["lc_dur1_score"] = changes[-1][5]
        stats["lc_dur2_score"] = changes[-1][6]
    else:
        stats["lc"] = -1
        stats["lc_timestamp"] = -1
        stats["lc_distance"] = -1
        stats["lc_gsep_score"] = -1
        stats["lc_lsep_score"] = -1
        stats["lc_dur1_score"] = -1
        stats["lc_dur2_score"] = -1


# Computes per-benchmark time series statistics.
#
# ADD MORE DOCS HERE ... 2 B DONE!
#
def getBMTimeSeriesStatsList(
        host_id, platform_id, branch_id, snapshots, test_case_filter,
        difftol, durtolmin, durtolmax, progress_func = None,
        progress_arg = None):
    if progress_func != None:
        progress_func(0.0, progress_arg)
    contexts = []
    for sha1_id, timestamp in snapshots:
        contexts.append(getContext(host_id, platform_id, branch_id, sha1_id))
    # Get all distinct benchmark/metric combinations that match the
    # host/platform/branch context and are within the selected snapshot
    # interval. Each such combination corresponds to a time series.
    assert len(contexts) > 0
    bmark_metrics = execQuery(
        "SELECT DISTINCT benchmarkId, metricId FROM result"
        " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")",
        contexts)
    bmstats_list = []
    # Loop over time series:
    if progress_func != None:
        i = 0
    for benchmark_id, metric_id in bmark_metrics:
        benchmark = idToText("benchmark", benchmark_id)
        test_case, test_function, data_tag = (
            benchmarkToComponents(benchmark))
        # Skip this time series if it doesn't match the test case filter:
        if ((test_case_filter != None)
            and (not test_case in test_case_filter)):
            continue
        # Get the content and some basic stats of the time series:
        (time_series, tot_ninvalid, tot_nzeros, median_of_rses,
         rse_of_medians, ms, lsd) = getTimeSeries(
            host_id, platform_id, branch_id, snapshots, benchmark_id,
            metric_id)
        # Extract the significant changes:
        changes = getChanges(
            time_series, metricIdToLowerIsBetter(metric_id), difftol,
            durtolmin, durtolmax)
        stats = {}
        stats["benchmark"] = benchmark
        stats["benchmark_id"] = benchmark_id
        stats["metric"] = idToText("metric", metric_id)
        stats["metric_id"] = metric_id
        stats["lib"] = metricIdToLowerIsBetter(metric_id)
        stats["ms"] = ms
        stats["lsd"] = lsd
        stats["ni"] = tot_ninvalid
        stats["nz"] = tot_nzeros
        stats["nc"] = len(changes)
        stats["med_of_rses"] = median_of_rses
        stats["rse_of_meds"] = rse_of_medians
        getTimeSeriesMiscStats(time_series, changes, snapshots, stats)
        bmstats_list.append(stats)
        if progress_func != None:
            i = i + 1
            divisor = len(bmark_metrics) // 100  # report at most 100 times
            if (divisor > 0) and ((i % divisor) == 0):
                perc_done = (i / float(len(bmark_metrics))) * 100.0
                progress_func(perc_done, progress_arg)
    if progress_func != None:
        progress_func(100.0, progress_arg)
    return tuple(bmstats_list)


# Returns the score for the most recent change in 'changes'.
# If 'regression' is true, only regressions are considered, and otherwise
# only improvements.
# If 'premature' is true, the premature score (not considering post-change
# durability) is returned instead of the regular score.
#
# Returns -1 if no score is found.
#
# OBSOLETE ???
def getLastChangeScore(changes, regression, premature):
    pass


# Returns the highest score for any change in a certain time interval.
#
# The time interval is defined like this: <2 B DONE!>
#
# If 'regression' is true, only regressions are considered, and otherwise
# only improvements.
# If 'premature' is true, the premature score (not considering post-change
# durability) is returned instead of the regular score.
#
# Returns -1 if no score is found.
#
# OBSOLETE ???
def getHighestChangeScore(changes, regression, premature, days):
    pass


# Computes change scores for each time series (benchmark/metric combination)
# of the given host/platform/branch combination.
#
# Change types:
# - most recent change (i.e. significant regression or improvement)
# - strongest change last n days, for n in {7, 30, 180, and -1 (infinite)}
# ... and all of these with and without inclusion of premature changes.
#
# ADD MORE DOCS HERE ... 2 B DONE!
#
# OBSOLETE ???
# def getChangeScores(
#         host_id, platform_id, branch_id, snapshots, test_case_filter,
#         difftol, durtolmin, durtolmax, progress_func = None,
#         progress_arg = None):
#     if progress_func != None:
#         progress_func(0.0, progress_arg)
#     contexts = []
#     for sha1_id, timestamp in snapshots:
#         contexts.append(
#             getContext(host_id, platform_id, branch_id, sha1_id))
#     # Get all distinct benchmark/metric combinations that match the
#     # host/platform/branch context and are within the selected snapshot
#     # interval. Each such combination corresponds to a time series.
#     assert len(contexts) > 0
#     tseries = execQuery(
#         "SELECT DISTINCT benchmarkId, metricId FROM result"
#         " WHERE contextId IN (%s" + ", %s"*(len(contexts) - 1) + ")",
#         contexts)
#     scores = []
#     # Loop over time series:
#     if progress_func != None:
#         i = 0
#     for benchmark_id, metric_id in tseries:
#         benchmark = idToText("benchmark", benchmark_id)
#         test_case, test_function, data_tag = (
#             benchmarkToComponents(benchmark))
#         # Skip this time series if it doesn't match the test case filter:
#         if ((test_case_filter != None)
#             and (not test_case in test_case_filter)):
#             continue
#         # Get the time series (without extra stats):
#         time_series = getBasicTimeSeries(
#             host_id, platform_id, branch_id, snapshots, benchmark_id,
#             metric_id)
#         # Extract the significant changes:
#         changes = getChanges(
#             time_series, metricIdToLowerIsBetter(metric_id), difftol,
#             durtolmin, durtolmax)
#         tsscores = {}
#         tsscores["benchmark_id"] = benchmark_id
#         tsscores["metric_id"] = metric_id
#         tsscores["regr_last"] = getLastChangeScore(changes, True, False)
#         tsscores["regr_last_pmt"] = getLastChangeScore(changes, True, True)
#         tsscores["impr_last"] = getLastChangeScore(changes, False, False)
#         tsscores["impr_last_pmt"] = getLastChangeScore(changes, False, True)
#         tsscores["regr_7"] = getHighestChangeScore(
#             changes, True, False, 7)
#         tsscores["regr_7_pmt"] = getHighestChangeScore(
#             changes, True, True, 7)
#         tsscores["impr_7"] = getHighestChangeScore(
#             changes, False, False, 7)
#         tsscores["impr_7_pmt"] = getHighestChangeScore(
#             changes, False, True, 7)
#         tsscores["regr_30"] = getHighestChangeScore(
#             changes, True, False, 30)
#         tsscores["regr_30_pmt"] = getHighestChangeScore(
#             changes, True, True, 30)
#         tsscores["impr_30"] = getHighestChangeScore(
#             changes, False, False, 30)
#         tsscores["impr_30_pmt"] = getHighestChangeScore(
#             changes, False, True, 30)
#         tsscores["regr_180"] = getHighestChangeScore(
#             changes, True, False, 180)
#         tsscores["regr_180_pmt"] = getHighestChangeScore(
#             changes, True, True, 180)
#         tsscores["impr_180"] = getHighestChangeScore(
#             changes, False, False, 180)
#         tsscores["impr_180_pmt"] = getHighestChangeScore(
#             changes, False, True, 180)
#         tsscores["regr_all"] = getHighestChangeScore(
#             changes, True, False, -1)
#         tsscores["regr_all_pmt"] = getHighestChangeScore(
#             changes, True, True, -1)
#         tsscores["impr_all"] = getHighestChangeScore(
#             changes, False, False, -1)
#         tsscores["impr_all_pmt"] = getHighestChangeScore(
#             changes, False, True, -1)
#         scores.append(tsscores)
#         if progress_func != None:
#             i = i + 1
#             divisor = len(tseries) // 100  # report at most 100 times
#             if (divisor > 0) and ((i % divisor) == 0):
#                 perc_done = (i / float(len(tseries))) * 100.0
#                 progress_func(perc_done, progress_arg)
#     if progress_func != None:
#         progress_func(100.0, progress_arg)
#     return scores


# Returns True iff s is a valid SHA-1 string.
def isValidSHA1(s):
    def containsOnlyHexDigits(s):
        return 0 not in [c in hexdigits for c in s]
    return (len(s) == 40) and containsOnlyHexDigits(s)


def printJSONHeader():
    print "Content-type: text/json\n"


def printErrorAsJSON(error):
    printJSONHeader()
    print "{\"error\": \"" + error + "\"}\n"


# Returns a 2-tuple consisting of:
# 1: an option dictionary, and
# 2: a flag that is true iff the QUERY_STRING environment variable is
#    present (i.e. that the script is invoked as a CGI-script for a
#    HTTP GET request).
#
# The option dictionary is extracted from either the QUERY_STRING environment
# variable (first priority) or command-line arguments (second priority).
# In the latter case, the options must be of the form
# --<key1> <value1> ... --<keyN> <valueN>
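#
# For example (hypothetical option names), the command line
#
#   ./script --host linux-g++ --branch master
#
# would yield ({"host": "linux-g++", "branch": "master"}, False) when
# QUERY_STRING is not set.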
def getOptions():
    def getOptDictFromQueryString(qs):
        options = {}
        for sq in qs.split("&"):
            keyval = sq.split("=")
            options[keyval[0]] = urllib.unquote(keyval[1])
        return options

    def getOptDictFromCommandLine():
        options = {}
        p = re.compile("^--(.+)$")
        key = None
        for arg in sys.argv[1:]:
            if key != None:
                options[key] = arg
            m = p.match(arg)
            if m:
                key = m.group(1)
                # Support "--help" as the only value-less option:
                if key == "help":
                    options[key] = 1
                    key = None
            else:
                key = None
        return options

    qs = "QUERY_STRING"
    if qs in os.environ:
        return (getOptDictFromQueryString(os.environ[qs]), True)
    else:
        return (getOptDictFromCommandLine(), False)
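

# A minimal self-check sketch of the pure (database-free) helpers above.
# Illustrative only: the benchmark string and values are hypothetical, and
# the block runs only when the module is executed directly.
if __name__ == "__main__":
    assert benchmarkToComponents("tst_foo:bar( baz )") == (
        "tst_foo", "bar", "baz")
    assert metricAdjustedRatio(10.0, 20.0, True) == 0.5
    assert metricAdjustedRatio(10.0, 20.0, False) == 2.0
    assert isValidSHA1("a" * 40)
    assert not isValidSHA1("not-a-sha1")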