summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorMilian Wolff <milian.wolff@kdab.com>2017-03-29 16:33:53 +0200
committerMilian Wolff <milian.wolff@kdab.com>2017-03-29 14:52:53 +0000
commit62c48e4890eed1e4b4f0697b42a028d7d61b536a (patch)
treed074e827d97df9f0f70334dbad9f1d00caef7be6
parenta8ef632713b15a1d11a71b2eca3637e8d604c370 (diff)
Handle mmap and sample events in time order
According to the perf file format documentation, events can arrive in out-of-time-order. But no time-reordering should occur across a FINISHED_ROUND event. This pseudo-event was added to perf in Linux v2.6.35, released in 2010. But, sadly, it is only being used for non-tracing perf data files since Linux 3.17 from around 2014. Additionally, time order violations across the FINISHED_ROUND event can be observed even for data files obtained today. As such, we cannot always use FINISHED_ROUND to order our events. Instead, we use the following heuristic: - we buffer both samples and mmap events - when the combined size of both buffers exceeds a certain threshold, defined via the new --buffer-size CLI argument which defaults to 10MB, we flush the buffers: -- we sort samples and events by time -- then we iterate over the samples and handle all mmap events before the sample time -- we stop flushing the buffers when we handled half of the buffers by size, i.e. when we only have 5MB left - then we continue to buffer until we reach the threshold the next time - when we finish, we flush the full buffer This heuristic is only applied until we know that we can actually rely on FINISHED_ROUND events. This is the case when: - the user passed the CLI argument --buffer-size 0 - when we encountered a FINISHED_ROUND event - when we encountered a PerfFeatures event that tells us the perf that was used to record the data file is newer than 3.17 When we rely on FINISHED_ROUND events, we still only ever analyze half of our buffers, to work-around the upstream issues in perf that lead to time order violations across FINISHED_ROUND events. While somewhat complicated, this patch allows us to simplify the elf map significantly. This is done in the follow-up commits. The statistics mode is further extended, which shows us how this new behavior plays out in terms of memory consumption. For a normal perf file where we rely on the finished round events we observe the following values: ~~~~~~~~~~ samples: 20260 mmaps: 24331 rounds: 957 buffer flushes: 958 samples time violations: 0 mmaps time violations: 0 max samples per round: 75 max mmaps per round: 474 max samples per flush: 51 max mmaps per flush: 429 max buffer size: 847328 max total event size per round: 647040 max time: 1143246068673300 max time between rounds: 739096862 max reorder time: 738811755 ~~~~~~~~~~ For a perf file that was recorded with `-m 8192`, we instead see the following statistics. Note how the buffer size automatically is sized to fit the actual round size, and no time violations occur: ~~~~~~~~~~ samples: 20068 mmaps: 24341 rounds: 24 buffer flushes: 25 samples time violations: 0 mmaps time violations: 0 max samples per round: 3577 max mmaps per round: 6103 max samples per flush: 1893 max mmaps per flush: 2910 max buffer size: 31946624 max total event size per round: 30106704 max time: 1141236144717550 max time between rounds: 965150979 max reorder time: 964957358 ~~~~~~~~~~ When we parse a file in the streaming format, we cannot know at the beginning that we should rely on the finished round events. Then the statistics for a normal record look like this: ~~~~~~~~~~ samples: 20303 mmaps: 24339 rounds: 1029 buffer flushes: 1029 samples time violations: 0 mmaps time violations: 0 max samples per round: 98 max mmaps per round: 489 max samples per flush: 61 max mmaps per flush: 610 max buffer size: 1144496 max total event size per round: 863472 max time: 1143595838784853 max time between rounds: 554771976 max reorder time: 13834481 ~~~~~~~~~~ If the record was done with `-m 8192` we instead observe some time order violations at the beginning which could be work-arounded by passing a larger buffer size. Once we encounter the first finished round event, we follow those and do not suffer from time violations anymore: ~~~~~~~~~~ samples: 19854 mmaps: 24338 rounds: 21 buffer flushes: 24 samples time violations: 465 mmaps time violations: 395 max samples per round: 5115 max mmaps per round: 6316 max samples per flush: 2700 max mmaps per flush: 5364 max buffer size: 46458392 max total event size per round: 43698384 max time: 1143585885779204 max time between rounds: 912757434 max reorder time: 908114269 ~~~~~~~~~~ When we parse a perf.data file without FINISHED_ROUND events, we get for a normal file: ~~~~~~~~~~ samples: 20303 mmaps: 24339 rounds: 1 buffer flushes: 33 samples time violations: 0 mmaps time violations: 0 max samples per round: 20303 max mmaps per round: 24339 max samples per flush: 654 max mmaps per flush: 3004 max buffer size: 10494104 max total event size per round: 173458680 max time: 1143595838784853 max time between rounds: 0 max reorder time: 13834481 ~~~~~~~~~~ If the file has huge buffers, i.e. again `-m 8192` was passed to perf record, we instead see: ~~~~~~~~~~ samples: 19854 mmaps: 24338 rounds: 1 buffer flushes: 32 samples time violations: 4859 mmaps time violations: 3586 max samples per round: 19854 max mmaps per round: 24338 max samples per flush: 817 max mmaps per flush: 3089 max buffer size: 10493280 max total event size per round: 169569288 max time: 1143585885779204 max time between rounds: 0 max reorder time: 908114269 ~~~~~~~~~~ Change-Id: I756c4cccf75b4ce0179e965996f8b821bf60e3dd Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
-rw-r--r--app/main.cpp21
-rw-r--r--app/perfunwind.cpp169
-rw-r--r--app/perfunwind.h27
3 files changed, 179 insertions, 38 deletions
diff --git a/app/main.cpp b/app/main.cpp
index 8dc878c..41be9ab 100644
--- a/app/main.cpp
+++ b/app/main.cpp
@@ -157,6 +157,17 @@ int main(int argc, char *argv[])
"main", "Print statistics instead of converting the data."));
parser.addOption(printStats);
+ QCommandLineOption bufferSize(QLatin1String("buffer-size"),
+ QCoreApplication::translate(
+ "main", "Size of event buffer in kilobytes. This influences how"
+ " many events get buffered before they get sorted by time and "
+ " then analyzed. Increase this value when your perf.data file"
+ " was recorded with a large buffer value (perf record -m)."
+ " Pass 0 to buffer events until a FINISHED_ROUND event is "
+ " encountered. Default value is 10MB"),
+ QLatin1String("buffer-size"), QLatin1String("10240"));
+ parser.addOption(bufferSize);
+
parser.process(app);
QScopedPointer<QFile> outfile;
@@ -184,12 +195,20 @@ int main(int argc, char *argv[])
infile.reset(new PerfStdin);
}
+ bool ok = false;
+ uint maxEventBufferSize = parser.value(bufferSize).toUInt(&ok) * 1024;
+ if (!ok) {
+ qWarning() << "Failed to parse buffer-size argument. Expected unsigned integer, got:"
+ << parser.value(bufferSize);
+ return InvalidOption;
+ }
+
PerfUnwind unwind(outfile.data(), parser.value(sysroot), parser.isSet(debug) ?
parser.value(debug) : parser.value(sysroot) + parser.value(debug),
parser.value(extra), parser.value(appPath), parser.isSet(kallsymsPath)
? parser.value(kallsymsPath)
: parser.value(sysroot) + parser.value(kallsymsPath),
- parser.isSet(printStats));
+ parser.isSet(printStats), maxEventBufferSize);
PerfHeader header(infile.data());
PerfAttributes attributes;
diff --git a/app/perfunwind.cpp b/app/perfunwind.cpp
index af4fc9d..6330799 100644
--- a/app/perfunwind.cpp
+++ b/app/perfunwind.cpp
@@ -24,6 +24,7 @@
#include <QDebug>
#include <QtEndian>
+#include <QVersionNumber>
#include <cstring>
@@ -54,6 +55,9 @@ void PerfUnwind::Stats::addEventTime(quint64 time)
void PerfUnwind::Stats::finishedRound()
{
+ numSamples += numSamplesInRound;
+ numMmaps += numMmapsInRound;
+
maxSamplesPerRound = std::max(maxSamplesPerRound, numSamplesInRound);
maxMmapsPerRound = std::max(maxMmapsPerRound, numMmapsInRound);
numSamplesInRound = 0;
@@ -72,10 +76,10 @@ void PerfUnwind::Stats::finishedRound()
PerfUnwind::PerfUnwind(QIODevice *output, const QString &systemRoot, const QString &debugPath,
const QString &extraLibsPath, const QString &appPath,
- const QString &kallsymsPath, bool printStats) :
+ const QString &kallsymsPath, bool printStats, uint maxEventBufferSize) :
m_output(output), m_architecture(PerfRegisterInfo::ARCH_INVALID), m_systemRoot(systemRoot),
m_extraLibsPath(extraLibsPath), m_appPath(appPath), m_kallsyms(kallsymsPath),
- m_sampleBufferSize(0)
+ m_maxEventBufferSize(maxEventBufferSize), m_eventBufferSize(0), m_lastFlushMaxTime(0)
{
m_stats.enabled = printStats;
m_currentUnwind.unwind = this;
@@ -102,9 +106,7 @@ PerfUnwind::PerfUnwind(QIODevice *output, const QString &systemRoot, const QStri
PerfUnwind::~PerfUnwind()
{
finishedRound();
-
- foreach (const PerfRecordSample &sample, m_sampleBuffer)
- analyze(sample);
+ flushEventBuffer(0);
delete[] m_debugInfoPath;
qDeleteAll(m_symbolTables);
@@ -114,8 +116,13 @@ PerfUnwind::~PerfUnwind()
out << "samples: " << m_stats.numSamples << "\n";
out << "mmaps: " << m_stats.numMmaps << "\n";
out << "rounds: " << m_stats.numRounds << "\n";
+ out << "buffer flushes: " << m_stats.numBufferFlushes << "\n";
+ out << "samples time violations: " << m_stats.numTimeViolatingSamples << "\n";
+ out << "mmaps time violations: " << m_stats.numTimeViolatingMmaps << "\n";
out << "max samples per round: " << m_stats.maxSamplesPerRound << "\n";
out << "max mmaps per round: " << m_stats.maxMmapsPerRound << "\n";
+ out << "max samples per flush: " << m_stats.maxSamplesPerFlush << "\n";
+ out << "max mmaps per flush: " << m_stats.maxMmapsPerFlush << "\n";
out << "max buffer size: " << m_stats.maxBufferSize << "\n";
out << "max total event size per round: " << m_stats.maxTotalEventSizePerRound << "\n";
out << "max time: " << m_stats.maxTime << "\n";
@@ -139,15 +146,7 @@ Dwfl *PerfUnwind::dwfl(quint32 pid, quint64 timestamp)
void PerfUnwind::registerElf(const PerfRecordMmap &mmap)
{
- if (m_stats.enabled) {
- m_stats.totalEventSizePerRound += mmap.size();
- m_stats.addEventTime(mmap.time());
- ++m_stats.numMmapsInRound;
- ++m_stats.numMmaps;
- return;
- }
-
- symbolTable(mmap.pid())->registerElf(mmap, m_appPath, m_systemRoot, m_extraLibsPath);
+ bufferEvent(mmap, &m_mmapBuffer, &m_stats.numMmapsInRound);
}
void PerfUnwind::sendBuffer(const QByteArray &buffer)
@@ -226,6 +225,10 @@ void PerfUnwind::features(const PerfFeatures &features)
for (const auto &desc : eventDescs)
addAttributes(desc.attrs, desc.name, desc.ids);
+ const auto perfVersion = QVersionNumber::fromString(QString::fromLatin1(features.version()));
+ if (perfVersion >= QVersionNumber(3, 17))
+ m_maxEventBufferSize = 0;
+
QByteArray buffer;
QDataStream(&buffer, QIODevice::WriteOnly) << static_cast<quint8>(FeaturesDefinition)
<< features.hostName()
@@ -365,24 +368,7 @@ void PerfUnwind::resolveCallchain()
void PerfUnwind::sample(const PerfRecordSample &sample)
{
- m_sampleBuffer.append(sample);
- m_sampleBufferSize += sample.size();
-
- if (m_stats.enabled) {
- m_stats.maxBufferSize = std::max(m_sampleBufferSize, m_stats.maxBufferSize);
- m_stats.totalEventSizePerRound += sample.size();
- m_stats.addEventTime(sample.time());
- ++m_stats.numSamplesInRound;
- ++m_stats.numSamples;
- // don't return early, stats should include our buffer behavior
- }
-
- while (m_sampleBufferSize > s_maxSampleBufferSize) {
- const PerfRecordSample &sample = m_sampleBuffer.front();
- m_sampleBufferSize -= sample.size();
- analyze(sample);
- m_sampleBuffer.removeFirst();
- }
+ bufferEvent(sample, &m_sampleBuffer, &m_stats.numSamplesInRound);
}
void PerfUnwind::analyze(const PerfRecordSample &sample)
@@ -467,6 +453,15 @@ void PerfUnwind::sendSymbol(qint32 id, const PerfUnwind::Symbol &symbol)
sendBuffer(buffer);
}
+void PerfUnwind::sendError(ErrorCode error, const QString &message)
+{
+ qWarning() << error << message;
+ QByteArray buffer;
+ QDataStream(&buffer, QIODevice::WriteOnly) << static_cast<quint8>(Error)
+ << error << message;
+ sendBuffer(buffer);
+}
+
qint32 PerfUnwind::resolveString(const QByteArray& string)
{
if (string.isEmpty())
@@ -514,4 +509,114 @@ void PerfUnwind::finishedRound()
{
if (m_stats.enabled)
m_stats.finishedRound();
+
+ // when we parse a perf data stream we may not know whether it contains
+ // FINISHED_ROUND events. now we know, and thus we set the m_maxEventBufferSize
+ // to 0 to disable the heuristic there. Instead, we will now rely on the finished
+ // round events to tell us when to flush the event buffer
+ if (!m_maxEventBufferSize) {
+ // we only flush half of the events we got in this round
+ // this work-arounds bugs in upstream perf which leads to time order violations
+ // across FINISHED_ROUND events which should in theory never happen
+ flushEventBuffer(m_eventBufferSize / 2);
+ } else {
+ m_maxEventBufferSize = 0;
+ }
+}
+
+template<typename Event>
+void PerfUnwind::bufferEvent(const Event &event, QList<Event> *buffer, int *eventCounter)
+{
+ buffer->append(event);
+ m_eventBufferSize += event.size();
+
+ if (m_stats.enabled) {
+ *eventCounter += 1;
+ m_stats.maxBufferSize = std::max(m_eventBufferSize, m_stats.maxBufferSize);
+ m_stats.totalEventSizePerRound += event.size();
+ m_stats.addEventTime(event.time());
+ // don't return early, stats should include our buffer behavior
+ }
+
+ if (m_maxEventBufferSize && m_eventBufferSize > m_maxEventBufferSize)
+ flushEventBuffer(m_maxEventBufferSize / 2);
+}
+
+void PerfUnwind::flushEventBuffer(uint desiredBufferSize)
+{
+ auto sortByTime = [](const PerfRecord &lhs, const PerfRecord &rhs) {
+ return lhs.time() < rhs.time();
+ };
+ std::sort(m_mmapBuffer.begin(), m_mmapBuffer.end(), sortByTime);
+ std::sort(m_sampleBuffer.begin(), m_sampleBuffer.end(), sortByTime);
+
+ if (m_stats.enabled) {
+ for (const auto &sample : m_sampleBuffer) {
+ if (sample.time() < m_lastFlushMaxTime)
+ ++m_stats.numTimeViolatingSamples;
+ else
+ break;
+ }
+ for (const auto &mmap : m_mmapBuffer) {
+ if (mmap.time() < m_lastFlushMaxTime)
+ ++m_stats.numTimeViolatingMmaps;
+ else
+ break;
+ }
+ }
+
+ if (!m_mmapBuffer.isEmpty() && m_mmapBuffer.first().time() < m_lastFlushMaxTime) {
+ // when an mmap event is not following our desired time order, it can
+ // severly break our analysis. as such we report a real error in these cases
+ sendError(TimeOrderViolation,
+ tr("Time order violation of MMAP event across buffer flush detected. "
+ "Event time is %1, max time during last buffer flush was %2. "
+ "This potentially breaks the data analysis.")
+ .arg(m_mmapBuffer.first().time()).arg(m_lastFlushMaxTime));
+ }
+
+ auto mmapIt = m_mmapBuffer.begin();
+ auto mmapEnd = m_mmapBuffer.end();
+
+ auto sampleIt = m_sampleBuffer.begin();
+ auto sampleEnd = m_sampleBuffer.end();
+
+ for (; m_eventBufferSize > desiredBufferSize && sampleIt != sampleEnd; ++sampleIt) {
+ const auto &sample = *sampleIt;
+
+ if (sample.time() < m_lastFlushMaxTime) {
+ qWarning() << "Time order violation across buffer flush detected:"
+ << "Event time =" << sample.time() << ","
+ << "max time during last buffer flush = " << m_lastFlushMaxTime;
+ // we don't send an error for samples with broken times, since these
+ // are usually harmless and actually occur relatively often
+ // if desired, one can detect these issues on the client side anyways,
+ // based on the sample times
+ } else {
+ m_lastFlushMaxTime = sample.time();
+ }
+
+ for (; mmapIt != mmapEnd && mmapIt->time() <= sample.time(); ++mmapIt) {
+ if (!m_stats.enabled) {
+ symbolTable(mmapIt->pid())->registerElf(*mmapIt, m_appPath,
+ m_systemRoot,
+ m_extraLibsPath);
+ }
+ m_eventBufferSize -= mmapIt->size();
+ }
+
+ analyze(sample);
+ m_eventBufferSize -= sample.size();
+ }
+
+ if (m_stats.enabled) {
+ ++m_stats.numBufferFlushes;
+ const int samples = std::distance(m_sampleBuffer.begin(), sampleIt);
+ m_stats.maxSamplesPerFlush = std::max(samples, m_stats.maxSamplesPerFlush);
+ const int mmaps = std::distance(m_mmapBuffer.begin(), mmapIt);
+ m_stats.maxMmapsPerFlush = std::max(mmaps, m_stats.maxMmapsPerFlush);
+ }
+
+ m_sampleBuffer.erase(m_sampleBuffer.begin(), sampleIt);
+ m_mmapBuffer.erase(m_mmapBuffer.begin(), mmapIt);
}
diff --git a/app/perfunwind.h b/app/perfunwind.h
index b077ec8..b4c4859 100644
--- a/app/perfunwind.h
+++ b/app/perfunwind.h
@@ -58,6 +58,7 @@ public:
StringDefinition,
LostDefinition,
FeaturesDefinition,
+ Error,
InvalidType
};
@@ -104,7 +105,7 @@ public:
PerfUnwind(QIODevice *output, const QString &systemRoot, const QString &debugInfo,
const QString &extraLibs, const QString &appPath,
- const QString &kallsymsPath, bool printStats);
+ const QString &kallsymsPath, bool printStats, uint maxEventBufferSize);
~PerfUnwind();
PerfRegisterInfo::Architecture architecture() const { return m_architecture; }
@@ -179,6 +180,7 @@ private:
QString m_appPath;
QList<PerfRecordSample> m_sampleBuffer;
+ QList<PerfRecordMmap> m_mmapBuffer;
QHash<quint32, PerfSymbolTable *> m_symbolTables;
PerfKallsyms m_kallsyms;
@@ -188,16 +190,18 @@ private:
QHash<quint64, qint32> m_attributeIds;
QHash<PerfEventAttributes, qint32> m_attributes;
- uint m_sampleBufferSize;
-
- static const uint s_maxSampleBufferSize = 1024 * 1024;
+ uint m_maxEventBufferSize;
+ uint m_eventBufferSize;
+ quint64 m_lastFlushMaxTime;
struct Stats
{
Stats()
- : numSamples(0), numMmaps(0), numRounds(0),
+ : numSamples(0), numMmaps(0), numRounds(0), numBufferFlushes(0),
+ numTimeViolatingSamples(0), numTimeViolatingMmaps(0),
numSamplesInRound(0), numMmapsInRound(0),
maxSamplesPerRound(0), maxMmapsPerRound(0),
+ maxSamplesPerFlush(0), maxMmapsPerFlush(0),
maxBufferSize(0), maxTotalEventSizePerRound(0),
maxTime(0), maxTimeBetweenRounds(0), maxReorderTime(0),
lastRoundTime(0), totalEventSizePerRound(0),
@@ -210,10 +214,15 @@ private:
quint64 numSamples;
quint64 numMmaps;
quint64 numRounds;
+ quint64 numBufferFlushes;
+ quint64 numTimeViolatingSamples;
+ quint64 numTimeViolatingMmaps;
int numSamplesInRound;
int numMmapsInRound;
int maxSamplesPerRound;
int maxMmapsPerRound;
+ int maxSamplesPerFlush;
+ int maxMmapsPerFlush;
uint maxBufferSize;
uint maxTotalEventSizePerRound;
quint64 maxTime;
@@ -233,6 +242,14 @@ private:
void sendLocation(qint32 id, const Location &location);
void sendSymbol(qint32 id, const Symbol &symbol);
void sendAttributes(qint32 id, const PerfEventAttributes &attributes, const QByteArray &name);
+ enum ErrorCode {
+ TimeOrderViolation = 1
+ };
+ void sendError(ErrorCode error, const QString &message);
+
+ template<typename Event>
+ void bufferEvent(const Event &event, QList<Event> *buffer, int *eventCounter);
+ void flushEventBuffer(uint desiredBufferSize);
};
uint qHash(const PerfUnwind::Location &location, uint seed = 0);