# Trace Processor

_The Trace Processor is a C++ library
([/src/trace_processor](/src/trace_processor)) that ingests traces encoded in a
wide variety of formats and exposes an SQL interface for querying trace events
contained in a consistent set of tables. It also has other features including
computation of summary metrics, annotating the trace with user-friendly
descriptions and deriving new events from the contents of the trace._

![Trace processor block diagram](/docs/images/trace-processor.png)

## Quickstart

The [quickstart](/docs/quickstart/trace-analysis.md) provides a quick overview
on how to run SQL queries against traces using trace processor.

## Introduction

Events in a trace are optimized for fast, low-overhead recording. Therefore
traces need significant data processing to extract meaningful information from
them. This is compounded by the number of legacy formats which are still in
use and need to be supported in trace analysis tools.

The trace processor abstracts this complexity by parsing traces, extracting the
data inside, and exposing it in a set of database tables which can be queried
with SQL.

Features of the trace processor include:

* Execution of SQL queries on a custom, in-memory, columnar database backed by
  the SQLite query engine.
* A metrics subsystem which allows computation of summarized views of the trace
  (e.g. CPU or memory usage of a process, time taken for app startup etc.).
* Annotating events in the trace with user-friendly descriptions, providing
  context and explanation of events to newer users.
* Creation of new events derived from the contents of the trace.

The formats supported by trace processor include:

* Perfetto native protobuf format
* Linux ftrace
* Android systrace
* Chrome JSON (including JSON embedding Android systrace text)
* Fuchsia binary format
* [Ninja](https://ninja-build.org/) logs (the build system)

The trace processor is embedded in a wide variety of trace analysis tools, including:

* [trace_processor](/docs/analysis/trace-processor.md), a standalone binary
  providing a shell interface (and the reference embedder).
* [Perfetto UI](https://ui.perfetto.dev), in the form of a WebAssembly module.
* [Android Graphics Inspector](https://gpuinspector.dev/).
* [Android Studio](https://developer.android.com/studio/).

## Concepts

The trace processor has some foundational terminology and concepts which are
used in the rest of the documentation.

### Events

In the most general sense, a trace is simply a collection of timestamped
"events". Events can have associated metadata and context which allows them to
be interpreted and analyzed.

Events form the foundation of trace processor and are one of two types: slices
and counters.

#### Slices

![Examples of slices](/docs/images/slices.png)

A slice refers to an interval of time with some data describing what was
happening in that interval. Some examples of slices include:

* Scheduling slices for each CPU
* Atrace slices on Android
* Userspace slices from Chrome

#### Counters

![Examples of counters](/docs/images/counters.png)

A counter is a continuous value which varies over time. Some examples of
counters include:

* CPU frequency for each CPU core
* RSS memory events - both from the kernel and polled from /proc/stats
* atrace counter events from Android
* Chrome counter events

### Tracks

A track is a named partition of events of the same type and the same associated
context. For example:

* Scheduling slices have one track for each CPU
* Sync userspace slices have one track for each thread which emitted an event
* Async userspace slices have one track for each cookie linking a set of async
  events

The most intuitive way to think of a track is to imagine how it would be drawn
in a UI; if all the events are in a single row, they belong to the same track.
For example, all the scheduling events for CPU 5 are on the same track:

![CPU slices track](/docs/images/cpu-slice-track.png)

Tracks can be split into various types based on the type of event they contain
and the context they are associated with. Examples include:

* Global tracks are not associated to any context and contain slices
* Thread tracks are associated to a single thread and contain slices
* Counter tracks are not associated to any context and contain counters
* CPU counter tracks are associated to a single CPU and contain counters

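A quick way to see which kinds of tracks a given trace contains is to group the
`track` table by its `type` column; a minimal sketch (the exact set of types
depends on what was recorded):

```sql
-- Count the tracks of each type present in the trace.
SELECT type, COUNT(*) AS track_count
FROM track
GROUP BY type
ORDER BY track_count DESC
```
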
### Thread and process identifiers

The handling of threads and processes needs special care when considered in the
context of tracing; identifiers for threads and processes (e.g. `pid`/`tgid` and
`tid` in Android/macOS/Linux) can be reused by the operating system over the
course of a trace. This means they cannot be relied upon as a unique identifier
when querying tables in trace processor.

To solve this problem, the trace processor uses `utid` (_unique_ tid) for
threads and `upid` (_unique_ pid) for processes. All references to threads and
processes (e.g. in CPU scheduling data, thread tracks) use `utid` and `upid`
instead of the system identifiers.
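
Since each row in the `thread` table is keyed by `utid`, a reused `tid` simply
shows up as multiple rows. A sketch for spotting such reuse:

```sql
-- Find tids which were reused by the OS during the trace, i.e. tids which
-- map to more than one utid.
SELECT tid, COUNT(*) AS utid_count
FROM thread
GROUP BY tid
HAVING utid_count > 1
```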

## Object-oriented tables

Modeling an object with many types is a common problem in trace processor. For
example, tracks can come in many varieties (thread tracks, process tracks,
counter tracks etc). Each type has data unique to that type; for example,
thread tracks have the `utid` of the thread and counter tracks have the `unit`
of the counter.

To solve this problem in object-oriented languages, a `Track` class could be
created and inheritance used for all subclasses (e.g. `ThreadTrack` and
`CounterTrack` being subclasses of `Track`, `ProcessCounterTrack` being a
subclass of `CounterTrack` etc).

![Object-oriented table diagram](/docs/images/oop-table-inheritance.png)

In trace processor, this "object-oriented" approach is replicated by having
different tables for each type of object. For example, we have a `track` table
as the "root" of the hierarchy with the `thread_track` and `counter_track`
tables "inheriting from" the `track` table.

NOTE: [The appendix below](#appendix-table-inheritance) gives the exact rules
for inheritance between tables for interested readers.

Inheritance between the tables works in the natural way (i.e. how it works in
OO languages) and is best summarized by a diagram.

![SQL table inheritance diagram](/docs/images/tp-table-inheritance.png)

NOTE: For an up-to-date view of how tables currently inherit from each other,
as well as a comprehensive reference of all the columns and how they are
inherited, see the [SQL tables](/docs/analysis/sql-tables.autogen) reference
page.

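For example, because `thread_track` inherits from `track`, the same `id` can be
looked up in either table and the inherited columns carry the same values; a
minimal sketch:

```sql
-- Join the root table with a more specific one on the shared id to see the
-- inherited columns (name, type) next to the type-specific column (utid).
SELECT track.id, track.name, track.type, thread_track.utid
FROM track
JOIN thread_track USING(id)
LIMIT 10
```
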
## Writing Queries

### Context using tracks

A common question when querying tables in trace processor is: "how do I obtain
the process or thread for a slice?". Phrased more generally, the question is
"how do I get the context for an event?".

In trace processor, any context associated with all events on a track is found
on the associated `track` tables.

For example, to obtain the `utid` of any thread which emitted a `measure` slice:

```sql
SELECT utid
FROM slice
JOIN thread_track ON thread_track.id = slice.track_id
WHERE slice.name = 'measure'
```

Similarly, to obtain the `upid`s of any process which has a `mem.swap` counter
greater than 1000:

```sql
SELECT upid
FROM counter
JOIN process_counter_track ON process_counter_track.id = counter.track_id
WHERE process_counter_track.name = 'mem.swap' AND value > 1000
```

If the source and type of the event is known beforehand (which is generally the
case), the following table can be used to find the `track` table to join with:

| Event type | Associated with    | Track table            | Constraint in WHERE clause |
| :--------- | ------------------ | ---------------------- | -------------------------- |
| slice      | N/A (global scope) | track                  | `type = 'track'`           |
| slice      | thread             | thread_track           | N/A                        |
| slice      | process            | process_track          | N/A                        |
| counter    | N/A (global scope) | counter_track          | `type = 'counter_track'`   |
| counter    | thread             | thread_counter_track   | N/A                        |
| counter    | process            | process_counter_track  | N/A                        |
| counter    | cpu                | cpu_counter_track      | N/A                        |

On the other hand, sometimes the source is not known. In this case, joining with
the `track` table and looking up the `type` column will give the exact track
table to join with.

For example, to find the type of track for `measure` events, the following query
could be used.

```sql
SELECT type
FROM slice
JOIN track ON track.id = slice.track_id
WHERE slice.name = 'measure'
```

### Thread and process tables

While obtaining `utid`s and `upid`s is a step in the right direction, generally
users want the original `tid`, `pid`, and process/thread names.

The `thread` and `process` tables map `utid`s and `upid`s to threads and
processes respectively. For example, to look up the thread with `utid` 10:

```sql
SELECT tid, name
FROM thread
WHERE utid = 10
```

The `thread` and `process` tables can also be joined with the associated track
tables to jump directly from the slice or counter to the information about
processes and threads.

For example, to get a list of all the threads which emitted a `measure` slice:

```sql
SELECT thread.name AS thread_name
FROM slice
JOIN thread_track ON slice.track_id = thread_track.id
JOIN thread USING(utid)
WHERE slice.name = 'measure'
GROUP BY thread_name
```
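
The same pattern works for processes; for example, a sketch listing the
processes which emitted a `mem.swap` counter:

```sql
SELECT process.name AS process_name
FROM counter
JOIN process_counter_track ON counter.track_id = process_counter_track.id
JOIN process USING(upid)
WHERE process_counter_track.name = 'mem.swap'
GROUP BY process_name
```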

## Helper functions
Helper functions are functions built into trace processor in C++ which reduce
the amount of boilerplate which needs to be written in SQL.

### Extract args
`EXTRACT_ARG` is a helper function which retrieves a property of an
event (e.g. slice or counter) from the `args` table.

It takes an `arg_set_id` and `key` as input and returns the value looked
up in the `args` table.

For example, to retrieve the `prev_comm` field for `sched_switch` events in
the `raw` table:
```sql
SELECT EXTRACT_ARG(arg_set_id, 'prev_comm')
FROM raw
WHERE name = 'sched_switch'
```

Behind the scenes, the above query would desugar to the following:
```sql
SELECT
  (
    SELECT string_value
    FROM args
    WHERE key = 'prev_comm' AND args.arg_set_id = raw.arg_set_id
  )
FROM raw
WHERE name = 'sched_switch'
```

NOTE: while convenient, `EXTRACT_ARG` can be inefficient compared to a `JOIN`
when working with very large tables; a function call is required for every
row, which will be slower than the batch filters/sorts used by `JOIN`.

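Where this matters, the same lookup can be written as an explicit join against
the `args` table; a sketch equivalent to the `EXTRACT_ARG` example above:

```sql
SELECT args.string_value AS prev_comm
FROM raw
JOIN args USING(arg_set_id)
WHERE raw.name = 'sched_switch' AND args.key = 'prev_comm'
```
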
## Operator tables
SQL queries are usually sufficient to retrieve data from trace processor.
Sometimes though, certain constructs can be difficult to express in pure SQL.

In these situations, trace processor has special "operator tables" which solve
a particular problem in C++ but expose an SQL interface for queries to take
advantage of.

### Span join
Span join is a custom operator table which computes the intersection of
spans of time from two tables or views. A column (called the *partition*)
can optionally be specified which divides the rows from each table into
partitions before computing the intersection.

![Span join block diagram](/docs/images/span-join.png)

```sql
-- Get all the scheduling slices.
CREATE VIEW sp_sched AS
SELECT ts, dur, cpu, utid
FROM sched;

-- Get all the cpu frequency slices.
CREATE VIEW sp_frequency AS
SELECT
  ts,
  lead(ts) OVER (PARTITION BY cpu ORDER BY ts) - ts AS dur,
  cpu,
  value AS freq
FROM counter;

-- Create the span joined table which combines cpu frequency with
-- scheduling slices.
CREATE VIRTUAL TABLE sched_with_frequency
USING SPAN_JOIN(sp_sched PARTITIONED cpu, sp_frequency PARTITIONED cpu);

-- This span joined table can be queried as normal and has the columns from both
-- tables.
SELECT ts, dur, cpu, utid, freq
FROM sched_with_frequency
```

NOTE: A partition can be specified on neither, either or both tables. If
specified on both, the same column name has to be specified on each table.

WARNING: An important restriction on span joined tables is that spans from
the same table in the same partition *cannot* overlap. For performance
reasons, span join does not attempt to detect and error out in this situation;
instead, incorrect rows will silently be produced.

### Ancestor slice
ancestor_slice is a custom operator table that takes a
[slice table's id column](/docs/analysis/sql-tables.autogen#slice) and computes
all slices on the same track that are direct parents above that id (i.e. given
a slice id it will return as rows all slices that can be found by following
the parent_id column up to the top slice (depth = 0)).

The returned format is the same as the
[slice table](/docs/analysis/sql-tables.autogen#slice).

For example, the following finds the top level slice given a bunch of slices of
interest.

```sql
CREATE VIEW interesting_slices AS
SELECT id, ts, dur, track_id
FROM slice WHERE name LIKE "%interesting slice name%";

SELECT
  *
FROM
  interesting_slices LEFT JOIN
  ancestor_slice(interesting_slices.id) AS ancestor ON ancestor.depth = 0
```

### Descendant slice
descendant_slice is a custom operator table that takes a
[slice table's id column](/docs/analysis/sql-tables.autogen#slice) and
computes all slices on the same track that are nested under that id (i.e.
all slices that are on the same track in the same time frame with a depth
greater than the given slice's depth).

The returned format is the same as the
[slice table](/docs/analysis/sql-tables.autogen#slice).

For example, the following finds the number of slices under each slice of
interest.

```sql
CREATE VIEW interesting_slices AS
SELECT id, ts, dur, track_id
FROM slice WHERE name LIKE "%interesting slice name%";

SELECT
  *,
  (
    SELECT COUNT(*)
    FROM descendant_slice(interesting_slices.id)
  ) AS total_descendants
FROM interesting_slices
```

### Connected/Following/Preceding flows

DIRECTLY_CONNECTED_FLOW, FOLLOWING_FLOW and PRECEDING_FLOW are custom operator
tables that take a
[slice table's id column](/docs/analysis/sql-tables.autogen#slice) and collect
all entries of the [flow table](/docs/analysis/sql-tables.autogen#flow) that are
directly or indirectly connected to the given starting slice.

`DIRECTLY_CONNECTED_FLOW(start_slice_id)` - contains all entries of the
[flow table](/docs/analysis/sql-tables.autogen#flow) that are present in any
chain of kind: `flow[0] -> flow[1] -> ... -> flow[n]`, where
`flow[i].slice_out = flow[i+1].slice_in` and `flow[0].slice_out = start_slice_id
OR start_slice_id = flow[n].slice_in`.

NOTE: Unlike the following/preceding flow functions, this function will not
include flows connected to ancestors or descendants while searching for flows
from a slice. It only includes the slices in the directly connected chain.

`FOLLOWING_FLOW(start_slice_id)` - contains all flows which can be reached from
a given slice via recursively following from a flow's outgoing slice to its
incoming one and from a reached slice to its child. The returned table contains
all entries of the [flow table](/docs/analysis/sql-tables.autogen#flow) that are
present in any chain of kind: `flow[0] -> flow[1] -> ... -> flow[n]`, where
`flow[i+1].slice_out IN DESCENDANT_SLICE(flow[i].slice_in) OR
flow[i+1].slice_out = flow[i].slice_in` and `flow[0].slice_out IN
DESCENDANT_SLICE(start_slice_id) OR flow[0].slice_out = start_slice_id`.

`PRECEDING_FLOW(start_slice_id)` - contains all flows which can be reached from
a given slice via recursively following from a flow's incoming slice to its
outgoing one and from a reached slice to its parent. The returned table contains
all entries of the [flow table](/docs/analysis/sql-tables.autogen#flow) that are
present in any chain of kind: `flow[n] -> flow[n-1] -> ... -> flow[0]`, where
`flow[i].slice_in IN ANCESTOR_SLICE(flow[i+1].slice_out) OR flow[i].slice_in =
flow[i+1].slice_out` and `flow[0].slice_in IN ANCESTOR_SLICE(start_slice_id) OR
flow[0].slice_in = start_slice_id`.

```sql
-- Number of following flows for each slice.
SELECT (SELECT COUNT(*) FROM FOLLOWING_FLOW(slice_id)) AS following FROM slice;
```
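
A similar query works in the other direction; mirroring the query above, the
following sketch counts the preceding flows for each slice:

```sql
-- Number of preceding flows for each slice.
SELECT (SELECT COUNT(*) FROM PRECEDING_FLOW(slice_id)) AS preceding FROM slice;
```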

## Metrics

TIP: To see how to add a new metric to trace processor, see the checklist
[here](/docs/contributing/common-tasks.md#new-metric).

The metrics subsystem is a significant part of trace processor and thus is
documented on its own [page](/docs/analysis/metrics.md).

## Annotations

TIP: To see how to add a new annotation to trace processor, see the
checklist [here](/docs/contributing/common-tasks.md#new-annotation).

Annotations attach a human-readable description to a slice in the trace. This
can include information like the source of a slice, why a slice is important and
links to documentation where the viewer can learn more about the slice.
In essence, descriptions act as if an expert was telling the user what the slice
means.

For example, consider the `inflate` slice which occurs during view inflation in
Android. We can add the following description and link:

**Description**: Constructing a View hierarchy from pre-processed XML via
LayoutInflater#layout. This includes constructing all of the View objects in the
hierarchy, and applying styled attributes.

## Creating derived events

TIP: To see how to add a new annotation to trace processor, see the
checklist [here](/docs/contributing/common-tasks.md#new-annotation).

This feature allows creation of new events (slices and counters) from the data
in the trace. These events can then be displayed in the UI tracks as if they
were part of the trace itself.

This is useful as often the data in the trace is very low-level. While
low-level information is important for experts to perform deep debugging, often
users are just looking for a high-level overview without needing to consider
events from multiple locations.

For example, an app startup in Android spans multiple components including
`ActivityManager`, `system_server`, and the newly created app process derived
from `zygote`. Most users do not need this level of detail; they are only
interested in a single slice spanning the entire startup.

Creating derived events is tied very closely to the
[metrics subsystem](/docs/analysis/metrics.md); often SQL-based metrics need to
create higher-level abstractions from raw events as intermediate artifacts.

From the previous example, the
[startup metric](/src/trace_processor/metrics/android/android_startup.sql)
creates the exact `launching` slice we want to display in the UI.

The other benefit of aligning the two is that changes in metrics are
automatically kept in sync with what the user sees in the UI.

## Alerts

Alerts are used to draw the attention of the user to interesting parts of the
trace; these are usually warnings or errors about anomalies which occurred in
the trace.

Currently, alerts are not implemented in the trace processor but the API to
create derived events was designed with them in mind. We plan on adding another
column `alert_type` (name to be finalized) to the annotations table which can
have the value `warning`, `error` or `null`. Depending on this value, the
Perfetto UI will flag these events to the user.

NOTE: we do not plan on supporting the case where alerts need to be added to
existing events. Instead, new events should be created using annotations and
alerts added to these instead; this is because the trace processor storage is
monotonic-append-only.

## Python API

The trace processor Python API is built on the existing HTTP interface of
`trace_processor` and is available as part of the standalone build. The API
allows you to load in traces, query tables and run metrics without requiring
the `trace_processor` binary to be downloaded or installed.

### Setup
```
pip install perfetto
```
NOTE: The API is only compatible with Python 3.

```python
from perfetto.trace_processor import TraceProcessor
# Initialise TraceProcessor with a trace file
tp = TraceProcessor(file_path='trace.perfetto-trace')
```

NOTE: The TraceProcessor can be initialized in a combination of ways including:
      <br> - An address at which there exists a running instance of
      `trace_processor` with a loaded trace
      (e.g. `TraceProcessor(addr='localhost:9001')`)
      <br> - An address at which there exists a running instance of
      `trace_processor` which needs a trace to be loaded in
      (e.g. `TraceProcessor(addr='localhost:9001', file_path='trace.perfetto-trace')`)
      <br> - A path to a `trace_processor` binary and the trace to be loaded in
      (e.g. `TraceProcessor(bin_path='./trace_processor', file_path='trace.perfetto-trace')`)


### API

The `trace_processor.api` module contains the `TraceProcessor` class which
provides various functions that can be called on the loaded trace. For more
information on how to use these functions, see this
[`example`](/src/trace_processor/python/example.py).

#### Query
The query() function takes an SQL query as input and returns an iterator
through the rows of the result.

```python
from perfetto.trace_processor import TraceProcessor
tp = TraceProcessor(file_path='trace.perfetto-trace')

qr_it = tp.query('SELECT ts, dur, name FROM slice')
for row in qr_it:
  print(row.ts, row.dur, row.name)
```
**Output**
```
261187017446933 358594 eglSwapBuffersWithDamageKHR
261187017518340 357 onMessageReceived
261187020825163 9948 queueBuffer
261187021345235 642 bufferLoad
261187121345235 153 query
...
```
The `QueryResultIterator` can also be converted to a Pandas DataFrame, although
this requires you to have both the `NumPy` and `Pandas` modules installed.
```python
from perfetto.trace_processor import TraceProcessor
tp = TraceProcessor(file_path='trace.perfetto-trace')

qr_it = tp.query('SELECT ts, dur, name FROM slice')
qr_df = qr_it.as_pandas_dataframe()
print(qr_df.to_string())
```
**Output**
```
ts                   dur                  name
-------------------- -------------------- ---------------------------
     261187017446933               358594 eglSwapBuffersWithDamageKHR
     261187017518340                  357 onMessageReceived
     261187020825163                 9948 queueBuffer
     261187021345235                  642 bufferLoad
     261187121345235                  153 query
...
```
Furthermore, you can use the query result in a Pandas DataFrame format to easily
make visualisations from the trace data.
```python
import numpy as np

from perfetto.trace_processor import TraceProcessor
tp = TraceProcessor(file_path='trace.perfetto-trace')

qr_it = tp.query('SELECT ts, value FROM counter WHERE track_id=50')
qr_df = qr_it.as_pandas_dataframe()
qr_df = qr_df.replace(np.nan, 0)
qr_df.set_index('ts')['value'].plot()
```
**Output**

![Graph made from the query results](/docs/images/example_pd_graph.png)


#### Metric
The metric() function takes in a list of trace metrics and returns the results as a Protobuf.

```python
from perfetto.trace_processor import TraceProcessor
tp = TraceProcessor(file_path='trace.perfetto-trace')

ad_cpu_metrics = tp.metric(['android_cpu'])
print(ad_cpu_metrics)
```
**Output**
```
metrics {
  android_cpu {
    process_info {
      name: "/system/bin/init"
      threads {
        name: "init"
        core {
          id: 1
          metrics {
            mcycles: 1
            runtime_ns: 570365
            min_freq_khz: 1900800
            max_freq_khz: 1900800
            avg_freq_khz: 1902017
          }
        }
        core {
          id: 3
          metrics {
            mcycles: 0
            runtime_ns: 366406
            min_freq_khz: 1900800
            max_freq_khz: 1900800
            avg_freq_khz: 1902908
          }
        }
        ...
      }
      ...
    }
    process_info {
      name: "/system/bin/logd"
      threads {
        name: "logd.writer"
        core {
          id: 0
          metrics {
            mcycles: 8
            runtime_ns: 33842357
            min_freq_khz: 595200
            max_freq_khz: 1900800
            avg_freq_khz: 1891825
          }
        }
        core {
          id: 1
          metrics {
            mcycles: 9
            runtime_ns: 36019300
            min_freq_khz: 1171200
            max_freq_khz: 1900800
            avg_freq_khz: 1887969
          }
        }
        ...
      }
      ...
    }
    ...
  }
}
```

### HTTP
The `trace_processor.http` module contains the `TraceProcessorHttp` class which
provides methods to make HTTP requests to an address at which there already
exists a running instance of `trace_processor` with a trace loaded in. All
results are returned in Protobuf format
(see [`trace_processor_proto`](/protos/perfetto/trace_processor/trace_processor.proto)).
Some functions include:
* `execute_query()` - Takes in an SQL query and returns a `QueryResult` Protobuf
  message
* `compute_metric()` - Takes in a list of trace metrics and returns a
  `ComputeMetricResult` Protobuf message
* `status()` - Returns a `StatusResult` Protobuf message

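A minimal sketch of using this class, assuming an already-running
`trace_processor` instance is serving a loaded trace on `localhost:9001` and
that `TraceProcessorHttp` is constructed with that address (check the module
for the exact import path and signatures):

```python
from perfetto.trace_processor.http import TraceProcessorHttp

# Assumes a trace_processor instance with a loaded trace is already serving
# its HTTP interface on this address.
tph = TraceProcessorHttp('localhost:9001')
print(tph.status())  # StatusResult protobuf message
qr = tph.execute_query('SELECT COUNT(*) AS slice_count FROM slice')
```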

## Testing

Trace processor is mainly tested in two ways:
1. Unit tests of low-level building blocks
2. "Diff" tests which parse traces and check the output of queries

### Unit tests
Unit testing trace processor is the same as in other parts of Perfetto and
other C++ projects. However, unlike the rest of Perfetto, unit testing is
relatively light in trace processor.

We have discovered over time that unit tests are generally too brittle
when dealing with code which parses traces, leading to painful, mechanical
changes being needed when refactorings happen.

Because of this, we choose to focus on diff tests for most areas (e.g.
parsing events, testing schema of tables, testing metrics etc.) and only
use unit testing for the low-level building blocks on which the rest of
trace processor is built.

### Diff tests
Diff tests are essentially integration tests for trace processor and the
main way trace processor is tested.

Each diff test takes as input a) a trace file and b) a query file *or* a metric
name. It runs `trace_processor_shell` to parse the trace and then executes
the query/metric. The result is then compared to a 'golden' file and any
difference is highlighted.

All diff tests are organized under [test/trace_processor](/test/trace_processor)
and are run by the script
[`tools/diff_test_trace_processor.py`](/tools/diff_test_trace_processor.py).
New tests can be added with the helper script
[`tools/add_tp_diff_test.py`](/tools/add_tp_diff_test.py).

NOTE: `trace_processor_shell` and the associated proto descriptors need to be
built before running `tools/diff_test_trace_processor.py`. The easiest way
to do this is to run `tools/ninja -C <out directory>` both initially and on
every change to trace processor code or builtin metrics.

#### Choosing where to add diff tests
When adding a new test with `tools/add_tp_diff_test.py`, the user is
prompted for a folder to add the new test to. Often this can be confusing
as a test can fall into more than one category. This section is a guide
to help decide which folder to choose.

Broadly, there are two categories which all folders fall into:
1. __"Area" folders__ which encompass a "vertical" area of interest,
   e.g. startup/ contains Android app startup related tests and chrome/
   contains all Chrome related tests.
2. __"Feature" folders__ which encompass a particular feature of
   trace processor, e.g. process_tracking/ tests the lifetime tracking of
   processes and span_join/ tests the span join operator.

"Area" folders should be preferred for adding tests unless the test is
applicable to more than one "area"; in this case, one of the "feature" folders
can be used instead.

Here are some common scenarios in which new tests may be added and
answers on where to add the test:

__Scenario__: A new event is being parsed, the focus of the test is to ensure
the event is being parsed correctly, and the event is focused on a single
vertical "Area".

_Answer_: Add the test in one of the "Area" folders.

__Scenario__: A new event is being parsed, the focus of the test is to ensure
the event is being parsed correctly, and the event is applicable to more than
one vertical "Area".

_Answer_: Add the test to the parsing/ folder.

__Scenario__: A new metric is being added and the focus of the test is to
ensure the metric is being correctly computed.

_Answer_: Add the test in one of the "Area" folders.

__Scenario__: A new dynamic table is being added and the focus of the test is
to ensure the dynamic table is being correctly computed.

_Answer_: Add the test to the dynamic/ folder.

__Scenario__: The internals of trace processor are being modified and the test
is to ensure the trace processor is correctly filtering/sorting important
built-in tables.

_Answer_: Add the test to the tables/ folder.


## Appendix: table inheritance

Concretely, the rules for inheritance between tables are as follows:

* Every row in a table has an `id` which is unique for a hierarchy of tables.
  * For example, every `track` will have an `id` which is unique among all
    tracks (regardless of the type of track)
* If a table C inherits from P, each row in C will also be in P _with the same
  id_.
  * This allows for ids to act as "pointers" to rows; lookups by id can be
    performed on any table which has that row
  * For example, every `process_counter_track` row will have a matching row in
    `counter_track` which will itself have matching rows in `track`
* If a table C with columns `A` and `B` inherits from P with column `A`, `A`
  will have the same data in both C and P.
  * For example, suppose
    * `process_counter_track` has columns `name`, `unit` and `upid`
    * `counter_track` has `name` and `unit`
    * `track` has `name`
  * Every row in `process_counter_track` will have the same `name` for the row
    with the same id in `track` and `counter_track`
  * Similarly, every row in `process_counter_track` will have both the same
    `name` and `unit` for the row with the same id in `counter_track`
* Every row in a table has a `type` column. This specifies the _most specific_
  table this row belongs to.
  * This allows _dynamic casting_ of a row to its most specific type.
  * For example, if a row in the `track` table is actually a
    `process_counter_track`, its `type` column will be `process_counter_track`.
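
As a sketch of these rules in SQL (using the columns from the example above),
the shared `id` and the `type` column together allow a row in the root table to
be "cast" to its most specific table:

```sql
-- For rows of track whose most specific type is process_counter_track, the
-- shared id also gives access to the type-specific unit and upid columns.
SELECT track.id, track.name, process_counter_track.unit, process_counter_track.upid
FROM track
JOIN process_counter_track USING(id)
WHERE track.type = 'process_counter_track'
```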