Status: COMPLETED · lalitm · 2025-09-30
Add support for importing pprof files into Perfetto Trace Processor and visualizing them with flame graphs in the Perfetto UI. This enables analysis of CPU/heap profiles from Go, C++, and other tools that generate pprof format within the Perfetto ecosystem.
This feature extends Perfetto's trace analysis capabilities to include non-time-based aggregate profiling data. Unlike existing profiling support, which is integrated with timeline-based traces, pprof data represents standalone aggregate samples that are independent of time.
    graph LR
        A[pprof file] --> B[PprofTraceReader]
        B --> C[aggregate_profile table]
        B --> D[aggregate_sample table]
        B --> E[stack_profile_* tables]
        C --> F[UI: Scope/Metric Selection]
        D --> F
        E --> F
        F --> G[Interactive Flamegraph]
The implementation builds upon existing Perfetto infrastructure:
The existing stack_profile_* tables, extended with new aggregate tables
The existing TraceType + TraceReader pattern

Zero-setup analysis: A pprof file can be analyzed with a single command or drag-and-drop.
Full format support: Support gzipped and uncompressed pprof protobuf files from any pprof-compatible tool.
Multiple metrics per file: Handle pprof files containing multiple value types (e.g., CPU samples + allocation counts) in a single visualization.
Interactive flame graphs: Provide full interactivity including zoom, search, and source location attribution where available.
No timeline confusion: Keep pprof data completely separate from time-based trace analysis to avoid user confusion.
The implementation supports the standard pprof format as defined by Google's pprof tool:
Gzipped format: Files compressed with gzip, as typically generated by most profiling tools.
Raw protobuf: Uncompressed protobuf files for development and testing.
Profile structure: Full support for the Profile protobuf message including:
The import pipeline automatically detects pprof files through a two-stage process:
Gzip detection: the gzip magic bytes (1f 8b)
Protobuf detection: the sample_type field

    class PprofTraceReader : public ChunkedTraceReader {
     public:
      explicit PprofTraceReader(TraceProcessorContext* context);
      base::Status Parse(TraceBlobView blob) override;
      base::Status NotifyEndOfFile() override;

     private:
      base::Status ParseProfile();

      TraceProcessorContext* context_;
      std::vector<uint8_t> buffer_;
    };
The reader accumulates pprof data into an internal buffer and parses the complete protobuf message upon EOF notification.
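The two-stage detection described above can be sketched as follows. This is a minimal illustration, not the Trace Processor's actual detection code: the `PprofFormat` type and `detectPprofFormat` function are hypothetical names, and the second stage relies on the assumption that an uncompressed profile begins with its `sample_type` field (field 1 of the pprof `Profile` message, whose length-delimited tag byte is 0x0a). Protobuf encoders usually emit fields in field-number order but are not required to, so this check is only a heuristic.

```typescript
// Hypothetical format tags; these names are illustrative, not Perfetto's.
type PprofFormat = 'gzip' | 'raw-protobuf' | 'unknown';

// Profile.sample_type is field 1 with wire type 2 (length-delimited),
// so its protobuf tag byte is (1 << 3) | 2 = 0x0a.
const SAMPLE_TYPE_TAG = 0x0a;

function detectPprofFormat(bytes: Uint8Array): PprofFormat {
  // Stage 1: gzip streams start with the magic bytes 1f 8b.
  if (bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b) {
    return 'gzip';
  }
  // Stage 2: a leading sample_type tag hints at an uncompressed profile.
  if (bytes.length >= 1 && bytes[0] === SAMPLE_TYPE_TAG) {
    return 'raw-protobuf';
  }
  return 'unknown';
}
```

A gzipped input would still need to be decompressed and checked again before it can be treated as a pprof profile, since many other formats are also distributed as gzip.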
The implementation introduces two new tables that integrate with existing stack profiling infrastructure:
    -- Metadata for each profiling metric from pprof files
    CREATE TABLE aggregate_profile (
      id INTEGER PRIMARY KEY,
      scope TEXT,             -- file identifier (e.g., "cpu.pprof")
      name TEXT,              -- display name (e.g., "pprof cpu")
      sample_type_type TEXT,  -- pprof ValueType.type (e.g., "cpu")
      sample_type_unit TEXT   -- pprof ValueType.unit (e.g., "nanoseconds")
    );

    -- Sample values aggregated by callsite
    CREATE TABLE aggregate_sample (
      id INTEGER PRIMARY KEY,
      aggregate_profile_id INTEGER,  -- FK to aggregate_profile
      callsite_id INTEGER,           -- FK to stack_profile_callsite
      value REAL                     -- sample count/value
    );
Each pprof location becomes a frame, callsites represent the full call chain from root to leaf, and samples aggregate values at each callsite.
All pprof files use a string table for deduplication. The importer builds a vector of strings from the protobuf string_table field.
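As a rough sketch of how string-table indirection works: pprof messages such as `Function` store integer indices instead of strings, and index 0 is reserved for the empty string. The `PprofFunction` interface and the helper names below are illustrative stand-ins, not the importer's real types, and they use plain `number` where the protobuf uses 64-bit integers.

```typescript
// Minimal stand-in for the pprof Function message; only the fields used here.
interface PprofFunction {
  id: number;
  name: number;      // index into the string table
  filename: number;  // index into the string table
}

// Resolve an index against the string table, rejecting out-of-range indices.
function lookupString(stringTable: readonly string[], index: number): string {
  const value = stringTable[index];
  if (value === undefined) {
    throw new RangeError(`string index ${index} is out of range`);
  }
  return value;
}

// Turn the index-based message into a human-readable label.
function describeFunction(
  stringTable: readonly string[],
  fn: PprofFunction,
): string {
  const name = lookupString(stringTable, fn.name);
  const filename = lookupString(stringTable, fn.filename);
  return `${name} (${filename})`;
}
```

Validating indices up front keeps a malformed file from silently producing frames with undefined names.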
For each pprof Mapping and Function:
Create entries in stack_profile_mapping and populate frame metadata.

Each pprof Location represents a program counter with optional debug information:
Create stack_profile_frame entries with relative PCs.

For each pprof Sample:
Store the sample values in the aggregate_sample table.

Pprof Sample → Location IDs [3,2,1] (leaf first)
↓
Perfetto Callsite hierarchy: 1 → 2 → 3 (root to leaf)
↓
Multiple aggregate_sample entries (one per value type)
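The conversion in the diagram above can be sketched as follows. This is an illustration under simplifying assumptions, not the importer's code: `CallsiteInterner` and `emitSamples` are hypothetical names, pprof location IDs are used directly as frame IDs, profile IDs are assumed to be the value's position plus one, and values for repeated stacks are emitted as separate rows rather than summed.

```typescript
interface Callsite {
  id: number;
  parentId: number | null;
  frameId: number;
}

interface AggregateSample {
  profileId: number;  // one aggregate_profile row per pprof value type
  callsiteId: number;
  value: number;
}

class CallsiteInterner {
  readonly callsites: Callsite[] = [];
  private readonly byKey = new Map<string, number>();

  // Returns the leaf callsite id for a leaf-first list of location IDs,
  // or null when the stack is empty.
  internStack(leafFirstLocationIds: readonly number[]): number | null {
    let parentId: number | null = null;
    // pprof lists the leaf first, so walk the list in reverse (root to leaf).
    for (const frameId of [...leafFirstLocationIds].reverse()) {
      const key = `${parentId ?? 'root'}/${frameId}`;
      let id = this.byKey.get(key);
      if (id === undefined) {
        // First time this (parent, frame) pair is seen: create a callsite.
        id = this.callsites.length + 1;
        this.callsites.push({id, parentId, frameId});
        this.byKey.set(key, id);
      }
      parentId = id;
    }
    return parentId;
  }
}

function emitSamples(
  interner: CallsiteInterner,
  leafFirstLocationIds: readonly number[],
  values: readonly number[],
): AggregateSample[] {
  const callsiteId = interner.internStack(leafFirstLocationIds);
  if (callsiteId === null) return [];
  // values[i] belongs to sample_type[i]; emit one row per value type.
  return values.map((value, i) => ({profileId: i + 1, callsiteId, value}));
}
```

Keying callsites by (parent, frame) means two stacks that share a root-side prefix reuse the same parent callsites, so only the diverging frames create new rows.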
The UI provides a dedicated page for pprof analysis accessible from the main navigation. The page automatically discovers available data and provides interactive controls.
Upon loading, the UI queries the database to discover:
    // Discover available pprof data
    const scopesResult = await trace.engine.query(`
      SELECT DISTINCT scope
      FROM __intrinsic_aggregate_profile
      ORDER BY scope
    `);

    // Load metrics for selected scope
    const metricsResult = await trace.engine.query(`
      SELECT sample_type_type, sample_type_unit
      FROM __intrinsic_aggregate_profile
      WHERE scope = '${selectedScope}'
    `);
The implementation reuses Perfetto's existing QueryFlamegraph component with dynamically generated metrics:
    const flamegraphMetrics = metricsFromTableOrSubquery(
      `
        WITH metrics AS MATERIALIZED (
          SELECT
            callsite_id,
            sum(sample.value) AS self_value
          FROM __intrinsic_aggregate_sample sample
          JOIN __intrinsic_aggregate_profile profile
            ON sample.aggregate_profile_id = profile.id
          WHERE profile.scope = '${scope}'
            AND profile.sample_type_type = '${metric}'
          GROUP BY callsite_id
        )
        SELECT
          c.id,
          c.parent_id as parentId,
          c.name,
          c.mapping_name,
          coalesce(m.self_value, 0) AS self_value
        FROM _callstacks_for_stack_profile_samples!(metrics) AS c
        LEFT JOIN metrics AS m USING (callsite_id)
      `,
      [{ name: 'Pprof Samples', unit: unit, columnName: 'self_value' }],
      'include perfetto module callstacks.stack_profile'
    );
This query uses the existing _callstacks_for_stack_profile_samples! macro to build the complete flamegraph hierarchy while aggregating pprof sample values per callsite.
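To make the relationship between the query's `self_value` column and the width of a flamegraph node concrete, here is a minimal sketch of rolling self values up the hierarchy. It is not how QueryFlamegraph computes its layout; `FlameNode` and `cumulativeValues` are hypothetical names, and the input is assumed to be a forest with no cycles.

```typescript
interface FlameNode {
  id: number;
  parentId: number | null;
  selfValue: number;
}

// Returns a map from node id to cumulative value (self plus all descendants).
function cumulativeValues(nodes: readonly FlameNode[]): Map<number, number> {
  const byId = new Map<number, FlameNode>();
  const totals = new Map<number, number>();
  for (const node of nodes) {
    byId.set(node.id, node);
    totals.set(node.id, 0);
  }
  for (const node of nodes) {
    // Add this node's self value to itself and to every ancestor.
    let current: FlameNode | undefined = node;
    while (current !== undefined) {
      totals.set(current.id, (totals.get(current.id) ?? 0) + node.selfValue);
      current =
        current.parentId === null ? undefined : byId.get(current.parentId);
    }
  }
  return totals;
}
```

Callsites with no samples of their own (the `coalesce(m.self_value, 0)` rows) still receive a non-zero cumulative value whenever any descendant has samples, which is why they must appear in the hierarchy.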
    # Analyze a pprof file directly
    $ trace_processor_shell profile.pprof

    # Query available metrics
    > SELECT scope, sample_type_type, sample_type_unit FROM __intrinsic_aggregate_profile;

    # Examine sample data
    > SELECT COUNT(*) FROM __intrinsic_aggregate_sample WHERE aggregate_profile_id = 1;
For pprof files containing multiple value types (e.g., CPU samples + heap allocations):
Rather than building a standalone pprof viewer, this feature integrates pprof analysis into Perfetto's existing infrastructure. This provides:
Unified tooling: Users can analyze pprof data alongside other trace formats using the same UI and SQL interface.
Leveraged infrastructure: Reuses existing flame graph rendering, call stack handling, and database optimization.
Consistent UX: Familiar Perfetto interface for users already using the platform.
Timeline independence: pprof data represents aggregate samples without time dimension, kept completely separate from timeline-based trace analysis.
Static import model: pprof files are imported once and stored in read-only tables, avoiding complex re-aggregation logic.
Format-specific handling: Dedicated importer handles pprof-specific concepts while mapping to Perfetto's general profiling abstractions.
Zero cost when unused: No impact on existing Perfetto functionality when pprof features are not used.
Efficient storage: Sample values stored in aggregated form, avoiding redundant per-sample overhead.
Query optimization: Leverages existing database indices and table functions for optimal performance.