
 # pprof Support in Perfetto

 _**Status:** COMPLETED **·** lalitm **·** 2025-09-30_

 ## Objective

 Add support for importing pprof files into Perfetto Trace Processor and visualizing them with flame graphs in the Perfetto UI. This enables analysis, within the Perfetto ecosystem, of CPU and heap profiles from Go, C++, and other tools that emit the pprof format.

 ## Overview

 This feature extends Perfetto's trace analysis capabilities to include non-time-based aggregate profiling data. Unlike the existing profiling support, which is integrated with timeline-based traces, pprof data represents standalone aggregate samples that are independent of time.

 ```mermaid
 graph LR
     A[pprof file] --> B[PprofTraceReader]
     B --> C[aggregate_profile table]
     B --> D[aggregate_sample table]
     B --> E[stack_profile_* tables]
     C --> F[UI: Scope/Metric Selection]
     D --> F
     E --> F
     F --> G[Interactive Flamegraph]
 ```

 The implementation builds upon existing Perfetto infrastructure:
 - **Database layer**: Extends existing `stack_profile_*` tables with new aggregate tables
 - **Import pipeline**: Follows the established `TraceType` + `TraceReader` pattern
 - **UI layer**: Leverages existing flame graph visualization components

 ### Requirements

 **Zero-setup analysis:** A pprof file can be analyzed with a single command or drag-and-drop.

 **Full format support:** Support gzipped and uncompressed pprof protobuf files from any pprof-compatible tool.

 **Multiple metrics per file:** Handle pprof files containing multiple value types (e.g., CPU samples + allocation counts) in a single visualization.

 **Interactive flame graphs:** Provide full interactivity including zoom, search, and source location attribution where available.

 **No timeline confusion:** Keep pprof data completely separate from time-based trace analysis to avoid user confusion.

 ## Detailed Design

 ### File Format Support

 The implementation supports the standard pprof format as defined by [Google's pprof tool](https://github.com/google/pprof/blob/main/proto/profile.proto):

 **Gzipped format:** Files compressed with gzip, which is what most profiling tools generate.

 **Raw protobuf:** Uncompressed protobuf files for development and testing.

 **Profile structure:** Full support for the Profile protobuf message including:
 - String table for deduplicated strings
 - Sample data with location hierarchies
 - Function and mapping metadata
 - Multiple value types (CPU samples, allocations, etc.)

 ### Import Architecture

 #### File Detection

 The import pipeline automatically detects pprof files through a two-stage process:

 1. **Gzip detection:** Recognize gzipped files by magic bytes (`1f 8b`)
 2. **Protobuf validation:** After decompression, validate the pprof structure by checking for a `Profile` message with a `sample_type` field
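
The two checks above can be sketched as follows. This is a minimal illustration, not the Trace Processor implementation: the helper names `isGzip` and `looksLikePprof` are hypothetical, and the assumption that inspecting only the first field tag is sufficient is mine. It relies on `sample_type` being field 1 of `Profile` in `profile.proto`, encoded as a length-delimited field.

```typescript
// Hypothetical helpers for illustration; these are not Perfetto APIs.

// A gzip stream starts with the two magic bytes 0x1f 0x8b.
function isGzip(bytes: Uint8Array): boolean {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// In profile.proto, `sample_type` is field 1 and is a length-delimited
// message, so its tag byte is (1 << 3) | 2 = 0x0a.
// Assumption: looking at the first tag only is a cheap heuristic; a real
// detector would likely validate more of the message.
function looksLikePprof(decompressed: Uint8Array): boolean {
  const sampleTypeTag = (1 << 3) | 2;
  return decompressed.length > 0 && decompressed[0] === sampleTypeTag;
}
```

Protobuf encoders usually emit fields in field-number order, but the wire format does not require it, which is why this check should be read as a heuristic rather than a guarantee.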

 #### PprofTraceReader

 ```cpp
 class PprofTraceReader : public ChunkedTraceReader {
  public:
   explicit PprofTraceReader(TraceProcessorContext* context);

   base::Status Parse(TraceBlobView blob) override;
   base::Status NotifyEndOfFile() override;

  private:
   base::Status ParseProfile();

   TraceProcessorContext* context_;
   std::vector<uint8_t> buffer_;
 };
 ```

 The reader accumulates pprof data into an internal buffer and parses the complete protobuf message upon EOF notification.
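
The accumulate-then-parse strategy can be illustrated with a small TypeScript analogue. The real reader is the C++ class shown above; `BufferingReader` and its callback are hypothetical names used only to show why parsing is deferred to EOF: a protobuf message cannot be decoded reliably from an arbitrary prefix of its bytes.

```typescript
// Hypothetical TypeScript analogue of the buffering strategy.
class BufferingReader {
  private chunks: Uint8Array[] = [];
  private size = 0;
  private onComplete: (whole: Uint8Array) => void;

  constructor(onComplete: (whole: Uint8Array) => void) {
    this.onComplete = onComplete;
  }

  // Called once per incoming chunk; only stores the bytes.
  parse(chunk: Uint8Array): void {
    this.chunks.push(chunk);
    this.size += chunk.length;
  }

  // Called once at end of file; concatenates all chunks and hands the
  // complete message to the parser callback.
  notifyEndOfFile(): void {
    const whole = new Uint8Array(this.size);
    let offset = 0;
    for (const chunk of this.chunks) {
      whole.set(chunk, offset);
      offset += chunk.length;
    }
    this.chunks = [];
    this.size = 0;
    this.onComplete(whole);
  }
}
```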

 ### Database Schema

 #### New Tables

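 The implementation introduces two new tables that integrate with the existing stack profiling infrastructure. The schema below uses short names; the queries elsewhere in this document address the same tables as `__intrinsic_aggregate_profile` and `__intrinsic_aggregate_sample`: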

 ```sql
 -- Metadata for each profiling metric from pprof files
 CREATE TABLE aggregate_profile (
   id INTEGER PRIMARY KEY,
   scope TEXT,              -- file identifier (e.g., "cpu.pprof")
   name TEXT,               -- display name (e.g., "pprof cpu")
   sample_type_type TEXT,   -- pprof ValueType.type (e.g., "cpu")
   sample_type_unit TEXT    -- pprof ValueType.unit (e.g., "nanoseconds")
 );

 -- Sample values aggregated by callsite
 CREATE TABLE aggregate_sample (
   id INTEGER PRIMARY KEY,
   aggregate_profile_id INTEGER,  -- FK to aggregate_profile
   callsite_id INTEGER,           -- FK to stack_profile_callsite
   value REAL                     -- sample count/value
 );
 ```

 #### Integration with Existing Infrastructure

 - **stack_profile_frame:** Stores function name and source file information
 - **stack_profile_callsite:** Maintains call stack hierarchy from root to leaf
 - **stack_profile_mapping:** Contains binary/library mapping information

 Each pprof location becomes a frame, callsites represent the full call chain from root to leaf, and samples aggregate values at each callsite.

 ### Data Processing Pipeline

 #### Step 1: String Table Parsing

 All pprof files use a string table for deduplication. The importer builds a vector of strings from the protobuf `string_table` field.
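
As a sketch of how string-table indirection works: in `profile.proto`, string-valued fields such as `ValueType.type` and `ValueType.unit` are stored as integer indices into `string_table`. The helper names below are hypothetical, and the example table is made up for illustration.

```typescript
// Hypothetical helpers for illustration; these are not Perfetto APIs.

// Resolves an index into the string table, rejecting out-of-range values.
function resolveString(stringTable: readonly string[], index: number): string {
  const value = stringTable[index];
  if (value === undefined) {
    throw new RangeError(`string_table index ${index} is out of range`);
  }
  return value;
}

// A pprof ValueType stores indices, not strings.
interface ValueTypeIndices {
  type: number;
  unit: number;
}

function describeValueType(
  stringTable: readonly string[],
  valueType: ValueTypeIndices,
): { type: string; unit: string } {
  return {
    type: resolveString(stringTable, valueType.type),
    unit: resolveString(stringTable, valueType.unit),
  };
}

// Made-up example table; index 0 is the empty string, as the format requires.
const exampleTable = ["", "cpu", "nanoseconds"];
const described = describeValueType(exampleTable, { type: 1, unit: 2 });
```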

 #### Step 2: Mapping and Function Creation

 For each pprof `Mapping` and `Function`:
 - Extract binary name, build ID, and memory ranges
 - Create entries in `stack_profile_mapping` and populate frame metadata
 - Build lookup tables for location resolution

 #### Step 3: Location Processing

 Each pprof `Location` represents a program counter with optional debug information:
 - Map addresses to existing or dummy memory mappings
 - Extract function names from associated line information
 - Create `stack_profile_frame` entries with relative PCs
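
A rough sketch of the mapping lookup and relative-PC computation follows. It rests on two assumptions that the text above does not spell out: that the relative PC is the location address minus the mapping's `memory_start`, and that a location whose `mapping_id` is 0 (unset in pprof) or unknown falls back to a single dummy mapping. All names are hypothetical, and plain `number` is used for brevity even though pprof addresses are 64-bit.

```typescript
// Hypothetical types and helper for illustration; not Perfetto APIs.
interface PprofMapping {
  id: number; // nonzero in valid pprof files
  memoryStart: number;
  filename: string;
}

interface PprofLocation {
  id: number;
  mappingId: number; // 0 means "not known"
  address: number;
}

interface ResolvedFrame {
  mappingName: string;
  relPc: number;
}

const DUMMY_MAPPING_NAME = "[unknown]";

function resolveFrame(
  location: PprofLocation,
  mappingsById: Record<number, PprofMapping | undefined>,
): ResolvedFrame {
  const mapping =
    location.mappingId === 0 ? undefined : mappingsById[location.mappingId];
  if (mapping === undefined) {
    // Assumption: without a mapping, the address is kept as-is.
    return { mappingName: DUMMY_MAPPING_NAME, relPc: location.address };
  }
  // Assumption: the relative PC is the offset from the mapping start.
  return {
    mappingName: mapping.filename,
    relPc: location.address - mapping.memoryStart,
  };
}
```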

 #### Step 4: Sample Processing

 For each pprof `Sample`:
 - Build complete callsite hierarchy from location chain (reversing pprof leaf-first order)
 - Create aggregate entries for each value type in the sample
 - Link samples to callsites through `aggregate_sample` table

 ```
 Pprof Sample → Location IDs [3,2,1] (leaf first)
              ↓
 Perfetto Callsite hierarchy: 1 → 2 → 3 (root to leaf)
                             ↓
 Multiple aggregate_sample entries (one per value type)
 ```
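
The reversal and per-value-type fan-out can be sketched like this. It is an illustration under simplifying assumptions, not the importer's code: all names are hypothetical, location IDs stand in directly for frame IDs, and callsites are deduplicated by their `(parent, frame)` pair so that stacks sharing a prefix share callsites.

```typescript
// Hypothetical sketch; names are illustrative and not Perfetto APIs.
interface Callsite {
  id: number;
  parentId: number | null;
  frameId: number;
}

interface AggregateSampleRow {
  valueTypeIndex: number; // which sample_type the value belongs to
  callsiteId: number;
  value: number;
}

class CallsiteInterner {
  callsites: Callsite[] = [];
  private byKey: Record<string, Callsite | undefined> = {};

  // Takes pprof's leaf-first location IDs and returns the leaf callsite id,
  // or null for an empty stack.
  internStack(leafFirstLocationIds: readonly number[]): number | null {
    let parentId: number | null = null;
    // Walk root-first so every callsite can point at its parent.
    for (const frameId of leafFirstLocationIds.slice().reverse()) {
      const key = `${parentId}:${frameId}`;
      let callsite = this.byKey[key];
      if (callsite === undefined) {
        callsite = { id: this.callsites.length, parentId, frameId };
        this.callsites.push(callsite);
        this.byKey[key] = callsite;
      }
      parentId = callsite.id;
    }
    return parentId;
  }
}

// Emits one row per value type, all attached to the sample's leaf callsite.
function sampleToRows(
  interner: CallsiteInterner,
  sample: { locationIds: number[]; values: number[] },
): AggregateSampleRow[] {
  const callsiteId = interner.internStack(sample.locationIds);
  if (callsiteId === null) return [];
  return sample.values.map((value, valueTypeIndex) => ({
    valueTypeIndex,
    callsiteId,
    value,
  }));
}
```

With this scheme, two samples whose stacks are `[3,2,1]` and `[4,2,1]` share the callsites for frames 1 and 2 and differ only in the leaf.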

 ### UI Implementation

 #### PprofPage Component

 The UI provides a dedicated page for pprof analysis accessible from the main navigation. The page automatically discovers available data and provides interactive controls.

 #### Dynamic Data Discovery

 Upon loading, the UI queries the database to discover:

 1. **Available scopes** (typically one per imported pprof file)
 2. **Available metrics** within each scope (CPU, allocations, etc.)
 3. **Sample data** for the selected scope/metric combination

 ```typescript
 // Discover available pprof data
 const scopesResult = await trace.engine.query(`
   SELECT DISTINCT scope FROM __intrinsic_aggregate_profile ORDER BY scope
 `);

 // Load metrics for selected scope
 const metricsResult = await trace.engine.query(`
   SELECT sample_type_type, sample_type_unit
   FROM __intrinsic_aggregate_profile
   WHERE scope = '${selectedScope}'
 `);
 ```
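
The queries above interpolate `selectedScope` directly into SQL. Because a scope is derived from a file name, it may contain a single quote, which would break the statement. One way to guard against that is to quote the value as a SQL string literal first. `sqlStringLiteral` below is a hypothetical helper written for this sketch, not an existing Perfetto function, and the scope value is made up.

```typescript
// Hypothetical helper: quotes a value as a SQL string literal by doubling
// any embedded single quotes, which is how SQLite escapes them.
function sqlStringLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Made-up scope containing a quote character.
const exampleScope = "bob's cpu.pprof";

// The helper adds the surrounding quotes, so the template has none.
const exampleMetricsQuery = `
  SELECT sample_type_type, sample_type_unit
  FROM __intrinsic_aggregate_profile
  WHERE scope = ${sqlStringLiteral(exampleScope)}
`;
```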

 #### Flamegraph Integration

 The implementation reuses Perfetto's existing `QueryFlamegraph` component with dynamically generated metrics:

 ```typescript
 const flamegraphMetrics = metricsFromTableOrSubquery(
   `
     WITH metrics AS MATERIALIZED (
       SELECT
         callsite_id,
         sum(sample.value) AS self_value
       FROM __intrinsic_aggregate_sample sample
       JOIN __intrinsic_aggregate_profile profile
         ON sample.aggregate_profile_id = profile.id
       WHERE profile.scope = '${scope}'
         AND profile.sample_type_type = '${metric}'
       GROUP BY callsite_id
     )
     SELECT
       c.id,
       c.parent_id as parentId,
       c.name,
       c.mapping_name,
       coalesce(m.self_value, 0) AS self_value
     FROM _callstacks_for_stack_profile_samples!(metrics) AS c
     LEFT JOIN metrics AS m USING (callsite_id)
   `,
   [{ name: 'Pprof Samples', unit: unit, columnName: 'self_value' }],
   'include perfetto module callstacks.stack_profile'
 );
 ```

 This query leverages the existing `_callstacks_for_stack_profile_samples!` table function to build the complete flamegraph hierarchy while aggregating pprof sample values.
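
The query returns one row per node with its parent and self value. To size the boxes, a flame graph also needs each node's cumulative value: its own self value plus that of all its descendants. The sketch below shows that roll-up for illustration only. It is not how `QueryFlamegraph` is implemented, the names are hypothetical, and it assumes every non-null `parentId` refers to a node in the input.

```typescript
// Hypothetical sketch of rolling self values up the callsite tree.
interface FlameNode {
  id: number;
  parentId: number | null;
  selfValue: number;
}

function cumulativeValues(nodes: readonly FlameNode[]): Record<number, number> {
  const byId: Record<number, FlameNode | undefined> = {};
  const total: Record<number, number> = {};
  for (const node of nodes) {
    byId[node.id] = node;
    total[node.id] = node.selfValue;
  }
  // Add each node's self value to every one of its ancestors.
  for (const node of nodes) {
    let parentId = node.parentId;
    while (parentId !== null) {
      const parent = byId[parentId];
      if (parent === undefined) break; // dangling parent: stop climbing
      total[parent.id] += node.selfValue;
      parentId = parent.parentId;
    }
  }
  return total;
}
```

For a root with self value 0 and children with self values 5 and 2, where the first child has its own child with self value 3, the root's cumulative value is 10 and the first child's is 8.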

 ### Usage

 #### Command Line Analysis

 ```bash
 # Analyze a pprof file directly
 $ trace_processor_shell profile.pprof

 # Query available metrics
 > SELECT scope, sample_type_type, sample_type_unit
   FROM __intrinsic_aggregate_profile;

 # Examine sample data
 > SELECT COUNT(*) FROM __intrinsic_aggregate_sample
   WHERE aggregate_profile_id = 1;
 ```

 #### Web UI Analysis

 1. **File loading**: Drag and drop a pprof file into the Perfetto UI, or use the file picker
 2. **Automatic detection**: Perfetto recognizes the pprof format and imports the data
 3. **Navigation**: Go to the "Pprof" page from the main navigation
 4. **Interactive analysis**: Select a scope and metric, then explore the flame graph

 #### Multi-metric Files

 For pprof files containing multiple value types (e.g., CPU samples + heap allocations):

 1. **Single import**: All metrics from one file are imported together under the same scope
 2. **Metric switching**: A dropdown in the UI switches between metrics
 3. **Independent analysis**: Each metric is displayed as a separate flame graph

 ## Design Principles

 ### Integration over Replacement

 Rather than building a standalone pprof viewer, this feature integrates pprof analysis into Perfetto's existing infrastructure. This provides:

 **Unified tooling:** Users can analyze pprof data alongside other trace formats using the same UI and SQL interface.

 **Leveraged infrastructure:** Reuses existing flame graph rendering, call stack handling, and database optimization.

 **Consistent UX:** Familiar Perfetto interface for users already using the platform.

 ### Separation of Concerns

 **Timeline independence:** pprof data represents aggregate samples without a time dimension, so it is kept completely separate from timeline-based trace analysis.

 **Static import model:** pprof files are imported once and stored in read-only tables, avoiding complex re-aggregation logic.

 **Format-specific handling:** Dedicated importer handles pprof-specific concepts while mapping to Perfetto's general profiling abstractions.

 ### Minimal Overhead

 **Zero cost when unused:** No impact on existing Perfetto functionality when pprof features are not used.

 **Efficient storage:** Sample values stored in aggregated form, avoiding redundant per-sample overhead.

 **Query optimization:** Leverages existing database indices and table functions for optimal performance.