# CPU frequency and idle states

This data source is available on Linux and Android (since P).
It records changes in the CPU power management scheme through the
Linux kernel ftrace infrastructure.
It involves three aspects:

#### Frequency scaling

There are two ways to get CPU frequency data:

1. Enabling the `power/cpu_frequency` ftrace event (see
   [TraceConfig](#traceconfig) below). This records an event every time the
   in-kernel cpufreq scaling driver changes the frequency. Note that this is not
   supported on all platforms. In our experience it works reliably on ARM-based
   SoCs but produces no data on most modern Intel-based platforms. This is
   because recent Intel CPUs use an internal DVFS mechanism that is controlled
   directly by the CPU and doesn't expose frequency change events to the kernel.
   Also note that even on ARM-based platforms, the event is emitted only
   when a CPU frequency changes. In many cases the CPU frequency won't
   change for several seconds, which shows up as an empty block at the start
   of the trace.
   We suggest always combining this with polling (below) to get a reliable
   snapshot of the initial frequency.
2. Polling sysfs by enabling the `linux.sys_stats` data source and setting
   `cpufreq_period_ms` to a value > 0. This periodically polls
   `/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq` and records the
   current value in the trace buffer. This works on both Intel and ARM-based
   platforms.

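To make the polling approach concrete, the following is a minimal sketch of what one polling pass over sysfs might look like. It is not Perfetto's implementation: the function name `read_cur_freqs` and its `sysfs_root` parameter are hypothetical, introduced only for illustration. The values in these sysfs files are expressed in kHz.

```python
import glob
import os
import re


def read_cur_freqs(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_index: frequency_khz} for CPUs exposing cpuinfo_cur_freq.

    Hypothetical helper, for illustration only. `sysfs_root` is a parameter
    so that the function can be pointed at a fake directory tree in tests.
    """
    freqs = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "cpuinfo_cur_freq")
    for path in glob.glob(pattern):
        # path looks like <root>/cpu3/cpufreq/cpuinfo_cur_freq
        cpu_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        match = re.fullmatch(r"cpu(\d+)", cpu_dir)
        if match is None:
            continue
        try:
            with open(path) as f:
                freqs[int(match.group(1))] = int(f.read().strip())
        except (OSError, ValueError):
            # The file may be unreadable without sufficient privileges, or the
            # CPU may be offline; skip it rather than fail the whole pass.
            continue
    return freqs
```

Calling `read_cur_freqs()` in a loop with a sleep between iterations approximates what a non-zero `cpufreq_period_ms` asks the data source to do. On many systems reading `cpuinfo_cur_freq` requires elevated privileges, in which case this sketch simply returns an empty dictionary.
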
On most Android devices the frequency scaling is per-cluster (group of
big/little cores) so it's not unusual to see groups of four CPUs changing
frequency at the same time.

#### Available frequencies

It is also possible to record, once per trace, the full list of frequencies
supported by each CPU by enabling the `linux.system_info` data source. This
records `/sys/devices/system/cpu/cpu*/cpufreq/scaling_available_frequencies` when
the trace recording starts. This information is typically used to tell apart
big/little cores by inspecting the
[`cpu_freq` table](/docs/analysis/sql-tables.autogen#cpu_freq).

This is not supported on modern Intel platforms, for the same reasons described
above for `power/cpu_frequency`.

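As an illustration of how the frequency lists can be used to tell big and little cores apart, here is a small sketch that groups CPUs by their highest supported frequency. Using the maximum frequency as the grouping key is a heuristic assumed here, not something prescribed by Perfetto, and `group_cpus_by_max_freq` is a hypothetical helper name.

```python
from collections import defaultdict


def group_cpus_by_max_freq(available_freqs):
    """Group CPU indices by their highest supported frequency (kHz).

    `available_freqs` maps a CPU index to its list of supported frequencies,
    e.g. the parsed contents of scaling_available_frequencies.
    Returns a list of (max_freq_khz, [cpu, ...]) sorted from slowest to fastest.
    """
    clusters = defaultdict(list)
    for cpu, freqs in available_freqs.items():
        if freqs:  # Skip CPUs that reported no frequencies.
            clusters[max(freqs)].append(cpu)
    return [(freq, sorted(cpus)) for freq, cpus in sorted(clusters.items())]
```

With this heuristic, the first group in the result would typically be the little cluster and the last one the big cluster. SoCs whose clusters share the same maximum frequency would need a different key, such as the full frequency list.
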
#### Idle states

When no threads are eligible to be executed (e.g. they are all in sleep states)
the kernel sets the CPU into an idle state, turning off some of the circuitry
to reduce idle power usage. Most modern CPUs have more than one idle state:
deeper idle states use less power but also require more time to resume from.

Note that idle transitions are relatively fast and cheap: a CPU can enter and
leave idle states hundreds of times per second.
Idle-ness must not be confused with full device suspend, which is a stronger and
more invasive power saving state (see below). CPUs can be idle even when the
screen is on and the device looks operational.

The details about how many idle states are available and their semantics are
highly CPU/SoC specific. At the trace level, the idle state 0 means not-idle,
values greater than 0 represent increasingly deeper power saving states
(e.g., single core idle -> full package idle).

Note that most Android devices won't enter idle states as long as the USB
cable is plugged in (the USB driver stack holds wakelocks). It is not unusual
to see only one idle state in traces collected through USB.

On most SoCs the frequency has little value when the CPU is idle, as the CPU is
typically clock-gated in idle states. In those cases the frequency in the trace
happens to be the last frequency the CPU was running at before becoming idle.

Known issues:

* The `power/cpu_frequency` event is emitted only when the frequency changes.
  This might not happen for long periods of time. In short traces it's possible
  that some CPUs don't report any event, showing a gap on the left-hand side of
  the trace, or no data at all. Perfetto doesn't currently record the initial
  CPU frequency when the trace is started.

* Currently the UI doesn't render the cpufreq track if idle states (see above)
  are not captured. This is a UI-only bug: the data is recorded and queryable
  through trace processor even if it is not displayed.

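The gap described in the first issue follows from the event-driven nature of the data: the frequency at a given time is the value of the most recent event at or before that time, and is unknown before the first event. A minimal sketch of that lookup, with a hypothetical `freq_at` helper:

```python
import bisect


def freq_at(events, ts):
    """Return the frequency in effect at time `ts`, or None if unknown.

    `events` is a list of (timestamp, frequency) tuples for a single CPU,
    sorted by timestamp. Before the first event the frequency is unknown,
    which is what shows up as a gap at the start of the trace.
    """
    timestamps = [t for t, _ in events]
    # Index of the last event whose timestamp is <= ts.
    idx = bisect.bisect_right(timestamps, ts) - 1
    if idx < 0:
        return None
    return events[idx][1]
```

Adding polled samples to `events` (sorted by timestamp) shortens the initial period for which this lookup returns `None`.
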
### UI

In the UI, CPU frequency and idle-ness are shown on the same track. The height
of the track represents the frequency, the coloring represents the idle
state (colored: not-idle, gray: idle). Hovering or clicking a point in the
track will reveal both the frequency and the idle state:



### SQL

At the SQL level, both frequency and idle states are modeled as counters.
Note that the cpuidle value 0xffffffff (4294967295) means _back to not-idle_.

```sql
select ts, t.name, cpu, value from counter as c
left join cpu_counter_track as t on c.track_id = t.id
where t.name = 'cpuidle' or t.name = 'cpufreq'
```

ts | name | cpu | value
---|------|------|------
261187013242350 | cpuidle | 1 | 0
261187013246204 | cpuidle | 1 | 4294967295
261187013317818 | cpuidle | 1 | 0
261187013333027 | cpuidle | 0 | 0
261187013338287 | cpufreq | 0 | 1036800
261187013357922 | cpufreq | 1 | 1036800
261187013410735 | cpuidle | 1 | 4294967295
261187013451152 | cpuidle | 0 | 4294967295
261187013665683 | cpuidle | 1 | 0
261187013845058 | cpufreq | 0 | 1900800

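As a sketch of how rows like these might be post-processed, the following computes the total time each CPU spent idle. It assumes, following the note above, that 4294967295 is the only cpuidle value meaning not-idle and that every other value is some idle state. The function name and the `trace_end_ts` parameter are hypothetical.

```python
NOT_IDLE = 0xFFFFFFFF  # cpuidle counter value meaning "back to not-idle"


def idle_time_per_cpu(rows, trace_end_ts):
    """Sum the time each CPU spent in any idle state.

    `rows` is an iterable of (ts, cpu, value) cpuidle samples sorted by ts.
    `trace_end_ts` closes the last open interval of each CPU.
    Returns {cpu: idle_time}, in the same time units as `ts`.
    """
    last = {}  # cpu -> (ts, value) of the most recent sample
    idle = {}  # cpu -> accumulated idle time
    for ts, cpu, value in rows:
        if cpu in last:
            prev_ts, prev_value = last[cpu]
            if prev_value != NOT_IDLE:
                idle[cpu] = idle.get(cpu, 0) + (ts - prev_ts)
        last[cpu] = (ts, value)
    # Close the interval that is still open at the end of the trace.
    for cpu, (prev_ts, prev_value) in last.items():
        if prev_value != NOT_IDLE:
            idle[cpu] = idle.get(cpu, 0) + (trace_end_ts - prev_ts)
        idle.setdefault(cpu, 0)
    return idle
```

The state of a CPU before its first cpuidle sample is unknown, so that period is not counted either way.
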
The list of known CPU frequencies can be queried using the
[`cpu_freq` table](/docs/analysis/sql-tables.autogen#cpu_freq).

### TraceConfig

```protobuf
# Event-driven recording of frequency and idle state changes.
data_sources: {
  config {
    name: "linux.ftrace"
    ftrace_config {
      ftrace_events: "power/cpu_frequency"
      ftrace_events: "power/cpu_idle"
      ftrace_events: "power/suspend_resume"
    }
  }
}

# Polling the current CPU frequency.
data_sources: {
  config {
    name: "linux.sys_stats"
    sys_stats_config {
      cpufreq_period_ms: 500
    }
  }
}

# Reporting the list of available frequencies for each CPU.
data_sources {
  config {
    name: "linux.system_info"
  }
}
```

### Full-device suspend

Full device suspend happens when a laptop is put in "sleep" mode (e.g. by
closing the lid) or when a smartphone display is turned off for enough time.

When the device is suspended, most of the hardware units are turned off,
entering the deepest power-saving state possible (other than full shutdown).

Note that most Android devices don't suspend immediately after dimming the
display but tend to do so if the display is forced off through the power button.
The details are highly device/manufacturer/kernel specific.

Known issues:

* The UI doesn't clearly display the suspended state. When an Android device
  suspends, it looks as if all CPUs are running the kmigration thread and
  one CPU is running the power HAL.