# Heap profiler

NOTE: **heapprofd requires Android 10 or higher**

Heapprofd is a tool that tracks heap allocations and deallocations of an Android
process within a given time period. The resulting profile can be used to
attribute memory usage to particular call-stacks, supporting a mix of both
native and Java code. The tool can be used by Android platform and app
developers to investigate memory issues.

By default, the tool records native allocations and deallocations made with
malloc/free (or new/delete). It can be configured to record Java heap memory
allocations instead: see [Java heap sampling](#java-heap-sampling) below.

On debug Android builds, you can profile all apps and most system services.
On "user" builds, you can only use it on apps with the debuggable or
profileable manifest flag.

## Quickstart

See the [Memory Guide](/docs/case-studies/memory.md#heapprofd) for getting
started with heapprofd.

## UI

Dumps from heapprofd are shown as flamegraphs in the UI after clicking on the
diamond. Each diamond corresponds to a snapshot of the allocations and
callstacks collected at that point in time.



![heapprofd ui](/docs/images/profile-diamond.png)

## SQL

Information about callstacks is written to the following tables:

* [`stack_profile_mapping`](/docs/analysis/sql-tables.autogen#stack_profile_mapping)
* [`stack_profile_frame`](/docs/analysis/sql-tables.autogen#stack_profile_frame)
* [`stack_profile_callsite`](/docs/analysis/sql-tables.autogen#stack_profile_callsite)

The allocations themselves are written to
[`heap_profile_allocation`](/docs/analysis/sql-tables.autogen#heap_profile_allocation).

Offline symbolization data is stored in
[`stack_profile_symbol`](/docs/analysis/sql-tables.autogen#stack_profile_symbol).

See [Example Queries](#heapprofd-example-queries) for example SQL queries.

## Recording

Heapprofd can be configured and started in three ways.

#### Manual configuration

This requires manually setting the
[HeapprofdConfig](/docs/reference/trace-config-proto.autogen#HeapprofdConfig)
section of the trace config. The only benefit of doing so is that heap
profiling can be enabled alongside any other tracing data sources.

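As a rough illustration, a trace config that enables heapprofd next to a second data source could be generated as in the sketch below. The helper `render_trace_config` is hypothetical and not part of Perfetto. The field and data source names (`android.heapprofd`, `heapprofd_config`, `process_cmdline`, `sampling_interval_bytes`, `linux.process_stats`, `buffers`, `duration_ms`) are assumptions that should be checked against the trace config reference linked above.

```python
def render_trace_config(process_cmdline, sampling_interval_bytes=4096,
                        duration_ms=10000):
    """Render a text-format trace config (hypothetical helper, not a Perfetto API)."""
    # Field names below are assumptions; verify them against the
    # HeapprofdConfig reference before using the output.
    return f"""\
buffers {{
  size_kb: 65536
}}
data_sources {{
  config {{
    name: "android.heapprofd"
    heapprofd_config {{
      sampling_interval_bytes: {sampling_interval_bytes}
      process_cmdline: "{process_cmdline}"
    }}
  }}
}}
data_sources {{
  config {{
    name: "linux.process_stats"
  }}
}}
duration_ms: {duration_ms}
"""


config_text = render_trace_config("com.example.myapp")
print(config_text)
```

The second `data_sources` block is only there to show that the heap profile can share a session with other data sources, which is the reason for choosing manual configuration.
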
#### Using the tools/heap_profile script (recommended)

You can use the `tools/heap_profile` script. If you are having trouble,
make sure you are using the
[latest version](
https://raw.githubusercontent.com/google/perfetto/main/tools/heap_profile).

You can target processes either by name (`-n com.example.myapp`) or by PID
(`-p 1234`). In the first case, the heap profile is initiated both on
already-running processes that match the package name and on new processes
launched after the profiling session is started.
For the full arguments list see the
[heap_profile cmdline reference page](/docs/reference/heap_profile-cli).

You can use the [Perfetto UI](https://ui.perfetto.dev) to visualize heap dumps.
Upload the `raw-trace` file in your output directory. You will see all heap
dumps as diamonds on the timeline; click any of them to get a flamegraph.

Alternatively, [Speedscope](https://speedscope.app) can be used to visualize
the gzipped protos, but it will only show the "Unreleased malloc size" view.

#### Using the Recording page of Perfetto UI

You can also use the [Perfetto UI](https://ui.perfetto.dev/#!/record/memory)
to record heapprofd profiles. Tick "Heap profiling" in the trace configuration,
enter the processes you want to target, click "Add Device" to pair your phone,
and record profiles straight from your browser. This is also possible on
Windows.

## Viewing the data

![Profile Diamond](/docs/images/profile-diamond.png)

The resulting profile proto contains four views on the data for each diamond.

* **Unreleased malloc size**: how many bytes were allocated but not freed at
  this callstack, from the moment the recording was started until the timestamp
  of the diamond.
* **Total malloc size**: how many bytes were allocated (including ones freed at
  the moment of the dump) at this callstack, from the moment the recording was
  started until the timestamp of the diamond.
* **Unreleased malloc count**: how many allocations without matching frees were
  done at this callstack, from the moment the recording was started until the
  timestamp of the diamond.
* **Total malloc count**: how many allocations (including ones with matching
  frees) were done at this callstack, from the moment the recording was started
  until the timestamp of the diamond.

_(Googlers: You can also open the gzipped protos using http://pprof/)_

TIP: you might want to put `libart.so` as a "Hide regex" when profiling apps.

TIP: Click Left Heavy on the top left for a good visualization.

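To make the difference between the four views concrete, the sketch below computes them over a toy stand-in for the `heap_profile_allocation` table using Python's built-in `sqlite3` module. The column names, the convention that frees appear as rows with negative `size` and `count`, and all of the data are assumptions for illustration; check the SQL tables reference before adapting the query to a real trace.

```python
import sqlite3

# Toy stand-in table; the real schema has more columns and may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heap_profile_allocation "
    "(ts INTEGER, callsite_id INTEGER, count INTEGER, size INTEGER)"
)
rows = [
    (100, 1, 3, 3000),    # callsite 1: three allocations, 3000 bytes in total
    (200, 1, -2, -2000),  # callsite 1: two of them freed (assumed negative rows)
    (200, 2, 1, 512),     # callsite 2: one allocation, never freed
]
conn.executemany("INSERT INTO heap_profile_allocation VALUES (?, ?, ?, ?)", rows)

# Everything up to the diamond's timestamp is summed, so the views are cumulative.
query = """
SELECT
  callsite_id,
  SUM(size) AS unreleased_size,
  SUM(CASE WHEN size > 0 THEN size ELSE 0 END) AS total_size,
  SUM(count) AS unreleased_count,
  SUM(CASE WHEN count > 0 THEN count ELSE 0 END) AS total_count
FROM heap_profile_allocation
WHERE ts <= ?
GROUP BY callsite_id
ORDER BY callsite_id
"""
views = conn.execute(query, (200,)).fetchall()
for row in views:
    print(row)
```

In this toy data, callsite 1 has 1000 unreleased bytes out of 3000 allocated, which is the kind of gap between the "Unreleased" and "Total" views that points at short-lived allocations.
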
## Continuous dumps

By default, the heap profiler captures all the allocations from the beginning of
the recording and stores a single snapshot, shown as a single diamond in the UI,
which summarizes all allocations/frees.

It is possible to configure the heap profiler to store snapshots periodically
(continuous dumps), not just at the end of the trace, for example every 5000 ms:

* By setting "Continuous dumps interval" in the UI to 5000.
* By adding
  ```
  continuous_dump_config {
    dump_interval_ms: 5000
  }
  ```
  in the
  [HeapprofdConfig](/docs/reference/trace-config-proto.autogen#HeapprofdConfig).
* By adding `-c 5000` to the invocation of
  [`tools/heap_profile`](/docs/reference/heap_profile-cli).

![Continuous dump flamegraph](/docs/images/heap_prof_continuous.png)

The resulting visualization shows multiple diamonds. Clicking on each diamond
shows a summary of the allocations/frees from the beginning of the trace until
that point (i.e. the summary is cumulative).

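Because each snapshot is cumulative, the growth between two diamonds is the difference between their summaries. The sketch below shows this with hypothetical snapshot data, mapping a callstack to its unreleased bytes; the callstack names, sizes, and the helper `delta_between_dumps` are made up for illustration.

```python
def delta_between_dumps(earlier, later):
    """Return per-callstack growth between two cumulative snapshots."""
    callstacks = set(earlier) | set(later)
    return {
        stack: later.get(stack, 0) - earlier.get(stack, 0)
        for stack in sorted(callstacks)
        if later.get(stack, 0) != earlier.get(stack, 0)
    }


# Hypothetical "Unreleased malloc size" values at two consecutive diamonds.
dump_at_5s = {"main;load_assets": 4096, "main;parse_config": 1024}
dump_at_10s = {
    "main;load_assets": 12288,
    "main;parse_config": 1024,
    "main;render": 2048,
}

growth = delta_between_dumps(dump_at_5s, dump_at_10s)
print(growth)
```

Callstacks whose value did not change between the two dumps are omitted, so the result only lists what grew or shrank during that interval.
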
## Sampling interval

Heapprofd samples heap allocations by hooking calls to malloc/free and C++'s
operator new/delete. Given a sampling interval of n bytes, one allocation is
sampled, on average, every n bytes allocated. This reduces the performance
impact on the target process. The default sampling interval is 4096 bytes.

The easiest way to reason about this is to imagine the memory allocations as a
stream of one-byte allocations. From this stream, every byte has a 1/n
probability of being selected as a sample, and the corresponding callstack
gets attributed the complete n bytes. For more accuracy, allocations larger than
the sampling interval bypass the sampling logic and are recorded with their true
size.
See the [heapprofd Sampling](/docs/design-docs/heapprofd-sampling) document for
details.

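The byte-stream model described above can be simulated in a few lines. This is a simplified sketch of the idea, not heapprofd's actual implementation: it draws exponentially distributed gaps between sampled bytes, which approximates selecting each byte with probability 1/n, and it records allocations larger than the interval at their true size. The function name and the workload are hypothetical.

```python
import random


def simulate_sampling(allocation_sizes, interval=4096, seed=0):
    """Estimate total allocated bytes from a sampled stream (simplified model)."""
    rng = random.Random(seed)
    # Bytes left until the next sampled byte. Exponential gaps approximate
    # "every byte is selected with probability 1/interval".
    bytes_until_sample = rng.expovariate(1.0 / interval)
    estimated = 0
    for size in allocation_sizes:
        if size > interval:
            # Large allocations bypass sampling and keep their true size.
            estimated += size
            continue
        bytes_until_sample -= size
        while bytes_until_sample <= 0:
            # A sampled byte fell inside this allocation: attribute a full
            # interval to it, then draw the distance to the next sample.
            estimated += interval
            bytes_until_sample += rng.expovariate(1.0 / interval)
    return estimated


# Hypothetical workload: many small allocations plus one large one.
sizes = [64] * 100_000 + [1_000_000]
true_total = sum(sizes)
estimate = simulate_sampling(sizes)
print(true_total, estimate)
```

Most of the 64-byte allocations are never sampled individually, yet the estimate stays close to the true total because each sample stands in for a full interval of bytes. A larger interval lowers overhead at the cost of a noisier estimate.
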
## Startup profiling

When specifying a target process name (as opposed to a PID), new processes
matching that name are profiled from their startup. The resulting profile will
contain all allocations done between the start of the process and the end
of the profiling session.

On Android, Java apps are usually not exec()-ed from scratch, but fork()-ed from
the [zygote], which then specializes into the desired app. If the app's name
matches a name specified in the profiling session, profiling will be enabled as
part of the zygote specialization. The resulting profile contains all
allocations done between that point in zygote specialization and the end of the
profiling session. Some allocations done early in the specialization process are
not accounted for.

At the trace proto level, the resulting [ProfilePacket] will have the
`from_startup` field set to true in the corresponding `ProcessHeapSamples`
message. This is not surfaced in the converted pprof compatible proto.

[ProfilePacket]: /docs/reference/trace-packet-proto.autogen#ProfilePacket
[zygote]: https://developer.android.com/topic/performance/memory-overview#SharingRAM

## Runtime profiling

When a profiling session is started, all matching processes (by name or PID)
are enumerated and are signalled to request profiling. Profiling isn't actually
enabled until a few hundred milliseconds after the next allocation that is
done by the application. If the application is idle when profiling is
requested, and then does a burst of allocations, these may be missed.

The resulting profile will contain all allocations done between the point when
profiling is enabled and the end of the profiling session.

The resulting [ProfilePacket] will have `from_startup` set to false in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof compatible proto.

## Concurrent profiling sessions

If multiple sessions name the same target process (either by name or PID),
only the first relevant session will profile the process. The other sessions
will report that the process had already been profiled when converting to
the pprof compatible proto.

If you see this message but do not expect any other sessions, run

```shell
adb shell killall perfetto
```

to stop any concurrent sessions that may be running.

The resulting [ProfilePacket] will have `rejected_concurrent` set to true in
an otherwise empty corresponding `ProcessHeapSamples` message. This does not
get surfaced in the converted pprof compatible proto.

## {#heapprofd-targets} Target processes

Depending on the build of Android that heapprofd is run on, some processes
are not eligible to be profiled.

On _user_ (i.e. production, non-rootable) builds, only Java applications with
either the profileable or the debuggable manifest flag set can be profiled.
Profiling requests for non-profileable/debuggable processes will result in an
empty profile.

On userdebug builds, all processes except for a small set of critical
services can be profiled (to find the set of disallowed targets, look for
`never_profile_heap` in [heapprofd.te](
https://cs.android.com/android/platform/superproject/+/main:system/sepolicy/private/heapprofd.te?q=never_profile_heap)).
This restriction can be lifted by disabling SELinux by running
`adb shell su root setenforce 0` or by passing `--disable-selinux` to the
`heap_profile` script.

<center>

|                         | userdebug setenforce 0 | userdebug | user |
|-------------------------|:----------------------:|:---------:|:----:|
| critical native service |           Y            |     N     |  N   |
| native service          |           Y            |     Y     |  N   |
| app                     |           Y            |     Y     |  N   |
| profileable app         |           Y            |     Y     |  Y   |
| debuggable app          |           Y            |     Y     |  Y   |

</center>

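For scripted checks, for example in a lab test harness, the eligibility table can be encoded as a small lookup. The sketch below transcribes the table as-is; the names `BUILDS`, `ELIGIBILITY`, and `can_profile` are hypothetical and not part of any Perfetto tool.

```python
# Column order mirrors the table above.
BUILDS = ("userdebug setenforce 0", "userdebug", "user")

# True means the target can be profiled on that build.
ELIGIBILITY = {
    "critical native service": (True, False, False),
    "native service": (True, True, False),
    "app": (True, True, False),
    "profileable app": (True, True, True),
    "debuggable app": (True, True, True),
}


def can_profile(target_kind, build):
    """Look up whether a kind of target can be profiled on a given build."""
    return ELIGIBILITY[target_kind][BUILDS.index(build)]


print(can_profile("app", "user"))
print(can_profile("profileable app", "user"))
```

A check like this is a quick first step when a profile comes back empty, before looking into other causes.
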
To mark an app as profileable, put `<profileable android:shell="true"/>` into
the `<application>` section of the app manifest.

```xml
<manifest ...>
    <application>
        <profileable android:shell="true"/>
        ...
    </application>
</manifest>
```

## {#java-heap-sampling} Java heap sampling

NOTE: **Java heap sampling is available on Android 12 or higher**

NOTE: **Java heap sampling is not to be confused with [Java heap
dumps](/docs/data-sources/java-heap-profiler.md)**

Heapprofd can be configured to track Java allocations instead of native ones.

* By adding `heaps: "com.android.art"` in
  [HeapprofdConfig](/docs/reference/trace-config-proto.autogen#HeapprofdConfig).
* By adding `--heaps com.android.art` to the invocation of
  [`tools/heap_profile`](/docs/reference/heap_profile-cli).

Unlike Java heap dumps (which show the retention graph of a snapshot of the live
objects) but like native heap profiles, Java heap samples show callstacks of
allocations over the duration of the entire profile.

Java heap samples only show callstacks of when objects are created, not when
they're deleted or garbage collected.

![javaheapsamples](/docs/images/java-heap-samples.png)

The resulting profile proto contains two views on the data:

* **Total allocation size**: how many bytes were allocated at this callstack
  over the course of the profile up to this point. The bytes might have been
  freed or not; the tool does not keep track of that.
* **Total allocation count**: how many objects were allocated at this callstack
  over the course of the profile up to this point. The objects might have been
  freed or not; the tool does not keep track of that.

Java heap samples are useful for understanding memory churn: they show the call
stacks of the code that large allocations are attributed to, as well as the
allocation type from the ART runtime.

## DEDUPED frames

If the name of a Java method includes `[DEDUPED]`, this means that multiple
methods share the same code. ART only stores the name of a single one in its
metadata, which is displayed here. This is not necessarily the one that was
called.

## Triggering heap snapshots on demand

Heap snapshots are recorded into the trace either at regular time intervals, if
using the `continuous_dump_config` field, or at the end of the session.

You can also trigger a snapshot of all currently profiled processes by running
`adb shell killall -USR1 heapprofd`. This can be useful in lab tests for
recording the current memory usage of the target in a specific state.

This dump will show up in addition to the dump at the end of the profile that is
always produced. You can create several of these dumps, and they will be
enumerated in the output directory.

## Symbolization

### Set up llvm-symbolizer

You only need to do this once.

To use symbolization, your system must have llvm-symbolizer installed and
accessible from `$PATH` as `llvm-symbolizer`. On Debian, you can install it
using `sudo apt install llvm`.

### Symbolize your profile

If the profiled binary or libraries do not have symbol names, you can
symbolize profiles offline. Even if they do, you might want to symbolize in
order to get inlined function and line number information. All tools
(traceconv, trace_processor_shell, the heap_profile script) support specifying
the `PERFETTO_BINARY_PATH` as an environment variable.

```
PERFETTO_BINARY_PATH=somedir tools/heap_profile --name ${NAME}
```

You can persist symbols for a trace by running
`PERFETTO_BINARY_PATH=somedir tools/traceconv symbolize raw-trace > symbols`.
You can then concatenate the symbols to the trace (
`cat raw-trace symbols > symbolized-trace`) and the symbols will be part of
`symbolized-trace`. The `tools/heap_profile` script will also generate this
file in your output directory, if `PERFETTO_BINARY_PATH` is used.

The symbol file is the first with matching Build ID in the following order:

1. absolute path of library file relative to binary path.
2. absolute path of library file relative to binary path, but with base.apk!
   removed from filename.
3. basename of library file relative to binary path.
4. basename of library file relative to binary path, but with base.apk!
   removed from filename.
5. in the subdirectory .build-id: the first two hex digits of the build-id
   as subdirectory, then the rest of the hex digits, with ".debug" appended.
   See
   https://fedoraproject.org/wiki/RolandMcGrath/BuildID#Find_files_by_build_ID

For example, "/system/lib/base.apk!foo.so" with build id abcd1234,
is looked for at:

1. $PERFETTO_BINARY_PATH/system/lib/base.apk!foo.so
2. $PERFETTO_BINARY_PATH/system/lib/foo.so
3. $PERFETTO_BINARY_PATH/base.apk!foo.so
4. $PERFETTO_BINARY_PATH/foo.so
5. $PERFETTO_BINARY_PATH/.build-id/ab/cd1234.debug

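When debugging why a symbol file is not picked up, it can help to print the candidate locations for a given library. The sketch below reproduces the five-step lookup order listed above. It only builds the paths and does not check the Build ID of each candidate as the real symbolizer does. The function name and the `/symbols` directory are hypothetical.

```python
import posixpath


def candidate_symbol_paths(binary_path, library_path, build_id):
    """List lookup locations in the documented order (illustrative only)."""

    def strip_apk(name):
        # Steps 2 and 4 drop the "base.apk!" marker from the filename.
        return name.replace("base.apk!", "")

    relative = library_path.lstrip("/")
    basename = posixpath.basename(library_path)
    return [
        posixpath.join(binary_path, relative),
        posixpath.join(binary_path, strip_apk(relative)),
        posixpath.join(binary_path, basename),
        posixpath.join(binary_path, strip_apk(basename)),
        posixpath.join(
            binary_path, ".build-id", build_id[:2], build_id[2:] + ".debug"
        ),
    ]


# "/symbols" stands in for the value of PERFETTO_BINARY_PATH.
paths = candidate_symbol_paths(
    "/symbols", "/system/lib/base.apk!foo.so", "abcd1234"
)
for path in paths:
    print(path)
```

Checking which of these paths exist on disk narrows down whether a lookup failure is caused by the file location or by a Build ID mismatch.
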
Alternatively, you can set the `PERFETTO_SYMBOLIZER_MODE` environment variable
to `index`, and the symbolizer will recursively search the given directory for
an ELF file with the given build id. This way, you will not have to worry
about correct filenames.

## Deobfuscation

If your profile contains obfuscated Java methods (like `fsd.a`), you can
provide a deobfuscation map to turn them back into human-readable names.
To do so, use the `PERFETTO_PROGUARD_MAP` environment variable, using the
format `packagename=map_filename[:packagename=map_filename...]`, e.g.
`PERFETTO_PROGUARD_MAP=com.example.pkg1=foo.txt:com.example.pkg2=bar.txt`.
All tools
(traceconv, trace_processor_shell, the heap_profile script) support specifying
the `PERFETTO_PROGUARD_MAP` as an environment variable.

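A malformed `PERFETTO_PROGUARD_MAP` value is easy to produce when several packages are involved, so it can be worth validating the string before launching a tool. The sketch below parses the documented `packagename=map_filename[:packagename=map_filename...]` format. It is an illustrative check, not the parser used by the Perfetto tools, and the function name is hypothetical.

```python
def parse_proguard_map_env(value):
    """Split a PERFETTO_PROGUARD_MAP-style string into {package: map_filename}."""
    mapping = {}
    for entry in value.split(":"):
        if not entry:
            # Ignore empty segments such as a trailing ":".
            continue
        package, separator, filename = entry.partition("=")
        if not separator or not package or not filename:
            raise ValueError(
                f"expected packagename=map_filename, got {entry!r}"
            )
        mapping[package] = filename
    return mapping


maps = parse_proguard_map_env(
    "com.example.pkg1=foo.txt:com.example.pkg2=bar.txt"
)
print(maps)
```

Because `:` separates entries, this sketch assumes that map filenames do not contain a colon themselves.
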
You can get a deobfuscation map for your trace using
`tools/traceconv deobfuscate`. Then concatenate the resulting file to your
trace to get a deobfuscated version of it (the input trace should be in the
perfetto format, otherwise concatenation will not produce a reasonable output).

```
PERFETTO_PROGUARD_MAP=com.example.pkg=proguard_map.txt tools/traceconv deobfuscate ${TRACE} > deobfuscation_map
cat ${TRACE} deobfuscation_map > deobfuscated_trace
```

`deobfuscated_trace` can be viewed in the
[Perfetto UI](https://ui.perfetto.dev).

## Troubleshooting

### Buffer overrun

If the rate of allocations is too high for heapprofd to keep up, the profiling
session will end early due to a buffer overrun. If the buffer overrun is
caused by a transient spike in allocations, increasing the shared memory buffer
size (passing `--shmem-size` to `tools/heap_profile`) can resolve the issue.
Otherwise the sampling interval can be increased (at the expense of lower
accuracy in the resulting profile) by passing `--interval=16000` or higher.

### Profile is empty

Check whether your target process is eligible to be profiled by consulting
[Target processes](#heapprofd-targets) above.

Also check the [Known Issues](#known-issues).

### Implausible callstacks

If you see a callstack that seems impossible from looking at the code, make
sure no [DEDUPED frames](#deduped-frames) are involved.

Also, if your code is linked using _Identical Code Folding_
(ICF), i.e. passing `-Wl,--icf=...` to the linker, most trivial functions, often
constructors and destructors, can be aliased to binary-equivalent operators
of completely unrelated classes.

### Symbolization: Could not find library

When symbolizing a profile, you might come across messages like this:

```bash
Could not find /data/app/invalid.app-wFgo3GRaod02wSvPZQ==/lib/arm64/somelib.so
(Build ID: 44b7138abd5957b8d0a56ce86216d478).
```

Check whether your library (in this example somelib.so) exists in
`PERFETTO_BINARY_PATH`. Then compare the Build ID to the one in your
symbol file, which you can get by running
`readelf -n /path/in/binary/path/somelib.so`. If it does not match, the
symbol file is a different version from the one on the device, and cannot
be used for symbolization.
If it does, try moving somelib.so to the root of `PERFETTO_BINARY_PATH` and
try again.

### Only one frame shown

If you only see a single frame for functions in a specific library, make sure
that the library has unwind information. We need one of

* `.gnu_debugdata`
* `.eh_frame` (+ preferably `.eh_frame_hdr`)
* `.debug_frame`.

Frame-pointer unwinding is *not supported*.

To check if an ELF file has any of those, run

```console
$ readelf -S file.so | grep "gnu_debugdata\|eh_frame\|debug_frame"
  [12] .eh_frame_hdr PROGBITS 000000000000c2b0 0000c2b0
  [13] .eh_frame PROGBITS 0000000000011000 00011000
  [24] .gnu_debugdata PROGBITS 0000000000000000 000f7292
```

If this does not show one or more of the sections, change your build system
to not strip them.

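If many libraries need checking, the same test can be scripted. The sketch below scans text in the format printed by `readelf -S` for the three section names listed above. The sample input is the output shown in this section, and the function name is hypothetical.

```python
import re

# Sections that carry unwind information, as listed above.
UNWIND_SECTIONS = (".gnu_debugdata", ".eh_frame", ".debug_frame")


def unwind_sections_present(readelf_output):
    """Return the unwind-related section names found in `readelf -S` output."""
    # Section names follow the "[NN]" index column on each line.
    names = set(re.findall(r"\]\s+(\.[\w.]+)", readelf_output))
    return [name for name in UNWIND_SECTIONS if name in names]


sample = """\
  [12] .eh_frame_hdr PROGBITS 000000000000c2b0 0000c2b0
  [13] .eh_frame PROGBITS 0000000000011000 00011000
  [24] .gnu_debugdata PROGBITS 0000000000000000 000f7292
"""
found = unwind_sections_present(sample)
print(found)
```

An empty result for a library means that none of the three sections survived stripping, which matches the single-frame symptom described in this section.
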
## (non-Android) Linux support

NOTE: Do not use this for production purposes.

You can use a standalone library to profile memory allocations on Linux.
First [build Perfetto](/docs/contributing/build-instructions.md). You only need
to do this once.

```
tools/setup_all_configs.py
ninja -C out/linux_clang_release
```

Then, run traced

```
out/linux_clang_release/traced
```

Start the profile (e.g. targeting trace_processor_shell)

```
tools/heap_profile -n trace_processor_shell --print-config | \
out/linux_clang_release/perfetto \
  -c - --txt \
  -o ~/heapprofd-trace
```

Finally, run your target (e.g. trace_processor_shell) with LD_PRELOAD

```
LD_PRELOAD=out/linux_clang_release/libheapprofd_glibc_preload.so out/linux_clang_release/trace_processor_shell <trace>
```

Then, Ctrl-C the Perfetto invocation and upload ~/heapprofd-trace to the
[Perfetto UI](https://ui.perfetto.dev).

NOTE: by default, heapprofd lazily initializes to avoid blocking your program's
main thread. However, if your program makes memory allocations on startup,
these can be missed. To prevent this, set the environment variable
`PERFETTO_HEAPPROFD_BLOCKING_INIT=1`; on the first malloc, your program will
then block until heapprofd is fully initialized, which ensures that every
allocation is correctly tracked.

Primiano Tucci | a662485 | 2020-05-21 19:12:50 +0100 | [diff] [blame] | 504 | ## Known Issues |
| 505 | |
Daniele Di Proietto | 5982e3b | 2022-11-16 10:18:05 +0000 | [diff] [blame] | 506 | ### {#known-issues-android13} Android 13 |
| 507 | |
| 508 | * Unwinding java frames might not work properly, depending on the ART module |
| 509 | version in use. The UI reports a single "unknown" frame at the top of the |
| 510 | stack in this case. The problem is fixed in Android 13 QPR1. |
| 511 | |
| 512 | ### {#known-issues-android12} Android 12 |
| 513 | |
| 514 | * Unwinding java frames might not work properly, depending on the ART module |
| 515 | version in use. The UI reports a single "unknown" frame at the top of the |
| 516 | stack in this case. |
| 517 | |
Florian Mayer | c4de391 | 2020-11-23 14:11:43 +0000 | [diff] [blame] | 518 | ### {#known-issues-android11} Android 11 |
Florian Mayer | cc61e5a | 2020-08-27 16:10:22 +0100 | [diff] [blame] | 519 | |
| 520 | * 32-bit programs cannot be targeted on 64-bit devices. |
Florian Mayer | 12494ee | 2020-09-23 16:25:58 +0100 | [diff] [blame] | 521 | * Setting `sampling_interval_bytes` to 0 crashes the target process. |
| 522 | This is an invalid config that should be rejected instead. |
Florian Mayer | fb36465 | 2020-11-05 10:38:52 +0000 | [diff] [blame] | 523 | * For startup profiles, some frame names might be missing. This will be |
| 524 | resolved in Android 12. |
Florian Mayer | c21ce02 | 2021-02-01 17:28:56 +0000 | [diff] [blame] | 525 | * `Failed to send control socket byte.` is displayed in logcat at the end of |
| 526 | every profile. This is benign. |
Florian Mayer | c035753 | 2021-02-16 16:22:58 +0000 | [diff] [blame] | 527 | * The object count may be incorrect in `dump_at_max` profiles. |
Florian Mayer | 0ce14ff | 2021-05-17 13:34:11 +0100 | [diff] [blame] | 528 | * Choosing a low shared memory buffer size and `block_client` mode might |
| 529 | lock up the target process. |
Florian Mayer | cc61e5a | 2020-08-27 16:10:22 +0100 | [diff] [blame] | 530 | |
### {#known-issues-android10} Android 10

* Function names in libraries with load bias might be incorrect. Use
  [offline symbolization](#symbolization) to resolve this issue.
* For startup profiles, some frame names might be missing. This will be
  resolved in Android 12.
* 32-bit programs cannot be targeted on 64-bit devices.
* x86 / x86_64 platforms are not supported. This includes the Android
  _Cuttlefish_ emulator.
* On ARM32, the bottom-most frame is always `ERROR 2`. This is harmless and
  the callstacks are still complete.
* If heapprofd is run standalone (by running `heapprofd` in a root shell, rather
  than through init), `/dev/socket/heapprofd` gets assigned an incorrect SELinux
  domain. You will not be able to profile any processes unless you disable
  SELinux enforcement.
  Run `restorecon /dev/socket/heapprofd` in a root shell to resolve.
* Using `vfork(2)` or `clone(2)` with `CLONE_VM` and allocating / freeing
  memory in the child process will prematurely end the profile.
  `java.lang.Runtime.exec` does this, so calling it will prematurely end
  the profile. Note that this is in violation of the POSIX standard.
* Setting `sampling_interval_bytes` to 0 crashes the target process.
  This is an invalid config that should be rejected instead.
* `Failed to send control socket byte.` is displayed in logcat at the end of
  every profile. This is benign.
* The object count may be incorrect in `dump_at_max` profiles.
* Choosing a low shared memory buffer size and `block_client` mode might
  lock up the target process.

## Heapprofd vs malloc_info() vs RSS

When using heapprofd and interpreting results, it is important to know the
precise meaning of the different memory metrics that can be obtained from the
operating system.

**heapprofd** gives you the number of bytes the target program
requested from the default C/C++ allocator. If you are profiling a Java app from
startup, allocations that happen early in the application's initialization will
not be visible to heapprofd. Native services that do not fork from the Zygote
are not affected by this.

**malloc\_info** is a libc function that gives you information about the
allocator. This can be triggered on userdebug builds by using
`am dumpheap -m <PID> /data/local/tmp/heap.txt`. The reported total is generally
larger than the memory seen by heapprofd because, depending on the allocator,
not all freed memory is released immediately. In particular, jemalloc retains
some freed memory in thread caches.

**Heap RSS** is the amount of memory requested from the operating system by the
allocator. This is larger than the previous two numbers because memory can only
be obtained in page size chunks, and fragmentation causes some of that memory to
be wasted. This can be obtained by running `adb shell dumpsys meminfo <PID>` and
looking at the "Private Dirty" column.
RSS can also end up being smaller than the other two if the device kernel uses
memory compression (ZRAM, enabled by default on recent versions of Android) and
the memory of the process gets swapped out onto ZRAM.

|                     | heapprofd | malloc\_info | RSS |
|---------------------|:---------:|:------------:|:---:|
| from native startup |     x     |      x       |  x  |
| after zygote init   |     x     |      x       |  x  |
| before zygote init  |           |      x       |  x  |
| thread caches       |           |      x       |  x  |
| fragmentation       |           |              |  x  |

If you observe high RSS or malloc\_info metrics but heapprofd does not match,
you might be hitting some pathological fragmentation problem in the allocator.

## Convert to pprof

You can use [traceconv](/docs/quickstart/traceconv.md) to convert the heap dumps
in a trace into the [pprof](https://github.com/google/pprof) format. These can
then be viewed using the pprof CLI or a UI (e.g. Speedscope, or Google-internal
pprof/).

```bash
tools/traceconv profile /tmp/profile
```

This will create a directory in `/tmp/` containing the heap dumps. Run:

```bash
gzip /tmp/heap_profile-XXXXXX/*.pb
```

to get gzipped protos, which tools handling pprof profile protos expect.

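If the conversion is part of a scripted pipeline, the gzip step can also be done
without shelling out. The following is a minimal sketch using only the Python
standard library; the helper name `gzip_pprof_dir` is hypothetical, and the
directory argument stands for whatever `heap_profile-XXXXXX` directory
traceconv created.

```python
import gzip
import shutil
from pathlib import Path


def gzip_pprof_dir(profile_dir):
    """Gzip every .pb file in profile_dir, mirroring `gzip profile_dir/*.pb`.

    Hypothetical helper: returns the paths of the .pb.gz files it wrote.
    """
    compressed = []
    for pb in sorted(Path(profile_dir).glob("*.pb")):
        gz = pb.with_name(pb.name + ".gz")
        with pb.open("rb") as src, gzip.open(gz, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # The gzip CLI removes the uncompressed input by default; do the same.
        pb.unlink()
        compressed.append(gz)
    return compressed
```

Files without a `.pb` extension in the same directory are left untouched.
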
## {#heapprofd-example-queries} Example SQL Queries

We can get the callstacks that allocated memory using an SQL query in the
Trace Processor. For each frame, we get one row for the number of allocated
bytes, where `count` and `size` are positive, and, if any of them were already
freed, another row with negative `count` and `size`. The sum of those gets us
the `Unreleased malloc size` view.

```sql
select a.callsite_id, a.ts, a.upid, f.name, f.rel_pc, m.build_id, m.name as mapping_name,
        sum(a.size) as space_size, sum(a.count) as space_count
      from heap_profile_allocation a join
           stack_profile_callsite c ON (a.callsite_id = c.id) join
           stack_profile_frame f ON (c.frame_id = f.id) join
           stack_profile_mapping m ON (f.mapping = m.id)
      group by 1, 2, 3, 4, 5, 6, 7 order by space_size desc;
```

| callsite_id | ts | upid | name | rel_pc | build_id | mapping_name | space_size | space_count |
|-------------|----|------|------|--------|----------|--------------|------------|-------------|
| 6660 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 106496 | 4 |
| 192  | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 26624  | 1 |
| 1421 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 26624  | 1 |
| 1537 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 26624  | 1 |
| 8843 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 26424  | 1 |
| 8618 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 24576  | 4 |
| 3750 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 12288  | 1 |
| 2820 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 8192   | 2 |
| 3788 | 5 | 1 | malloc | 244716 | 8126fd.. | /apex/com.android.runtime/lib64/bionic/libc.so | 8192   | 2 |

We can see all the functions are "malloc" and "realloc", which is not terribly
informative. Usually we are interested in the _cumulative_ bytes allocated in
a function (otherwise, we will always only see malloc / realloc). Chasing the
`parent_id` of a callsite (not shown in this table) recursively is very hard in
SQL.

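To make the idea of a cumulative size concrete, here is a minimal sketch of the
`parent_id` walk written as a recursive common table expression. It runs against
a toy in-memory SQLite database, not a real trace: the two tables only mimic the
`id`, `parent_id`, `callsite_id`, `size` and `count` columns discussed above,
while the `frame_name` column, the helper names and all sample rows are made up
for illustration.

```python
import sqlite3


def build_toy_db():
    """Create toy stand-ins for two Trace Processor tables (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table stack_profile_callsite (
          id integer primary key, parent_id integer, frame_name text);
        create table heap_profile_allocation (
          callsite_id integer, size integer, count integer);
        -- Two stacks: main > load_config > malloc and main > render > malloc.
        insert into stack_profile_callsite values
          (1, null, 'main'), (2, 1, 'load_config'), (3, 2, 'malloc'),
          (4, 1, 'render'), (5, 4, 'malloc');
        -- Positive rows are allocations, the negative row is a free.
        insert into heap_profile_allocation values
          (3, 1000, 1), (5, 3000, 2), (5, -500, -1);
    """)
    return conn


def cumulative_sizes(conn):
    """Return (frame_name, cumulative unreleased bytes) per callsite."""
    return conn.execute("""
        with recursive ancestors(callsite_id, ancestor_id) as (
          -- Every callsite counts towards itself...
          select id, id from stack_profile_callsite
          union all
          -- ...and towards each callsite reachable through parent_id.
          select a.callsite_id, c.parent_id
          from ancestors a
          join stack_profile_callsite c on c.id = a.ancestor_id
          where c.parent_id is not null
        )
        select c.frame_name, sum(h.size) as cumulative_size
        from heap_profile_allocation h
        join ancestors a on a.callsite_id = h.callsite_id
        join stack_profile_callsite c on c.id = a.ancestor_id
        group by c.id
        order by cumulative_size desc, c.id
    """).fetchall()


if __name__ == "__main__":
    for name, size in cumulative_sizes(build_toy_db()):
        print(name, size)
```

In this toy data `main` accumulates the unreleased bytes of both stacks (3500),
while each `malloc` leaf only reports its own. On a real trace the ancestor
expansion has to be repeated for every callsite, which is part of why writing
this by hand is cumbersome.
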
There is an **experimental** table that surfaces this information. The **API is
subject to change**.

```sql
select
  name,
  map_name,
  cumulative_size
from experimental_flamegraph(
  -- The type of the profile from which the flamegraph is being generated.
  -- Always 'native' for native heap profiles.
  'native',
  -- The timestamp of the heap profile.
  8300973884377,
  -- Timestamp constraints: not relevant and always null for native heap
  -- profiles.
  NULL,
  -- The upid of the heap profile.
  1,
  -- The upid group: not relevant and always null for native heap profiles.
  NULL,
  -- A regex for focusing on a particular node in the heapgraph: for advanced
  -- use only.
  NULL
)
order by abs(cumulative_size) desc;
```

| name | map_name | cumulative_size |
|------|----------|-----------------|
| __start_thread | /apex/com.android.runtime/lib64/bionic/libc.so | 392608 |
| _ZL15__pthread_startPv | /apex/com.android.runtime/lib64/bionic/libc.so | 392608 |
| _ZN13thread_data_t10trampolineEPKS | /system/lib64/libutils.so | 199496 |
| _ZN7android14AndroidRuntime15javaThreadShellEPv | /system/lib64/libandroid_runtime.so | 199496 |
| _ZN7android6Thread11_threadLoopEPv | /system/lib64/libutils.so | 199496 |
| _ZN3art6Thread14CreateCallbackEPv | /apex/com.android.art/lib64/libart.so | 193112 |
| _ZN3art35InvokeVirtualOrInterface... | /apex/com.android.art/lib64/libart.so | 193112 |
| _ZN3art9ArtMethod6InvokeEPNS_6ThreadEPjjPNS_6JValueEPKc | /apex/com.android.art/lib64/libart.so | 193112 |
| art_quick_invoke_stub | /apex/com.android.art/lib64/libart.so | 193112 |