| # Debugging memory usage on Android | 
 |  | 
 | ## Prerequisites | 
 |  | 
 | * A host running macOS or Linux. | 
 | * [ADB](https://developer.android.com/studio/command-line/adb) installed and | 
 |   in PATH. | 
 | * A device running Android 11+. | 
 |  | 
 | If you are profiling your own app and are not running a userdebug build of | 
 | Android, your app needs to be marked as profileable or | 
 | debuggable in its manifest. See the [heapprofd documentation]( | 
 | /docs/data-sources/native-heap-profiler.md#heapprofd-targets) for more | 
 | details on which applications can be targeted. | 
 |  | 
 | ## dumpsys meminfo | 
 |  | 
 | A good place to start investigating the memory usage of a process is |
 | `dumpsys meminfo`, which gives a high-level overview of how much of each |
 | type of memory the process is using. |
 |  | 
 | ```bash | 
 | $ adb shell dumpsys meminfo com.android.systemui | 
 |  | 
 | Applications Memory Usage (in Kilobytes): | 
 | Uptime: 2030149 Realtime: 2030149 | 
 |  | 
 | ** MEMINFO in pid 1974 [com.android.systemui] ** | 
 |                    Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap | 
 |                  Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free | 
 |                 ------   ------   ------   ------   ------   ------   ------   ------ | 
 |   Native Heap    16840    16804        0     6764    19428    34024    25037     5553 | 
 |   Dalvik Heap     9110     9032        0      136    13164    36444     9111    27333 | 
 |  | 
 | [more stuff...] | 
 | ``` | 
 |  | 
 | Looking at the "Private Dirty" column of Dalvik Heap (= Java Heap) and |
 | Native Heap, we can see that SystemUI uses about 9 MB on the Java heap |
 | and about 17 MB on the native heap. |
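To track these numbers across runs, it can help to extract them programmatically. The following is a minimal sketch that parses the two heap rows from the sample output above. The column names in `COLUMNS` are labels chosen here for illustration, and the column order is assumed to match the header shown above; it may differ between Android releases.

```python
import re

# Rows copied from the dumpsys meminfo sample above (values are in KiB).
SAMPLE = """\
  Native Heap    16840    16804        0     6764    19428    34024    25037     5553
  Dalvik Heap     9110     9032        0      136    13164    36444     9111    27333
"""

# Assumed column order, taken from the header of the sample output.
COLUMNS = ["pss_total", "private_dirty", "private_clean", "swap_pss_dirty",
           "rss_total", "heap_size", "heap_alloc", "heap_free"]

ROW_RE = re.compile(r"^\s*(Native Heap|Dalvik Heap)((?:\s+\d+)+)\s*$")


def parse_heap_rows(text):
    """Return {row name: {column: KiB}} for the Native Heap and Dalvik Heap rows."""
    rows = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            values = [int(v) for v in match.group(2).split()]
            rows[match.group(1)] = dict(zip(COLUMNS, values))
    return rows


rows = parse_heap_rows(SAMPLE)
for name, cols in rows.items():
    print(f"{name}: {cols['private_dirty']} KiB private dirty")
```

In practice the text would come from `adb shell dumpsys meminfo <package>` rather than from an embedded string.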
 |  | 
 | ## Linux memory management | 
 |  | 
 | But what do *clean*, *dirty*, *Rss*, *Pss* and *Swap* actually mean? To answer |
 | this question, we need to delve into Linux memory management a bit. |
 |  | 
 | From the kernel's point of view, memory is split into equally sized blocks | 
 | called *pages*. These are generally 4KiB. | 
 |  | 
 | Pages are organized in virtually contiguous ranges called VMAs |
 | (Virtual Memory Areas). |
 |  | 
 | VMAs are created when a process requests a new pool of memory pages through | 
 | the [mmap() system call](https://man7.org/linux/man-pages/man2/mmap.2.html). | 
 | Applications rarely call mmap() directly. Those calls are typically mediated by | 
 | the allocator, `malloc()/operator new()` for native processes or by the | 
 | Android RunTime for Java apps. | 
 |  | 
 | VMAs can be of two types: file-backed and anonymous. | 
 |  | 
 | **File-backed VMAs** are a view of a file in memory. They are obtained by |
 | passing a file descriptor to `mmap()`. The kernel serves page faults on the VMA |
 | from the passed file, so reading through a pointer into the VMA becomes the |
 | equivalent of a `read()` on the file. |
 | File-backed VMAs are used, for instance, by the dynamic linker (`ld`) when | 
 | executing new processes or dynamically loading libraries, or by the Android | 
 | framework, when loading a new .dex library or accessing resources in the APK. | 
 |  | 
 | **Anonymous VMAs** are memory-only areas not backed by any file. This is the way |
 | allocators request dynamic memory from the kernel. Anonymous VMAs are obtained |
 | by calling `mmap(... MAP_ANONYMOUS ...)`. |
 |  | 
 | Physical memory is only allocated, in page granularity, once the application |
 | tries to read from or write to a VMA. If you allocate 32 MiB worth of pages but |
 | only touch one byte, your process' memory usage will only go up by 4 KiB. You |
 | will have increased your process' *virtual memory* by 32 MiB, but its resident |
 | *physical memory* by 4 KiB. |
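The effect can be observed from a small script. The sketch below maps 32 MiB anonymously, writes a single byte, and compares the resident size reported by `/proc/self/statm` before and after. It assumes a Linux-like `/proc`; the helper names are made up for this example. Python's anonymous `mmap` is a shared anonymous mapping, and the interpreter may allocate memory of its own in between, so the measured growth is only approximate.

```python
import mmap

MIB = 1024 * 1024


def resident_bytes():
    """Return this process' RSS in bytes, or None if /proc is unavailable."""
    try:
        with open("/proc/self/statm") as f:
            # Second field of statm is the resident set size, in pages.
            resident_pages = int(f.read().split()[1])
    except (OSError, ValueError, IndexError):
        return None
    return resident_pages * mmap.PAGESIZE


def touch_one_byte(size=32 * MIB):
    """Map `size` bytes anonymously, write one byte, return (mapped size, RSS growth)."""
    before = resident_bytes()
    region = mmap.mmap(-1, size)  # anonymous mapping: only virtual memory so far
    region[0] = 1                 # the first write faults in a single page
    after = resident_bytes()
    mapped = len(region)
    region.close()
    growth = None if before is None or after is None else after - before
    return mapped, growth


mapped, growth = touch_one_byte()
print(f"mapped {mapped // MIB} MiB of virtual memory")
print("RSS growth:", "unknown" if growth is None else f"{growth // 1024} KiB")
```

On a typical Linux system the reported growth should be far smaller than the 32 MiB that were mapped.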
 |  | 
 | When optimizing memory use of programs, we are interested in reducing their |
 | footprint in *physical memory*. High *virtual memory* use is generally not a |
 | cause for concern on modern platforms (unless you run out of address space, |
 | which is very hard to do on 64-bit systems). |
 |  | 
 | We call the amount of a process' memory that is resident in *physical memory* |
 | its **RSS** (Resident Set Size). Not all resident memory is equal though. |
 |  | 
 | From a memory-consumption viewpoint, individual pages within a VMA can have the | 
 | following states: | 
 |  | 
 | * **Resident**: the page is mapped to a physical memory page. Resident pages can | 
 |   be in two states: | 
 |     * **Clean** (only for file-backed pages): the contents of the page are the |
 |       same as the contents on disk. The kernel can evict clean pages more easily |
 |       in case of memory pressure, because if they are needed again it can |
 |       re-create their contents by reading them from the underlying file. |
 |     * **Dirty**: the contents of the page diverge from the disk, or (in most |
 |       cases), the page has no disk backing (i.e. it's _anonymous_). Dirty pages |
 |       cannot be evicted because doing so would cause data loss. However, they can |
 |       be swapped out to disk or ZRAM, if present. |
 | * **Swapped**: a dirty page can be written to the swap file on disk (on most Linux | 
 |   desktop distributions) or compressed (on Android and CrOS through | 
 |   [ZRAM](https://source.android.com/devices/tech/perf/low-ram#zram)). The page | 
 |   will stay swapped until a new page fault on its virtual address happens, at | 
 |   which point the kernel will bring it back in main memory. | 
 | * **Not present**: no page fault ever happened on the page or the page was | 
 |   clean and later was evicted. | 
 |  | 
 | It is generally more important to reduce the amount of _dirty_ memory, as it |
 | cannot be reclaimed like _clean_ memory and, on Android, even when swapped to |
 | ZRAM, it still consumes part of the system memory budget. |
 | This is why we looked at *Private Dirty* in the `dumpsys meminfo` example. | 
 |  | 
 | *Shared* memory can be mapped into more than one process. This means VMAs in | 
 | different processes refer to the same physical memory. This typically happens | 
 | with file-backed memory of commonly used libraries (e.g., libc.so, | 
 | framework.dex) or, more rarely, when a process `fork()`s and a child process | 
 | inherits dirty memory from its parent. | 
 |  | 
 | This introduces the concept of **PSS** (Proportional Set Size). In **PSS**, | 
 | memory that is resident in multiple processes is proportionally attributed to | 
 | each of them. If we map one 4KiB page into four processes, each of their | 
 | **PSS** will increase by 1KiB. | 
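The arithmetic can be written down as a short sketch. Each resident page is described only by the number of processes that map it; `rss_kib` and `pss_kib` are names made up for this illustration, and 4 KiB pages are assumed.

```python
PAGE_KIB = 4  # assumed page size


def rss_kib(sharer_counts):
    """RSS counts every resident page in full, however many processes share it."""
    return PAGE_KIB * len(sharer_counts)


def pss_kib(sharer_counts):
    """PSS divides each resident page's size by the number of processes mapping it."""
    return sum(PAGE_KIB / sharers for sharers in sharer_counts)


# One page mapped into four processes, as in the example above.
shared_only = [4]
# Two private pages plus that same shared page.
mixed = [1, 1, 4]

print("shared page only:", rss_kib(shared_only), "KiB RSS,", pss_kib(shared_only), "KiB PSS")
print("private + shared:", rss_kib(mixed), "KiB RSS,", pss_kib(mixed), "KiB PSS")
```

Summing PSS over all processes gives the physical memory actually in use, whereas summing RSS counts shared pages once per process.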
 |  | 
 | ### Recap |
 |  | 
 | * Dynamically allocated memory, whether allocated through C's `malloc()`, C++'s |
 |   `operator new()` or Java's `new X()`, always starts as _anonymous_ and |
 |   _dirty_, unless it is never used. |
 | * If this memory is not read/written for a while, or in case of memory pressure, |
 |   it gets swapped out to ZRAM and becomes _swapped_. |
 | * Anonymous memory, whether _resident_ (and hence _dirty_) or _swapped_, is |
 |   always a resource hog and should be avoided if unnecessary. |
 | * File-mapped memory comes from code (Java or native), libraries and resources |
 |   and is almost always _clean_. Clean memory also erodes the system memory |
 |   budget, but application developers typically have less control over it. |
 |  | 
 | ## Memory over time | 
 |  | 
 | `dumpsys meminfo` is good for getting a snapshot of the current memory usage, |
 | but even very short memory spikes can cause low-memory situations, which |
 | lead to [LMKs](#lmk). We have two tools to investigate situations like this: |
 |  | 
 | * RSS High Watermark. | 
 | * Memory tracepoints. | 
 |  | 
 | ### RSS High Watermark | 
 |  | 
 | We can get a lot of information from the `/proc/[pid]/status` file, including | 
 | memory information. `VmHWM` shows the maximum RSS usage the process has seen | 
 | since it was started. This value is kept updated by the kernel. | 
 |  | 
 | ```bash | 
 | $ adb shell cat '/proc/$(pidof com.android.systemui)/status' | 
 | [...] | 
 | VmHWM:    256972 kB | 
 | VmRSS:    195272 kB | 
 | RssAnon:  30184 kB | 
 | RssFile:  164420 kB | 
 | RssShmem: 668 kB | 
 | VmSwap:   43960 kB | 
 | [...] | 
 | ``` | 
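As a rough sketch of how these fields relate, the snippet below parses the sample output above, computes how far the peak (`VmHWM`) was above the current RSS, and checks that the three `Rss*` fields add up to `VmRSS`. The function name is invented for this example; in practice the text would come from the `adb shell cat` command shown above.

```python
# Fields copied from the /proc/[pid]/status sample above.
STATUS_SAMPLE = """\
VmHWM:    256972 kB
VmRSS:    195272 kB
RssAnon:  30184 kB
RssFile:  164420 kB
RssShmem: 668 kB
VmSwap:   43960 kB
"""


def parse_status_kb(text):
    """Return {field: value in kB} for every 'Name: N kB' line."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


status = parse_status_kb(STATUS_SAMPLE)
peak_above_current = status["VmHWM"] - status["VmRSS"]
rss_parts = status["RssAnon"] + status["RssFile"] + status["RssShmem"]

print(f"peak RSS was {peak_above_current} kB above the current RSS")
print(f"RssAnon + RssFile + RssShmem = {rss_parts} kB (VmRSS = {status['VmRSS']} kB)")
```

A large gap between `VmHWM` and `VmRSS` hints at a past spike that a single snapshot would miss.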
 |  | 
 | ### Memory tracepoints | 
 |  | 
 | NOTE: For detailed instructions about the memory tracepoints see the |
 |       [Data sources > Memory > Counters and events]( | 
 |       /docs/data-sources/memory-counters.md) page. | 
 |  | 
 | We can use Perfetto to get information about memory management events from the | 
 | kernel. | 
 |  | 
 | ```bash | 
 | $ adb shell perfetto \ | 
 |   -c - --txt \ | 
 |   -o /data/misc/perfetto-traces/trace \ | 
 | <<EOF | 
 |  | 
 | buffers: { | 
 |     size_kb: 8960 | 
 |     fill_policy: DISCARD | 
 | } | 
 | buffers: { | 
 |     size_kb: 1280 | 
 |     fill_policy: DISCARD | 
 | } | 
 | data_sources: { | 
 |     config { | 
 |         name: "linux.process_stats" | 
 |         target_buffer: 1 | 
 |         process_stats_config { | 
 |             scan_all_processes_on_start: true | 
 |         } | 
 |     } | 
 | } | 
 | data_sources: { | 
 |     config { | 
 |         name: "linux.ftrace" | 
 |         ftrace_config { | 
 |             ftrace_events: "mm_event/mm_event_record" | 
 |             ftrace_events: "kmem/rss_stat" | 
 |             ftrace_events: "kmem/ion_heap_grow" | 
 |             ftrace_events: "kmem/ion_heap_shrink" | 
 |         } | 
 |     } | 
 | } | 
 | duration_ms: 30000 | 
 |  | 
 | EOF | 
 | ``` | 
 |  | 
 | While it is running, take a photo if you are following along. | 
 |  | 
 | Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/mem-trace` |
 | and upload it to the [Perfetto UI](https://ui.perfetto.dev). This will show |
 | overall stats about system [ION](#ion) usage, and per-process stats that can |
 | be expanded. Scroll down to (or Ctrl-F for) `com.google.android.GoogleCamera` |
 | and expand it. This will show a timeline of various memory stats for Camera. |
 |  | 
 |  | 
 |  | 
 | We can see that around 2/3 into the trace, the memory spiked (in the | 
 | mem.rss.anon track). This is where I took a photo. This is a good way to see | 
 | how the memory usage of an application reacts to different triggers. | 
 |  | 
 | ## Which tool to use | 
 |  | 
 | If you want to drill down into _anonymous_ memory allocated by Java code, | 
 | labeled by `dumpsys meminfo` as `Dalvik Heap`, see the | 
 | [Analyzing the java heap](#java-hprof) section. | 
 |  | 
 | If you want to drill down into _anonymous_ memory allocated by native code, | 
 | labeled by `dumpsys meminfo` as `Native Heap`, see the | 
 | [Analyzing the Native Heap](#heapprofd) section. Note that it is common to end |
 | up with native memory even if your app doesn't have any C/C++ code. This is |
 | because some framework APIs (e.g. Regex) are implemented internally in |
 | native code. |
 |  | 
 | If you want to drill down into file-mapped memory, the best option is to use |
 | `adb shell showmap PID` (on Android) or inspect `/proc/PID/smaps`. |
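As an illustration of what inspecting `smaps` involves, the sketch below sums `Pss` and `Private_Dirty` per file for file-backed mappings. The embedded sample is synthetic (the library path and all numbers are invented for this example) and contains only a subset of the fields a real `smaps` entry has.

```python
import re
from collections import defaultdict

# Synthetic smaps excerpt: two mappings of one library and one anonymous mapping.
SMAPS_SAMPLE = """\
7a1000000000-7a1000020000 r-xp 00000000 fd:00 1234    /system/lib64/libexample.so
Size:                128 kB
Rss:                  64 kB
Pss:                  16 kB
Private_Dirty:         0 kB
7a1000020000-7a1000022000 rw-p 00020000 fd:00 1234    /system/lib64/libexample.so
Size:                  8 kB
Rss:                   8 kB
Pss:                   8 kB
Private_Dirty:         8 kB
7a2000000000-7a2000100000 rw-p 00000000 00:00 0       [anon:libc_malloc]
Size:               1024 kB
Rss:                 512 kB
Pss:                 512 kB
Private_Dirty:       512 kB
"""

HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$")
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")


def file_backed_usage(smaps_text):
    """Return {path: {'Pss': kB, 'Private_Dirty': kB}} for file-backed mappings."""
    totals = defaultdict(lambda: {"Pss": 0, "Private_Dirty": 0})
    current = None  # path of the mapping being read, or None if not file-backed
    for line in smaps_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            name = header.group(1)
            current = name if name.startswith("/") else None
            continue
        field = FIELD_RE.match(line)
        if field and current and field.group(1) in totals[current]:
            totals[current][field.group(1)] += int(field.group(2))
    return dict(totals)


usage = file_backed_usage(SMAPS_SAMPLE)
for path, stats in usage.items():
    print(path, stats)
```

Non-zero `Private_Dirty` on a file-backed mapping usually points at writable data segments rather than code.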
 |  | 
 |  | 
 | ## {#lmk} Low-memory kills | 
 |  | 
 | When an Android device becomes low on memory, a daemon called `lmkd` will |
 | start killing processes in order to free up memory. Devices' strategies differ, |
 | but in general processes will be killed in order of descending `oom_score_adj` |
 | (i.e. background apps and processes first, foreground processes last). |
 |  | 
 | Apps on Android are not killed when switching away from them. They instead | 
 | remain *cached* even after the user finishes using them. This is to make | 
 | subsequent starts of the app faster. Such apps will generally be killed | 
 | first (because they have a higher `oom_score_adj`). | 
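The general ordering can be sketched as a sort. The process names and `oom_score_adj` values below are hypothetical, and real `lmkd` strategies differ per device, so this only illustrates the "highest score first" rule described above.

```python
# Hypothetical processes with illustrative oom_score_adj values.
processes = [
    ("system_server", -900),
    ("com.example.foreground", 0),
    ("com.example.cached_a", 905),
    ("com.example.service", 500),
    ("com.example.cached_b", 950),
]


def kill_order(procs):
    """Sort by descending oom_score_adj: the highest score is the first candidate."""
    return [name for name, _adj in sorted(procs, key=lambda p: p[1], reverse=True)]


order = kill_order(processes)
for rank, name in enumerate(order, start=1):
    print(rank, name)
```

Cached apps sit at the top of this list, which is why they are generally the first to go.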
 |  | 
 | We can collect information about LMKs and `oom_score_adj` using Perfetto. | 
 |  | 
 | ```bash |
 | $ adb shell perfetto \ | 
 |   -c - --txt \ | 
 |   -o /data/misc/perfetto-traces/trace \ | 
 | <<EOF | 
 |  | 
 | buffers: { | 
 |     size_kb: 8960 | 
 |     fill_policy: DISCARD | 
 | } | 
 | buffers: { | 
 |     size_kb: 1280 | 
 |     fill_policy: DISCARD | 
 | } | 
 | data_sources: { | 
 |     config { | 
 |         name: "linux.process_stats" | 
 |         target_buffer: 1 | 
 |         process_stats_config { | 
 |             scan_all_processes_on_start: true | 
 |         } | 
 |     } | 
 | } | 
 | data_sources: { | 
 |     config { | 
 |         name: "linux.ftrace" | 
 |         ftrace_config { | 
 |             ftrace_events: "lowmemorykiller/lowmemory_kill" | 
 |             ftrace_events: "oom/oom_score_adj_update" | 
 |             ftrace_events: "ftrace/print" | 
 |             atrace_apps: "lmkd" | 
 |         } | 
 |     } | 
 | } | 
 | duration_ms: 60000 | 
 |  | 
 | EOF | 
 | ``` | 
 |  | 
 | Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/oom-trace` | 
 | and upload to the [Perfetto UI](https://ui.perfetto.dev). | 
 |  | 
 |  | 
 |  | 
 | We can see that the OOM score of Camera gets reduced (making it less likely | 
 | to be killed) when it is opened, and gets increased again once it is closed. | 
 |  | 
 | ## {#heapprofd} Analyzing the Native Heap | 
 |  | 
 | **Native Heap Profiles require Android 10.** | 
 |  | 
 | NOTE: For detailed instructions about the native heap profiler and | 
 |       troubleshooting see the [Data sources > Native heap profiler]( | 
 |       /docs/data-sources/native-heap-profiler.md) page. | 
 |  | 
 | Applications usually get memory through `malloc` or C++'s `new` rather than | 
 | directly getting it from the kernel. The allocator makes sure that your memory | 
 | is more efficiently handled (i.e. there are not many gaps) and that the | 
 | overhead from asking the kernel remains low. | 
 |  | 
 | We can log the native allocations and frees that a process does using | 
 | *heapprofd*. The resulting profile can be used to attribute memory usage | 
 | to particular function callstacks, supporting a mix of both native and Java | 
 | code. The profile *will only show allocations done while it was running*; any |
 | allocations done before will not be shown. |
 |  | 
 | ### {#capture-profile-native} Capturing the profile | 
 |  | 
 | Use the `tools/heap_profile` script to profile a process. If you are having |
 | trouble, make sure you are using the [latest version]( |
 | https://raw.githubusercontent.com/google/perfetto/master/tools/heap_profile). |
 | See all the arguments using `tools/heap_profile -h`, or use the defaults | 
 | and just profile a process (e.g. `system_server`): | 
 |  | 
 | ```bash | 
 | $ tools/heap_profile -n system_server | 
 |  | 
 | Profiling active. Press Ctrl+C to terminate. | 
 | You may disconnect your device. | 
 |  | 
 | Wrote profiles to /tmp/profile-1283e247-2170-4f92-8181-683763e17445 (symlink /tmp/heap_profile-latest) | 
 | These can be viewed using pprof. Googlers: head to pprof/ and upload them. | 
 | ``` | 
 |  | 
 | When you see *Profiling active*, play around with the phone a bit. When you | 
 | are done, press Ctrl-C to end the profile. For this tutorial, I opened a | 
 | couple of apps. | 
 |  | 
 | ### Viewing the data | 
 |  | 
 | Then upload the `raw-trace` file from the output directory to the |
 | [Perfetto UI](https://ui.perfetto.dev) and click on the diamond marker that |
 | appears. |
 |  | 
 |  | 
 |  | 
 | The tabs that are available are: |
 |  |
 | * **space**: how many bytes were allocated but not freed at this callstack at |
 |   the moment the dump was created. |
 | * **alloc\_space**: how many bytes were allocated (including ones freed at the |
 |   moment of the dump) at this callstack. |
 | * **objects**: how many allocations without matching frees were sampled at this | 
 |   callstack. | 
 | * **alloc\_objects**: how many allocations (including ones with matching frees) | 
 |   were sampled at this callstack. | 
 |  | 
 | The default view will show you all allocations that were done while the | 
 | profile was running but that weren't freed (the **space** tab). | 
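To make the difference between the four tabs concrete, here is a small sketch that computes them from a list of allocation and free events. The event list and callstack names are hypothetical, and the sketch ignores sampling, so it illustrates the definitions above rather than how heapprofd itself is implemented.

```python
from collections import defaultdict

# Hypothetical events: (kind, allocation id, callstack, size in bytes).
events = [
    ("alloc", 1, "main;loadAssets", 4096),
    ("alloc", 2, "main;loadAssets", 4096),
    ("free", 1, None, 0),
    ("alloc", 3, "main;parse", 512),
]


def summarize(events):
    """Return per-callstack space, alloc_space, objects and alloc_objects."""
    stats = defaultdict(
        lambda: {"space": 0, "alloc_space": 0, "objects": 0, "alloc_objects": 0})
    live = {}  # allocation id -> (callstack, size) for allocations not yet freed
    for kind, alloc_id, stack, size in events:
        if kind == "alloc":
            live[alloc_id] = (stack, size)
            stats[stack]["alloc_space"] += size
            stats[stack]["alloc_objects"] += 1
            stats[stack]["space"] += size
            stats[stack]["objects"] += 1
        elif kind == "free" and alloc_id in live:
            # A free is attributed to the callstack that made the allocation.
            alloc_stack, alloc_size = live.pop(alloc_id)
            stats[alloc_stack]["space"] -= alloc_size
            stats[alloc_stack]["objects"] -= 1
    return dict(stats)


summary = summarize(events)
for stack, values in summary.items():
    print(stack, values)
```

The `alloc_*` values only ever grow, which makes them useful for spotting allocation churn even when nothing is leaked.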
 |  | 
 |  | 
 |  | 
 | We can see that a lot of memory gets allocated in paths through | 
 | `ResourceManager.loadApkAssets`. To get the total memory that was allocated | 
 | this way, we can enter "loadApkAssets" into the Focus textbox. This will only | 
 | show callstacks where some frame matches "loadApkAssets". | 
 |  | 
 |  | 
 |  | 
 | From this we have a clear idea where in the code we have to look. From the |
 | code we can see how that memory is being used and whether we actually need all |
 | of it. In this case the key is the `_CompressedAsset`, which has to be |
 | decompressed into RAM rather than being (_cleanly_) memory-mapped. By not |
 | compressing this data, we can save RAM. |
 |  | 
 | ## {#java-hprof} Analyzing the Java Heap | 
 |  | 
 | **Java Heap Dumps require Android 11.** | 
 |  | 
 | NOTE: For detailed instructions about the Java heap profiler and | 
 |       troubleshooting see the [Data sources > Java heap profiler]( | 
 |       /docs/data-sources/java-heap-profiler.md) page. | 
 |  | 
 | ### {#capture-profile-java} Capturing the profile |
 |  |
 | We can get a snapshot of the graph of all the Java objects that constitute the |
 | Java heap. We use the `tools/java_heap_dump` script. If you are having trouble, |
 | make sure you are using the [latest version]( |
 | https://raw.githubusercontent.com/google/perfetto/master/tools/java_heap_dump). |
 |  | 
 | ```bash | 
 | $ tools/java_heap_dump -n com.android.systemui | 
 |  | 
 | Dumping Java Heap. | 
 | Wrote profile to /tmp/tmpup3QrQprofile | 
 | This can be viewed using https://ui.perfetto.dev. | 
 | ``` | 
 |  | 
 | ### Viewing the Data | 
 |  | 
 | Upload the trace to the [Perfetto UI](https://ui.perfetto.dev) and click on |
 | the diamond marker that appears. |
 |  | 
 |  | 
 |  | 
 | This will present a flamegraph of the memory attributed to the shortest path |
 | to a garbage-collection root. In general an object is reachable by many paths; |
 | we only show the shortest, as that reduces the complexity of the data displayed |
 | and is generally the highest-signal. The rightmost `[merged]` stack is the |
 | sum of all objects that are too small to be displayed. |
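The idea of attributing each object to its shortest path from a GC root can be sketched with a breadth-first search. The object graph below is hypothetical, and the code illustrates the concept only; it is not the algorithm Perfetto actually uses. Paths are keyed by class name, so objects of the same class reached the same way end up in one entry.

```python
from collections import defaultdict, deque

# Hypothetical heap: object id -> (class name, shallow size in bytes, referenced ids).
heap = {
    "root": ("GcRoot", 0, ["a", "b"]),
    "a": ("com.example.Cache", 16, ["c"]),
    "b": ("com.example.Holder", 16, ["d"]),
    "c": ("java.lang.Object[]", 32, ["d"]),
    "d": ("android.graphics.Bitmap", 1024, []),
}


def shortest_path_sizes(heap, root):
    """BFS from root; add each object's size to the class path that first reaches it."""
    sizes = defaultdict(int)
    paths = {root: (heap[root][0],)}  # object id -> class-name path from the root
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        _cls, size, refs = heap[obj]
        sizes[paths[obj]] += size
        for ref in refs:
            if ref not in paths:  # first visit in BFS order is a shortest path
                paths[ref] = paths[obj] + (heap[ref][0],)
                queue.append(ref)
    return dict(sizes)


sizes = shortest_path_sizes(heap, "root")
for path, size in sizes.items():
    print(" > ".join(path), size)
```

Here the `Bitmap` is reachable through both `Cache` and `Holder`, but it is only attributed to the shorter path through `Holder`.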
 |  | 
 |  | 
 |  | 
 | The tabs that are available are: |
 |  | 
 | * **space**: how many bytes are retained via this path to the GC root. | 
 | * **objects**: how many objects are retained via this path to the GC root. | 
 |  | 
 | As with native heap profiles, we can use the Focus feature to restrict the |
 | view to paths that contain a frame matching some string; here the frames are |
 | class names. If we wanted to see everything that could be caused by |
 | notifications, we can put "notification" in the Focus box. |
 |  | 
 |  | 
 |  | 
 | We aggregate the paths per class name, so if there are multiple objects of the | 
 | same type retained by a `java.lang.Object[]`, we will show one element as its | 
 | child, as you can see in the leftmost stack above. |