traced_perf: move unwinding to a dedicated thread

This splits the single-threaded traced_perf into two threads:
* a primary thread that does the initial kernel buffer reading, as well
  as the final interning/serialization, and IPC.
* an unwinder thread that sits in-between in terms of the dataflow.

The reasoning is that unwinding is very long-tailed, and we don't want to
starve the kernel buffer reading/IPC functions of the producer while
sitting in a 1s-long unwind on Android.

The unwinder uses a ring queue for the input samples (and is woken up by
the primary thread after it pushes a batch of samples). Once a sample is
unwound, it's posted directly back to the main thread (might need to
consider batching here as well if we don't want many staggered wakeups).

Note on the unwinding queue: this approach enqueues parsed samples (with
complex types like the unique_ptr and vector). An alternative considered
was to make the queue entries have their original kernel format (so a
direct memcpy from the kernel ring buffer). I can see a variety of pros
and cons for both approaches (which I won't summarize here), but
ultimately decided on keeping the early parsing into a "complex" type,
and dealing with just that type post-EventReader.

The pid-tracking is done primarily on the primary thread, but a subset
(pretty much ready vs expired) of updates is replicated to the Unwinder
(which acts as a listener, without pushing any updates of its own).

I considered two approaches for the unwinding queues: a single shared
queue (as posted), and per-DataSource queues that would be created by the
primary thread, and adopted by the unwinder. There's an argument that
separate queues would be more fair when there are concurrent data
sources, and the load is too high. On the other hand, we will still
ultimately want a process-wide cap on the amount of inflight samples, so
a single queue shortcuts to that (at the expense of some fairness).

UnwindingHandle is a temporary copy-paste. I'm hoping to get rid of it
within a week (but it's a separate conversation on base::ThreadTaskRunner
API that I don't want blocking this patch).

Unwindstack caching is removed temporarily (since during reconnects, we
might be moving unwinding between threads while recreating the Unwinder),
will fix in a follow-up.

Note to reviewer: I'm not very confident about most of the file/class
naming choices. Please criticize the inconsistencies without reservation.

Bug: 144281346
Change-Id: I4f59d1b4d52cf589fbe60e78ad4c1ee0b9994c0a
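
Illustrative sketch of the handoff described in the message (a ring queue of parsed samples, with the consumer woken once per pushed batch). This is not the code from this patch: SpscRingQueue, ParsedSample and RunBatchedHandoff are hypothetical names, and a std::condition_variable stands in for whatever wakeup mechanism the real unwinder uses. It only assumes one producer thread and one consumer thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed sample: a "complex" type owning memory.
struct ParsedSample {
  int32_t pid = 0;
  std::unique_ptr<int> regs;  // placeholder for register state
  std::vector<char> stack;    // placeholder for copied stack bytes
};

// Single-producer, single-consumer ring queue (assumed design, not perfetto's).
template <typename T, size_t kCapacity>
class SpscRingQueue {
  static_assert((kCapacity & (kCapacity - 1)) == 0, "capacity: power of two");

 public:
  // Producer side. Returns false and leaves |item| untouched when full.
  bool TryPush(T&& item) {
    size_t wr = write_pos_.load(std::memory_order_relaxed);
    size_t rd = read_pos_.load(std::memory_order_acquire);
    if (wr - rd == kCapacity) return false;
    slots_[wr % kCapacity] = std::move(item);
    write_pos_.store(wr + 1, std::memory_order_release);
    return true;
  }

  // Consumer side. Returns false when the queue is empty.
  bool TryPop(T* out) {
    size_t rd = read_pos_.load(std::memory_order_relaxed);
    size_t wr = write_pos_.load(std::memory_order_acquire);
    if (rd == wr) return false;
    *out = std::move(slots_[rd % kCapacity]);
    read_pos_.store(rd + 1, std::memory_order_release);
    return true;
  }

 private:
  std::array<T, kCapacity> slots_{};
  std::atomic<size_t> write_pos_{0};
  std::atomic<size_t> read_pos_{0};
};

// Pushes |num_samples| in batches of at most |batch_size|, waking the consumer
// once per batch. Returns how many samples the consumer thread drained.
inline size_t RunBatchedHandoff(size_t num_samples, size_t batch_size) {
  assert(batch_size > 0);
  SpscRingQueue<ParsedSample, 16> queue;
  std::mutex mu;
  std::condition_variable cv;
  bool wake = false;  // guarded by mu
  bool done = false;  // guarded by mu
  size_t drained = 0;

  std::thread unwinder([&] {
    for (;;) {
      bool exiting;
      {
        std::unique_lock<std::mutex> lock(mu);
        cv.wait(lock, [&] { return wake || done; });
        wake = false;
        exiting = done;
      }
      ParsedSample sample;
      while (queue.TryPop(&sample)) ++drained;  // unwinding would happen here
      if (exiting) return;
    }
  });

  size_t pushed = 0;
  while (pushed < num_samples) {
    size_t in_batch = 0;
    while (in_batch < batch_size && pushed < num_samples) {
      ParsedSample s;
      s.pid = static_cast<int32_t>(pushed);
      s.stack.assign(8, 0);
      if (!queue.TryPush(std::move(s))) break;  // full: wake consumer, retry
      ++pushed;
      ++in_batch;
    }
    {
      std::lock_guard<std::mutex> lock(mu);
      wake = true;
    }
    cv.notify_one();  // one wakeup per batch, not per sample
    if (in_batch == 0) std::this_thread::yield();
  }
  {
    std::lock_guard<std::mutex> lock(mu);
    done = true;
  }
  cv.notify_one();
  unwinder.join();
  return drained;
}
```

In this sketch a full queue simply makes the producer wake the consumer and retry. A real producer reading a kernel ring buffer would more likely need an explicit policy for that case, which is where a process-wide cap on inflight samples would come in.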

commit: ccd89612055ff167c73f4bbc374a38e86de65556
author: Ryan Savitski <rsavitski@google.com> Mon Mar 09 18:31:47 2020 +0000
committer: Ryan Savitski <rsavitski@google.com> Mon Mar 09 18:31:47 2020 +0000
tree: f3c46534cc56441a57514c27533dc5d780024237
parent: 95f126da184ea1050e7644910521fde3008e7fcc
diff --git a/Android.bp b/Android.bp
index 6f56224..40057d4 100644
--- a/Android.bp
+++ b/Android.bp

@@ -5926,6 +5926,11 @@
   ],
 }
 
+// GN: //src/profiling/perf:common_types
+filegroup {
+  name: "perfetto_src_profiling_perf_common_types",
+}
+
 // GN: //src/profiling/perf:proc_descriptors
 filegroup {
   name: "perfetto_src_profiling_perf_proc_descriptors",
@@ -5971,6 +5976,9 @@
 // GN: //src/profiling/perf:unwinding
 filegroup {
   name: "perfetto_src_profiling_perf_unwinding",
+  srcs: [
+    "src/profiling/perf/unwinding.cc",
+  ],
 }
 
 // GN: //src/profiling/symbolizer:symbolize_database
@@ -7239,6 +7247,7 @@
     ":perfetto_src_profiling_memory_scoped_spinlock",
     ":perfetto_src_profiling_memory_unittests",
     ":perfetto_src_profiling_memory_wire_protocol",
+    ":perfetto_src_profiling_perf_common_types",
     ":perfetto_src_profiling_perf_proc_descriptors",
     ":perfetto_src_profiling_perf_producer",
     ":perfetto_src_profiling_perf_producer_unittests",
@@ -7749,10 +7758,12 @@
     ":perfetto_src_profiling_common_interning_output",
     ":perfetto_src_profiling_common_proc_utils",
     ":perfetto_src_profiling_common_unwind_support",
+    ":perfetto_src_profiling_perf_common_types",
     ":perfetto_src_profiling_perf_proc_descriptors",
     ":perfetto_src_profiling_perf_producer",
     ":perfetto_src_profiling_perf_regs_parsing",
     ":perfetto_src_profiling_perf_traced_perf_main",
+    ":perfetto_src_profiling_perf_unwinding",
     ":perfetto_src_protozero_protozero",
     ":perfetto_src_tracing_common",
     ":perfetto_src_tracing_core_core",