traced_perf: feed samples through libunwindstack

Add per-DS (data source) bookkeeping queues, with the same structure of
periodic tick tasks that process a queue of unwound samples.

All the per-instance maps have the same lifetime at the moment, but I'm
keeping them separate because ownership will become more segmented once
unwinding moves onto a separate thread. ProcDescriptors will also be
sharded, so I'm keeping the current name for now.

Change-Id: Id91b7f440f68317ff429060eece1754922f28594
diff --git a/src/profiling/perf/event_reader.cc b/src/profiling/perf/event_reader.cc
index c28011b..cab7e35 100644
--- a/src/profiling/perf/event_reader.cc
+++ b/src/profiling/perf/event_reader.cc
@@ -130,6 +130,8 @@
 // Is there an argument for maintaining our own copy of |data_tail| instead of
 // reloading it?
 char* PerfRingBuffer::ReadRecordNonconsuming() {
+  static_assert(sizeof(std::atomic<uint64_t>) == sizeof(uint64_t), "");
+
   PERFETTO_CHECK(valid());
 
   // |data_tail| is written only by this userspace thread, so we can safely read