| # Flutter DeviceLab |
| |
| DeviceLab is a physical lab that tests Flutter on real devices. |
| |
| This package contains the code for the test framework and tests. More generally |
| the tests are referred to as "tasks" in the API, but since we primarily use it |
| for testing, this document refers to them as "tests". |
| |
| Current statuses for the devicelab are available at |
| https://flutter-dashboard.appspot.com. See the [dashboard user guide](https://github.com/flutter/cocoon/blob/master/app_flutter/USER_GUIDE.md) |
| for information on using the dashboards. |
| |
| ## How the DeviceLab runs tasks |
| |
| The DeviceLab devices continuously ask Flutter's continuous integration system, |
| [Cocoon](https://github.com/flutter/cocoon), for tasks to run. When Cocoon has a |
| task that is suitable for the device (e.g. an Android test), it reserves that |
| task for the device. See [manifest.yaml](manifest.yaml) for the information |
| used to schedule tasks. |
| |
| After the device runs the task, one of two things happens: |
| |
| 1. If the task succeeds, the test runner reports the success to Cocoon. The dashboards |
| show that task in green. |
| 2. If the task fails, the test runner reports the failure to Cocoon. Cocoon |
| increments the run attempt counter and puts the task back in the pool of available |
| tasks. If the task does not succeed after a certain number of attempts (as of this |
| writing, the limit is 2), it is marked as failed and shown in red on the dashboard. |
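
The retry policy described above can be modeled in a few lines of shell. This is only an illustrative sketch: `run_with_retries` and `MAX_ATTEMPTS` are hypothetical names, the attempts run back to back on one machine instead of being returned to a shared pool, and none of this is Cocoon's actual code.

```sh
# Hypothetical model of the retry policy; not Cocoon's implementation.
MAX_ATTEMPTS=2  # attempt limit quoted in the text above

# run_with_retries CMD [ARGS...]
# Runs CMD until it succeeds or MAX_ATTEMPTS attempts have been made.
# Prints "green" on success, "red" once the attempts are exhausted.
run_with_retries() {
  attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    if "$@"; then
      echo "green"
      return 0
    fi
    attempt=$((attempt + 1))  # failure: count the attempt and try again
  done
  echo "red"
  return 1
}
```

For example, `run_with_retries true` prints `green`, while `run_with_retries false` prints `red` after two failed attempts.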
| |
| ## Running tests locally |
| |
| Make sure your tests pass locally before deploying them to the CI environment. |
| The commands below run tests in a way similar to how the CI environment runs |
| them. They are also useful when you need to reproduce a CI test failure |
| locally. |
| |
| ### Prerequisites |
| |
| You must set the `ANDROID_SDK_ROOT` environment variable to run |
| tests on Android. If you have a local build of the Flutter engine, then you have |
| a copy of the Android SDK at `.../engine/src/third_party/android_tools/sdk`. |
| |
| You can find where your Android SDK is using `flutter doctor`. |
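
A small guard like the following can catch a missing or stale `ANDROID_SDK_ROOT` before a long test run starts. It is a sketch, not part of the devicelab tooling, and `check_android_sdk` is a hypothetical helper name.

```sh
# check_android_sdk
# Succeeds only if ANDROID_SDK_ROOT is set and points to an existing directory.
# Hypothetical helper; the devicelab itself does not provide it.
check_android_sdk() {
  if [ -z "${ANDROID_SDK_ROOT:-}" ]; then
    echo "ANDROID_SDK_ROOT is not set" >&2
    return 1
  fi
  if [ ! -d "$ANDROID_SDK_ROOT" ]; then
    echo "ANDROID_SDK_ROOT is not a directory: $ANDROID_SDK_ROOT" >&2
    return 1
  fi
  return 0
}
```

Calling `check_android_sdk` before invoking `bin/run.dart` makes the failure mode explicit instead of surfacing later inside an Android test.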
| |
| ### Warnings |
| |
| Running the devicelab tests modifies your local environment. |
| For instance, it starts and stops Gradle. |
| |
| ### Running all tests |
| |
| To run all tests defined in `manifest.yaml`, use option `-a` (`--all`): |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart -a |
| ``` |
| |
| This defaults to only running tests supported by your host device's platform |
| (`--match-host-platform`) and exiting after the first failure (`--exit`). |
| |
| ### Running specific tests |
| |
| To run a test, use option `-t` (`--task`): |
| |
| ```sh |
| # from the .../flutter/dev/devicelab directory |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart -t {NAME_OR_PATH_OF_TEST} |
| ``` |
| |
| Where `NAME_OR_PATH_OF_TEST` can be either of: |
| |
| - the _name_ of a task, which you can find in the `manifest.yaml` file in this |
| directory. Example: `complex_layout__start_up`. |
| - the path to a Dart _file_ corresponding to a task, which resides in `bin/tasks`. |
| Tip: most shells support path auto-completion using the Tab key. Example: |
| `bin/tasks/complex_layout__start_up.dart`. |
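
The two forms are related: the task name is the file name in `bin/tasks` without the `.dart` extension. As a sketch, the conversion can be done with the standard `basename` utility; `task_name_from_path` is a hypothetical helper, not something `run.dart` provides.

```sh
# task_name_from_path PATH
# Prints the task name for a task file, i.e. the file name without
# its directory and without the .dart extension.
task_name_from_path() {
  basename "$1" .dart
}

task_name_from_path bin/tasks/complex_layout__start_up.dart
# prints: complex_layout__start_up
```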
| |
| To run multiple tests, repeat option `-t` (`--task`) multiple times: |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart -t test1 -t test2 -t test3 |
| ``` |
| |
| To run tests from a specific stage, use option `-s` (`--stage`). |
| Currently, three stages are defined: `devicelab`, |
| `devicelab_ios`, and `devicelab_win`. |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart -s {NAME_OF_STAGE} |
| ``` |
| |
| ### Running tests against a local engine build |
| |
| To run devicelab tests against a local engine build, pass the appropriate |
| flags to `bin/run.dart`: |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart --task=[some_task] \ |
| --local-engine-src-path=[path_to_local]/engine/src \ |
| --local-engine=[local_engine_architecture] |
| ``` |
| |
| An example of a local engine architecture is `android_debug_unopt_x86`. |
| |
| ### Running an A/B test for engine changes |
| |
| You can run an A/B test that compares the performance of the default engine |
| against a local engine build. The test runs the same benchmark a specified |
| number of times against both engines, then outputs a tab-separated spreadsheet |
| with the results and stores them in a JSON file for future reference. The |
| results can be copied to a Google Spreadsheet for further inspection and the |
| JSON file can be reprocessed with the `summarize.dart` command for more detailed |
| output. |
| |
| Example: |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart --ab=10 \ |
| --local-engine=host_debug_unopt \ |
| -t bin/tasks/web_benchmarks_canvaskit.dart |
| ``` |
| |
| `--ab=10` tells the runner to run the benchmark 10 times against each engine. |
| |
| `--local-engine=host_debug_unopt` tells the A/B test to use the `host_debug_unopt` |
| engine build. `--local-engine` is required for A/B tests. |
| |
| `--ab-result-file=filename` can be used to provide an alternate location for |
| the JSON results file (defaults to `ABresults#.json`). A single `#` character in the |
| file name indicates where to insert a serial number if a file with that name already |
| exists. Without a `#`, an existing file is overwritten. |
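
To illustrate the `#` placeholder, the sketch below picks a result file name the way the description suggests: try the name without a number first, then append 1, 2, and so on until an unused name is found. This is an assumption based on the text, not the runner's actual implementation, and `unique_result_file` is a hypothetical name.

```sh
# unique_result_file TEMPLATE
# Prints the first file name derived from TEMPLATE that does not exist yet.
# Assumed behavior only; the real runner may number files differently.
unique_result_file() {
  template=$1
  case $template in
    *'#'*) ;;                                # has a placeholder: keep going
    *) printf '%s\n' "$template"; return 0 ;; # no '#': name is used as-is
  esac
  prefix=${template%%'#'*}                   # text before the first '#'
  suffix=${template#*'#'}                    # text after the first '#'
  candidate=$prefix$suffix                   # first try: no serial number
  n=1
  while [ -e "$candidate" ]; do
    candidate=$prefix$n$suffix
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}
```

In an empty directory, `unique_result_file 'ABresults#.json'` prints `ABresults.json`; once that file exists, the next call prints `ABresults1.json`.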
| |
| An A/B test runs exactly one task. Multiple tasks are not supported. |
| |
| Example output: |
| |
| ``` |
| Score                                                           Average A (noise)  Average B (noise)  Speed-up |
| bench_card_infinite_scroll.canvaskit.drawFrameDuration.average  2900.20 (8.44%)    2426.70 (8.94%)    1.20x |
| bench_card_infinite_scroll.canvaskit.totalUiFrame.average       4964.00 (6.29%)    4098.00 (8.03%)    1.21x |
| draw_rect.canvaskit.windowRenderDuration.average                1959.45 (16.56%)   2286.65 (0.61%)    0.86x |
| draw_rect.canvaskit.sceneBuildDuration.average                  1969.45 (16.37%)   2294.90 (0.58%)    0.86x |
| draw_rect.canvaskit.drawFrameDuration.average                   5335.20 (17.59%)   6437.60 (0.59%)    0.83x |
| draw_rect.canvaskit.totalUiFrame.average                        6832.00 (13.16%)   7932.00 (0.34%)    0.86x |
| ``` |
| |
| The output contains the average and noise for each score. More importantly, it |
| contains the speed-up value, i.e. how much _faster_ the local engine is than |
| the default engine. Values less than 1.0 indicate a slow-down. For example, |
| 0.5x means the local engine is twice as slow as the default engine, and 2.0x |
| means it's twice as fast. Higher is better. |
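
The speed-up values in the example output are consistent with dividing average A by average B, which assumes that lower scores (such as durations) are better. The following sketch reproduces them with `awk`; `speedup` is a hypothetical helper, not part of the devicelab tools, and the formula is inferred from the sample numbers.

```sh
# speedup AVERAGE_A AVERAGE_B
# Prints AVERAGE_A / AVERAGE_B rounded to two decimals, followed by "x".
# Formula inferred from the example output; treat it as an assumption.
speedup() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2fx\n", a / b }'
}

speedup 2900.20 2426.70   # prints: 1.20x
speedup 1959.45 2286.65   # prints: 0.86x
```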
| |
| Summarize tool example: |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/summarize.dart --[no-]tsv-table --[no-]raw-summary \ |
| ABresults.json ABresults1.json ABresults2.json ... |
| ``` |
| |
| `--[no-]tsv-table` tells the tool to print the summary in a table with tabs for easy spreadsheet |
| entry (defaults to on). |
| |
| `--[no-]raw-summary` tells the tool to print all per-run data collected by the A/B test, formatted |
| with tabs for easy spreadsheet entry (defaults to on). |
| |
| Multiple trailing filenames can be specified and each such results file will be processed in turn. |
| |
| ## Reproducing broken builds locally |
| |
| To reproduce the breakage locally, `git checkout` the corresponding Flutter |
| revision and note the name of the test that failed. For example, if the |
| failing test is `flutter_gallery__transition_perf`, pass this name to |
| the `run.dart` command: |
| |
| ```sh |
| ../../bin/cache/dart-sdk/bin/dart bin/run.dart -t flutter_gallery__transition_perf |
| ``` |
| |
| ## Writing tests |
| |
| A test is a simple Dart program that lives under `bin/tasks` and uses |
| `package:flutter_devicelab/framework/framework.dart` to define and run a _task_. |
| |
| Example: |
| |
| ```dart |
| import 'dart:async'; |
| |
| import 'package:flutter_devicelab/framework/framework.dart'; |
| |
| Future<void> main() async { |
|   await task(() async { |
|     // ... do something interesting ... |
| |
|     // Aggregate results into a JSON-encodable Map structure. |
|     final Map<String, dynamic> testResults = <String, dynamic>{ |
|       // ... your results go here ... |
|     }; |
| |
|     // Report success. |
|     return TaskResult.success(testResults); |
| |
|     // Or, to report a failure instead: |
|     // return TaskResult.failure('Something went wrong!'); |
|   }); |
| } |
| ``` |
| |
| Only one `task` is permitted per program. However, that task can run any number |
| of tests internally. A task has a name. It succeeds and fails independently of |
| other tasks, and is reported to the dashboard independently of other tasks. |
| |
| A task runs in its own standalone Dart VM and reports results via the Dart VM |
| service protocol. This ensures that tasks do not interfere with each other and |
| lets the CI system time out and clean up tasks that get stuck. |
| |
| ## Adding tests to the CI environment |
| |
| The `manifest.yaml` file describes a subset of the tests we run in CI. To add |
| your test, edit `manifest.yaml` and add the following to the "tasks" dictionary: |
| |
| ``` |
|   {NAME_OF_TEST}: |
|     description: {DESCRIPTION} |
|     stage: {STAGE} |
|     required_agent_capabilities: {CAPABILITIES} |
| ``` |
| |
| Where: |
| |
| - `{NAME_OF_TEST}` is the name of your test, which must match the name of the |
| file in `bin/tasks` without the `.dart` extension. |
| - `{DESCRIPTION}` is a plain-English description of your test that helps |
| others understand what the test covers. |
| - `{STAGE}` is `devicelab` if you want to run on Android, or `devicelab_ios` if |
| you want to run on iOS. |
| - `{CAPABILITIES}` is an array that lists the capabilities required of |
| the test agent (the computer that runs the test) to run your test. As of this writing, |
| the available capabilities are: `linux`, `linux/android`, `linux-vm`, |
| `mac`, `mac/ios`, `mac/iphonexs`, `mac/ios32`, `mac-catalina/ios`, |
| `mac-catalina/android`, `ios/gl-render-image`, `windows`, `windows/android`. |
| |
| If your test needs to run on multiple operating systems, create a separate test |
| for each operating system. |