Monorepo History Pruning

These are the steps we followed to prune excessive history from the flutter/engine repository when merging it into flutter/flutter. The goal was to retain as much useful history as possible without blowing up the footprint of the framework's .git folder. The history that gets merged should be as relevant and useful as possible to current engine development.

The engine .git folder holds ~780MB of history, much of which is no longer useful:

  • Binary files were checked in that are not used anymore.
  • Third-party libraries were checked in and removed nearly a decade ago.
  • Examples were created and later moved elsewhere.

Step 1: Fresh Clone + Safety

Do not start from your existing working tree; use a fresh clone. Remove the origin remote so that nothing can be pushed to flutter/engine by accident.

##############################################
## Do some cleanup work on the engine and get
## the folder structure right.
##############################################

# clone the repo to a fresh working folder
git clone git@github.com:flutter/engine.git engine_prep
cd engine_prep

# for safety - remove the remote - we're going to edit history
git remote remove origin
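
Before rewriting anything, it may be worth confirming that no remote is left behind. The following is only a sketch: the function name assert_no_remotes is made up for illustration and is not part of git.

```shell
# Hypothetical helper (not part of git): succeeds only when the repository
# in the current directory has no remotes configured.
assert_no_remotes() {
  remotes="$(git remote)" || return 2   # not inside a git repository
  if [ -n "$remotes" ]; then
    echo "remotes still configured: $remotes" >&2
    return 1
  fi
  echo "no remotes configured"
}
```

Calling assert_no_remotes from inside engine_prep should print "no remotes configured" once the origin has been removed.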

Optional - Analyze the repo

If you want to analyze the repository, install git filter-repo on your path and then run:

# Analyze if you want, just remember to remove .git/filter-repo
git filter-repo --analyze --force

The output is stored in .git/filter-repo.
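
As a rough sketch of how the reports can be inspected, the helper below prints the top of the deleted-paths report. The function name show_deleted_paths is hypothetical, and the report path assumes the analysis/ layout that git filter-repo writes under .git/filter-repo.

```shell
# Hypothetical helper: print the first N lines (default 20) of the report
# covering paths that were deleted at some point in history.
show_deleted_paths() {
  report=".git/filter-repo/analysis/path-deleted-sizes.txt"
  if [ ! -f "$report" ]; then
    echo "no analysis report found; run: git filter-repo --analyze" >&2
    return 1
  fi
  head -n "${1:-20}" "$report"
}
```

For example, show_deleted_paths 50 run from the root of engine_prep would list the first 50 lines of that report.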

Step 2: Prune the History

The following table is pulled from git-filter-repo's analysis. The Packed Size values are approximate due to cross-referencing between packed objects. In general, we looked at large files that are no longer referenced and at folders older than 2016.

Packed Size (bytes)  Deleted Date  Path                                     Notes
-------------------  ------------  ---------------------------------------  ----------------------------
112784745            2024-05-13    ci/licenses_golden/licenses_third_party
27531902             ~2021         *.jar                                    binary
27379931             2016-08-09    third_party/android_platform             android_platform and webview
27000000             2024-07-15    impeller/docs/assets/*.(png|gif)
15121375             2023-02-13    *.ttc                                    font files
10104182             2023-02-13    /SourceHanSerifCN
7985682              2018-08-08    travis                                   old ci
6315637              2015-11-07    examples/game
3939429              2015-07-28    sky/sdk
3939429              2015-07-28    sky/packages/sky
3903787              2016-08-09    mojo
3686830              2022-06-14    testing/scenario_app/android/reports
3188930              2015-06-30    tests/fast
3173966              2015-08-07    /example/game
2018961              2016-08-09    third_party/libxml
1804199              2016-08-09    third_party/tcmalloc
1393936              ~2016         *.dll                                    binary
1373740              2017-07-06    tests/data
1100665              2015-06-27    benchmarks/parser/resources/html5.html
1059673              2015-07-20    third_party/protobuf
978870               2022-04-27    impeller/third_party
798852               2015-07-20    third_party/cython
778560               2022-01-24    lib/web_ui/test/golden_files
634455               2016-08-09    third_party/libpng
610751               2024-05-13    .golden
550475               2024-09-17    impeller/fixtures/flutter_logo_baked.*
526837               2016-08-09    third_party/libevent
523436               2015-07-20    third_party/boringssl
514968               2022-04-27    impeller/fixtures/image.png
461527               2015-12-11    third_party/re2
418122               2015-10-12    examples/demo_launcher
413787               2015-11-07    .aac
362787               2016-08-09    third_party/glfw
349604               2016-08-09    third_party/harfbuzz-ng
340869               2016-08-09    third_party/okhttp
321659               2016-08-09    .S
300824               2016-08-09    .so
257633               2016-08-09    third_party/libjpeg
257519               2016-08-09    third_party/jinja2
249618               2016-08-09    third_party/zlib
218643               2015-12-11    third_party/brotli
188622               2021-01-06    .idl
184593               2015-09-02    third_party/khronos
173210               2016-08-09    .gypi
170484               2016-08-09    third_party/expat
169578               2016-08-09    .asm
161360               2016-08-09    .m4
142670               2018-05-10    .in
140364               2015-12-11    third_party/ots
137270               2016-08-09    .hh
136787               2016-08-09    .gyp
99503                2016-08-09    third_party/qcms
91730                2015-08-21    .pxd
84850                2016-08-09    third_party/yasm

The following command removes files and folders from the entire commit history. Since this is a destructive edit, the SHA-1 commit hashes change in the process. At the end, the .git history will be 74 MB of object files.

# Let's do some heavy filtering;
# .git starts out at ~780MB and ends up at ~110MB
git filter-repo  --force --invert-paths \
--path-glob 'impeller/docs/assets/*.png' \
--path-glob 'impeller/docs/assets/*.gif' \
--path-glob '*/example/game/*' \
--path-glob 'benchmarks/parser/resources/html5.html' \
--path-glob '*.dll' \
--path-glob '*.jar' \
--path-glob '*/SourceHanSerifCN*' \
--path-glob 'third_party/txt/third_party/fonts/NotoSansCJK-Regular.ttc' \
--path-glob 'impeller/fixtures/flutter_logo_baked.*' \
--path-glob 'impeller/fixtures/image.png' \
--path-glob '*.golden' \
--path-glob '*.aac' \
--path-glob '*.S' \
--path-glob '*.so' \
--path-glob '*.idl' \
--path-glob '*.gyp' \
--path-glob '*.gypi' \
--path-glob '*.asm' \
--path-glob '*.m4' \
--path-glob '*.in' \
--path-glob '*.pxd' \
--path-glob '*.hh' \
--path 'ci/licenses_golden/licenses_third_party' \
--path 'testing/scenario_app/android/reports' \
--path 'impeller/third_party' \
--path 'mojo/public/third_party' \
--path 'tests/data' \
--path 'tests/fast' \
--path 'tests/framework' \
--path 'travis' \
--path 'mojo' \
--path 'sky/sdk' \
--path 'sky/engine' \
--path 'sky/tools/webkitpy' \
--path 'sky/shell' \
--path 'sky/packages/sky' \
--path 'sky/tests' \
--path 'sky/unit' \
--path 'sky/services' \
--path 'sky/compositor' \
--path 'sky/build' \
--path 'sky/specs' \
--path 'skysprites' \
--path 'examples/demo_launcher' \
--path 'examples/game' \
--path 'third_party/qcms' \
--path 'third_party/libevent' \
--path 'third_party/boringssl' \
--path 'third_party/tcmalloc' \
--path 'third_party/cython' \
--path 'third_party/protobuf' \
--path 'third_party/libpng' \
--path 'third_party/re2' \
--path 'third_party/harfbuzz-ng' \
--path 'third_party/jinja2' \
--path 'third_party/libjpeg' \
--path 'third_party/glfw' \
--path 'third_party/zlib' \
--path 'third_party/android_platform' \
--path 'third_party/expat' \
--path 'third_party/brotli' \
--path 'third_party/yasm' \
--path 'third_party/khronos' \
--path 'third_party/okhttp' \
--path 'third_party/libxml' \
--path 'third_party/ots' \
--path 'third_party/libXNVCtrl' \
--path 'lib/web_ui/test/golden_files' \
--path 'apk' \
--path 'flutter' \
--path 'base' \
--path 'sdk' \
--path 'gpu' \
--path 'engine' \
--path 'tools/webkitpy' \
--path 'tools/valgrind' \
--path 'tools/clang' \
--path 'tools/android' \
--path 'build/linux' \
--path 'build/win' \
--path 'build/mac' \
--path 'ui' \
--path 'examples/stocks' \
--path 'examples/stocks2' \
--path 'examples/stocks-fn' \
--path 'examples/data' \
--path 'examples/fitness' \
--path 'examples/city-list' \
--path 'examples/widgets' \
--path 'examples/raw' \
--path 'examples/color' \
--path 'examples/flights' \
--path 'examples/rendering' \
--path 'examples/fn' \
--path 'specs' \
--path 'url' \
--path 'services' \
--path 'framework' \
--path 'crypto' \
--path 'skia/ext' \
--path 'e2etests' \
--path 'tests/resources' \
--path 'viewer' \
--path 'lib/stub_ui' \
--path 'content_handler'

# Garbage collect!
git reflog expire --expire=now --all && git gc --prune=now --aggressive
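
One way to spot-check the result of the filtering above is to ask git whether any commit still touches a path that was supposed to be removed. This is a sketch under our own assumptions: the helper name commits_touching is made up for illustration.

```shell
# Hypothetical helper: count the commits reachable from any ref that still
# touch the given path. A result of 0 means no remaining commit touches it.
commits_touching() {
  git log --all --oneline -- "$1" | wc -l | tr -d ' '
}
```

After a successful prune, a call such as commits_touching third_party/zlib would be expected to print 0, while a path that was kept should report a non-zero count.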

Step 3: Rewrite Directories

The final destination for the engine source code is the directory engine/src/flutter, except for DEPS, which remains at the root. A git mv only changes the tip commit, so git log on the new paths would not show the earlier history unless flags such as --follow are used. Instead, we rewrite history so that every commit already uses the new layout.

# Move files to engine/src/flutter, update tags so they don't collide, and move DEPS back to root.
git filter-repo  --to-subdirectory-filter engine/src/flutter --tag-rename '':'engine-' --force
git filter-repo --path-rename engine/src/flutter/DEPS:DEPS
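
A quick way to sanity-check the new layout and the renamed tags is sketched below. Both helper names are hypothetical, and the expectation that only DEPS and engine remain at the top level is our reading of the two commands above.

```shell
# Hypothetical helper: list the top-level entries at HEAD on a single line.
top_level_entries() {
  git ls-tree --name-only HEAD | tr '\n' ' ' | sed 's/ $//'
}

# Hypothetical helper: count the tags that lack the engine- prefix.
# grep exits non-zero when nothing matches, hence the "|| true".
unprefixed_tags() {
  git tag --list | grep -vc '^engine-' || true
}
```

If the rewrite worked as intended, top_level_entries should show just DEPS and engine, and unprefixed_tags should print 0.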

Step 4: Rewrite the PR Links

The PR link in the first line of each commit message will be wrong, because flutter/flutter has its own PR numbering and history. To make the history a little more useful, we edit only the first line of each message. This must be done before merging with the flutter/flutter repo so that the rewrite does not touch flutter/flutter's own commit messages.

git filter-repo --force --message-callback '
    return re.sub(br"^(.*)\((#\d+)\)\n(.*)", br"\1(flutter/engine\2)\n\3", message, 1)
    '
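
To try the substitution on a sample message before rewriting real history, the same regular expression can be run on its own. This is a sketch: the helper name rewrite_pr_link and the sample message are made up, python3 is assumed to be on the path, and count=1 is passed by keyword, which is equivalent to the positional 1 used in the callback.

```shell
# Hypothetical helper: apply the callback's substitution to a commit message
# read from stdin and write the result to stdout.
rewrite_pr_link() {
  python3 -c '
import re, sys
message = sys.stdin.buffer.read()
# Only a "(#NNN)" on the first line is rewritten; "." does not match newlines
# and "^" anchors at the start of the message.
sys.stdout.buffer.write(
    re.sub(br"^(.*)\((#\d+)\)\n(.*)", br"\1(flutter/engine\2)\n\3", message, count=1))
'
}
```

For instance, printf 'Fix crash (#123)\n\nSee also (#456)\n' | rewrite_pr_link turns the first line into "Fix crash (flutter/engine#123)" and leaves the reference in the body untouched.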

Execute Order 42: Merge The Repositories

##############################################
## Now handle merging into flutter/flutter
##############################################

git clone git@github.com:flutter/flutter.git flutter_merge
cd flutter_merge

# add the other tree as remote
git remote add -f engine-upstream ~/src/engine_prep

# --no-commit is important because we want to look around
git merge --no-commit --allow-unrelated-histories engine-upstream/main

# You're a wizard, Harry
git commit -m "Merge flutter/engine into framework"

# Garbage collect!
# Now at 234MB .git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
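
The sizes quoted in the comments above can be checked with git itself rather than by looking at the folder on disk. A minimal sketch, with a made-up helper name:

```shell
# Hypothetical helper: print the size of the packed objects for the
# repository in the current directory, in KiB, as reported by git.
packed_size_kib() {
  git count-objects -v | awk '/^size-pack:/ { print $2 }'
}
```

Running packed_size_kib before and after each garbage collection gives comparable numbers. Loose objects are reported separately by git count-objects, so the figure is most meaningful right after git gc.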