Last modified: 2014-08-14 20:47:23 UTC
We keep multiple older php-1.XwmfY branches checked out in /a/common to support static assets that may be referenced by pages in the Varnish cache. The sheer number of files contained in these branches adds a non-trivial time cost to each scap sync.

Since we are only keeping these branches around to support static asset delivery, it seems possible to add a cleanup step to the train deploys that prunes, from the inactive branches, the files that are only needed for active runtime. This is primarily the PHP files but could also include JavaScript, JSON, SQL, all tests and possibly additional file types.

Here's a quick comparison:

    tin:/a/common/php-1.23wmf21 (git wmf/1.23wmf21) bd808$ find . -type d -name .git -prune -o -type f -print|wc -l
    35775
    tin:/a/common/php-1.23wmf21 (git wmf/1.23wmf21) bd808$ find . -type d -name .git -prune -name tests -prune -o -type f -not -name '*.php' -not -name '*.json' -not -name '*.js' -not -name '*.sql' -print|wc -l
    6326

In this case getting rid of php, json, js, sql and tests would reduce the number of files compared for a sync from 35,775 to 6,326, a drop of almost 30,000 (~80%).
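A minimal sketch of how such a cleanup step could pick its targets, assuming the file types named above. The function name `prune_candidates` is hypothetical and not part of scap. It only lists the files that would be removed, so the result can be reviewed before anything is deleted. It matches anything under a `tests` directory with `-path` and skips `.git` entirely.

```shell
# Hypothetical helper, not part of scap: print the files that a prune step
# would remove from one inactive php-1.XwmfY checkout.
# Usage: prune_candidates /a/common/php-1.23wmf21
prune_candidates() {
    branch_dir=$1
    # Skip the .git directory, then select runtime-only files by extension
    # plus everything that lives under a tests/ directory.
    find "$branch_dir" -type d -name .git -prune -o \
        -type f \( \
            -name '*.php' -o -name '*.json' -o -name '*.js' -o \
            -name '*.sql' -o -path '*/tests/*' \
        \) -print
}
```

Piping the output through `wc -l` gives a count to compare against the totals above. To actually delete, `-print` could be swapped for `-exec rm -- {} +`. GNU find's `-delete` is not a safe substitute here because it implies `-depth`, which stops `-prune` from protecting `.git`.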
The new functionality in sync-common allows --include=<directory> to be passed to the leaf hosts to control which files/directories are requested for synchronization. Scap could use its knowledge of active branches to exclude from syncing any php-1.XwmfY branches that are not actively in use on the cluster. The set of synced branches would probably also need to include branches that are newer than the active ones (e.g. the branch that *will* be live on the group0 hosts soon).
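As a rough illustration of that idea, the sketch below turns a list of branch versions into `--include=<directory>` arguments. The function name `include_args` is hypothetical. So is the assumption that each directory is passed as `php-<version>`, and that the caller already knows which versions are active or about to go live; how scap would determine that list is not covered here.

```shell
# Hypothetical helper, not part of scap: emit one --include argument per
# branch version that should still be synced (active or upcoming).
# Assumes the directory is named php-<version>, as in /a/common.
include_args() {
    for version in "$@"; do
        printf '%s\n' "--include=php-$version"
    done
}

# Example with one version from the comparison above and one made-up
# upcoming version:
#   include_args 1.23wmf21 1.23wmf22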