Last modified: 2012-03-21 21:45:59 UTC
The test/mediawiki/core2 repository has some big objects which were uploaded to svn by mistake. There is at least an OGG video and a parserTests file of a few MB. We might want to drop those objects to shrink the repo size. One can find the object sha1 using verify-pack:

    $ git verify-pack -v .git/objects/pack/pack-4d812fd3351b2f9b9814dd4c4370554c5bf3bc8a.idx | sort -k3n | tail -n 10
    deacc7523dc11f0aa72fabf09c9eb142ea501b6d blob 574106 145627 85947251
    a34d599e7baee042336dd9d7ba5823a338e6d568 blob 596528 563665 127474685
    c105546577035b0c140885e7784c6aa8c1bd6e3c blob 696872 696627 85017232
    d41cf97eda958e808e47a1b26b4e1faf57b1872d blob 831473 731654 128085310
    7547e5d52614b85bc569f319e6c90ffae0d22e74 blob 859358 561624 68971517
    d9f76fe3685a08d233d57f79ed09458cb89b9ee8 blob 867490 234202 70979864
    6474bca35d96def65b5bd5c4057570feacd97b0e blob 1039405 283187 60074558
    f73456d5912da3406fcf7a1d557253c8cd7d0130 blob 1375400 384009 23574243
    442cecf1c118e06e76baffd8c16e75a5fa65b953 blob 2152337 2145045 94611352
    f0742f706d752466bb9ba782abce5db2e52642c4 blob 8001431 3756081 73016860

The 3rd field is the uncompressed size, the 4th one the compressed size. So the last two objects take about 2.1 MB and 3.7 MB in the pack (the latter is 8 MB uncompressed).

    $ git show f0742f7 | head -n1
    # This is another parserTest file.

A commit by someone to parserTests.php that inserted an 8 MB test case :(

    $ git show 442cecf | head -c 4
    OggS

That is a Theora-encoded video of some fish (git show 442cecf > /tmp/fishes.ogg then open it in VLC).
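To go from a sha1 to the path it was stored under, something like the sketch below might help. Both function names (largest_blobs, paths_for_object) are made up for this comment and are not git commands; the sketch only assumes that `git verify-pack -v` prints "sha1 type size packed-size offset" per object and that `git rev-list --objects --all` prints "sha1 path" for every reachable object.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): list the N largest blobs of a pack.
# Reads `git verify-pack -v` output on stdin and prints
# "sha1 uncompressed-size packed-size", largest first.
largest_blobs() {
    count="${1:-10}"
    # Keep only blob lines, then sort numerically on the uncompressed size.
    awk '$2 == "blob" { print $1, $3, $4 }' | sort -k2,2nr | head -n "$count"
}

# Hypothetical helper: print the path(s) an object was stored under.
# The argument may be a full sha1 or an abbreviated prefix.
paths_for_object() {
    git rev-list --objects --all | grep "^$1"
}
```

Usage, from inside the repository:

    git verify-pack -v .git/objects/pack/pack-*.idx | largest_blobs 2
    paths_for_object f0742f7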
If you can find a way to drop these from the repo properly, I'm all for it.
For reference: http://stackoverflow.com/questions/2164581/remove-file-from-git-repository-history
It looks like we can drop the .ogg video by using the following filtering command:

    git filter-branch --prune-empty \
        --index-filter \
        'git rm -rf --cached --ignore-unmatch js2/mwEmbed/example_usage/media/*.ogg' \
        --tag-name-filter cat \
        -- --all
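Note that filter-branch alone will not shrink the repo on disk: the old objects stay reachable through the backup refs under refs/original/ and through the reflog. A minimal sketch of the cleanup, following the shrinking checklist in the git-filter-branch manual; the function name purge_filtered_objects is made up here, and this should only be run on a clone one is happy to rewrite:

```shell
#!/bin/sh
# Hypothetical helper (not part of git): drop what still references the
# pre-filter history so that gc can actually delete the big objects.
purge_filtered_objects() {
    # filter-branch keeps a backup of every rewritten ref under refs/original/
    git for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done
    # Expire all reflog entries, they still point at the old commits.
    git reflog expire --expire=now --all
    # Repack and prune the now unreachable objects immediately.
    git gc --prune=now
}
```

Running git verify-pack -v again on the new pack afterwards should show whether the .ogg blob is really gone.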
The huge ExtraParserTests.txt file is mentioned in bug 23715. It was probably added by r67014 and was removed by r67091. The easiest would probably be to use 'git rebase', back out that huge addition and amend the commit message with a nice note such as: "Year 2012: there used to be a huge file there that was replaced by a nicer str_repeat() call :]"
The ExtraParserTests blob f0742f706d752466bb9ba782abce5db2e52642c4 is still in :-/ I tested rebasing on Friday and tried again today to fix it, but I hit a blocker at the final cut: an unrelated path conflict later on during the rebase :-/ Being pragmatic, we skipped filtering that one. The ogg files have been filtered out though! So that is almost fixed.