Last modified: 2012-03-21 21:45:59 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T36472, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 34472 - test/mediawiki/core2 has some big objects
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Git/Gerrit
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 34473
Blocks:
Reported: 2012-02-17 15:32 UTC by Antoine "hashar" Musso (WMF)
Modified: 2012-03-21 21:45 UTC (History)




Description Antoine "hashar" Musso (WMF) 2012-02-17 15:32:00 UTC
The test/mediawiki/core2 repository has some big objects which were uploaded to svn by mistake. There is at least an OGG video and a parserTests file of a few MB.

We might want to drop those objects to shrink the repo size.

One can find the SHA-1 of each object using verify-pack:

git verify-pack -v .git/objects/pack/pack-4d812fd3351b2f9b9814dd4c4370554c5bf3bc8a.idx | sort -k3n | tail -n 10


deacc7523dc11f0aa72fabf09c9eb142ea501b6d blob   574106 145627 85947251
a34d599e7baee042336dd9d7ba5823a338e6d568 blob   596528 563665 127474685
c105546577035b0c140885e7784c6aa8c1bd6e3c blob   696872 696627 85017232
d41cf97eda958e808e47a1b26b4e1faf57b1872d blob   831473 731654 128085310
7547e5d52614b85bc569f319e6c90ffae0d22e74 blob   859358 561624 68971517
d9f76fe3685a08d233d57f79ed09458cb89b9ee8 blob   867490 234202 70979864
6474bca35d96def65b5bd5c4057570feacd97b0e blob   1039405 283187 60074558
f73456d5912da3406fcf7a1d557253c8cd7d0130 blob   1375400 384009 23574243
442cecf1c118e06e76baffd8c16e75a5fa65b953 blob   2152337 2145045 94611352
f0742f706d752466bb9ba782abce5db2e52642c4 blob   8001431 3756081 73016860

The 3rd field is the uncompressed size, the 4th the compressed size in the pack. So the last two take up 2.1MB and 3.7MB in the pack.
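
As a rough sketch of how those columns can be read, the snippet below feeds the last two lines of the listing above through awk and prints both sizes in MB. The variable names (`sample`, `summary`) and the choice of 10^6 bytes per MB are assumptions made for this illustration, not part of the original report.

```shell
# Two lines copied from the verify-pack listing above.
sample='442cecf1c118e06e76baffd8c16e75a5fa65b953 blob   2152337 2145045 94611352
f0742f706d752466bb9ba782abce5db2e52642c4 blob   8001431 3756081 73016860'

# Columns: sha1, type, uncompressed size, size in pack, offset in pack.
# Print the abbreviated sha1 and both sizes in MB (1 MB = 10^6 bytes here).
summary=$(printf '%s\n' "$sample" |
  awk '{ printf "%s %.1fMB uncompressed, %.1fMB packed\n", substr($1, 1, 7), $3 / 1e6, $4 / 1e6 }')

printf '%s\n' "$summary"
```

To map a SHA-1 back to a path, piping `git rev-list --objects --all` through `grep f0742f7` should list the file name next to the object id.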

$ git show f0742f7 | head -n1
# This is another parserTest file.

A commit by someone to parserTests.php that inserted an 8MB test case :(


$ git show 442cecf | head -c 4
OggS

That is a Theora-encoded video of some fish (git show 442cecf > /tmp/fishes.ogg then open it in VLC).
Comment 1 Chad H. 2012-02-17 15:33:04 UTC
If you can find a way to drop these from the repo properly, I'm all for it.
Comment 2 Antoine "hashar" Musso (WMF) 2012-02-21 14:34:57 UTC
For reference: http://stackoverflow.com/questions/2164581/remove-file-from-git-repository-history
Comment 3 Antoine "hashar" Musso (WMF) 2012-03-12 16:12:57 UTC
It looks like we can drop the .ogg video by using the following filtering command:

git filter-branch --prune-empty \
  --index-filter \
  'git rm -rf --cached --ignore-unmatch js2/mwEmbed/example_usage/media/*.ogg' \
  --tag-name-filter cat \
  -- --all
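
Note that filter-branch alone does not shrink the repository on disk: the old objects stay reachable through the backup refs and the reflog. A minimal sketch of the usual follow-up, assuming it is run against a clone that has already been rewritten (the function name `drop_rewritten_leftovers` is hypothetical, and `git -C` needs a reasonably recent git):

```shell
# Hypothetical helper for this sketch: clean up a clone after filter-branch.
drop_rewritten_leftovers() {
  repo=$1
  # filter-branch keeps the pre-rewrite refs under refs/original/; delete them.
  git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git -C "$repo" update-ref -d "$ref"
    done
  # Expire reflog entries that still point at the old commits.
  git -C "$repo" reflog expire --expire=now --all
  # Repack and prune the objects that are now unreachable.
  git -C "$repo" gc --prune=now --quiet
}
```

Usage would be something like `drop_rewritten_leftovers /path/to/core2`, after which verify-pack can be run again to check that the blobs are gone.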
Comment 4 Antoine "hashar" Musso (WMF) 2012-03-12 16:24:36 UTC
The huge ExtraParserTests.txt file is mentioned in bug 23715. Probably added by r67014. It was removed by r67091.

The easiest would probably be to use 'git rebase', revert that huge addition and amend the commit message with a nice note such as: "Year 2012: there used to be a huge file there that was replaced by a nicer str_repeat() call :]"
Comment 5 Antoine "hashar" Musso (WMF) 2012-03-21 21:45:59 UTC
The ExtraParserTests blob f0742f706d752466bb9ba782abce5db2e52642c4 is still in :-/  Although I tested the rebase on Friday and tried again today to fix it, I hit a blocker at the final step: an unrelated path conflict later on during the rebase :-/

Being pragmatic, we skipped filtering that one.

The ogg files have been filtered out though!  So that is almost fixed.


