Last modified: 2011-10-25 22:39:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T33850, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 31850 - Two > 900 MiB .ogv files on Commons seem corrupted
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Sam Reed (reedy)
Reported: 2011-10-20 20:28 UTC by Tomasz W. Kozlowski
Modified: 2011-10-25 22:39 UTC (History)

Description Tomasz W. Kozlowski 2011-10-20 20:28:46 UTC
[I've assigned this bug to Sam as he helped upload the files to Commons and seemed interested; Sam, please feel free to unsubscribe if you wish.]

Two of the 14 videos from the 10th Polish Wikipedia birthday conference that have been uploaded to the Wikimedia Commons as part of [[bug:31568]], namely: 

* <http://commons.wikimedia.org/wiki/File:Jan_Wr%C3%B3bel_-_Koniec_%C5%9Bwiata_nast%C4%85pi_%28wideo%29.ogv> (1.42 GiB)
* <http://commons.wikimedia.org/wiki/File:Tomasz_Ganicz_-_Z_pami%C4%99tnika_weterana_%28wideo%29.ogv> (914.46 GiB)

seem corrupted and cannot be played directly on Commons. You can hear the sound without seeing the image.

Surprisingly, after being downloaded onto a local computer, they can be played well without any problems at all (I am using Ubuntu 10.04 LTS and have checked the files with Totem, VLC and Mplayer). 

All the 14 mentioned files have been converted from .mp4 to .ogv using ffmpeg2theora on a toolserver run by Wikimedia Poland; here is what the ffmpeg -i command shows for the two corrupted files as well as for the good ones:

    Stream #0.0: Invalid Codec type -1
    Stream #0.1: Video: theora, yuv420p, 1920x1080, 30 tbr, 30 tbn, 30 tbc
    Stream #0.2: Audio: vorbis, 44100 Hz, stereo, s16, 499 kb/s

Any ideas what might have gone wrong?
Thanks,
-- Tomasz (odder)
Comment 1 Brion Vibber 2011-10-20 20:41:28 UTC
If I load http://upload.wikimedia.org/wikipedia/commons/6/62/Jan_Wr%C3%B3bel_-_Koniec_%C5%9Bwiata_nast%C4%85pi_%28wideo%29.ogv directly into Firefox 7.0.2, it starts downloading and playing the file just fine...

On the file page though MW shows:

"Jan_Wróbel_-_Koniec_świata_nastąpi_(wideo).ogv‎ (Invalid ogg file: Cannot decode Ogg file: Invalid page at offset 1088569774) "

which is a bit suspicious.

The inline player for me is firing up an <audio> rather than a <video>, probably since it doesn't have width/height info... looks kinda like it is loading the file, but it doesn't show any video since it's an <audio>.

Possibly there is something about the file that's legit breaking our ogg metadata reading, or there's a bug in our ogg metadata reading libraries.
Comment 2 Tomasz W. Kozlowski 2011-10-20 20:55:32 UTC
As mentioned above, I am using the very same Ubuntu 10.04 LTS, except that on WMF it's the server edition. ffmpeg -i shows the dimensions very clearly:

    Stream #0.1: Video: theora, yuv420p, 1920x1080, 30 tbr, 30 tbn, 30 tbc

and MW works just fine even for files bigger than the broken ones, for instance:
* <http://commons.wikimedia.org/wiki/File:Jaros%C5%82aw_Lipszyc_-_Nie_pytaj,_co_Pa%C5%84stwo_mo%C5%BCe_zrobi%C4%87_dla_Wikipedii_%28wideo%29.ogv> 

which is 2.17 GiB in size, it shows:
"(Ogg multiplexed audio/video file, Theora/Vorbis, length 59m 19s, 1,920×1,080 pixels, 5.2Mbps overall)"

I am not very knowledgeable about video files and especially about the MW inline player, but am ready to help as much as I can--including converting the files again.
Comment 3 Sam Reed (reedy) 2011-10-21 01:05:15 UTC
(In reply to comment #0)
> *
> <http://commons.wikimedia.org/wiki/File:Tomasz_Ganicz_-_Z_pami%C4%99tnika_weterana_%28wideo%29.ogv>
> (914.46 GiB)
> 

I hope that isn't that large ;)
Comment 4 Brion Vibber 2011-10-21 01:14:44 UTC
ffmpeg 0.7.2 on Ubuntu 11.10 gives:

    Input #0, ogg, from 'long.ogv':
      Duration: 00:25:28.26, start: 0.000000, bitrate: 5019 kb/s
        Stream #0.0: Data: skeleton
        Stream #0.1: Video: theora, yuv420p, 1920x1088, 30 fps, 30 tbr, 30 tbn, 30 tbc
        Stream #0.2: Audio: vorbis, 44100 Hz, stereo, s16, 499 kb/s
        Metadata:
          ENCODER         : ffmpeg2theora-0.25
          SOURCE_OSHASH   : 5fb6cf8e3ec983d6

So that mysterious 'invalid' stuff on stream 0.0 looks like a perfectly legit metadata skeleton.

Importing the two .ogv files into my local trunk install with OggHandler enabled... they come in, but both end up with bogus metadata reported:

    Med.ogv (Invalid ogg file: Cannot decode Ogg file: Invalid page at offset 549624967)
    Longer.ogv (Invalid ogg file: Cannot decode Ogg file: Invalid page at offset 1088569774)

These are the same offsets that are shown on the live file pages, so common code in our metadata extraction must be hitting them.

It's possible that it's a bug in our code; it's also possible that both files are in fact corrupt, but can be played back by skipping over the bad frames or such.
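
For anyone who wants to test the "corrupt file" theory independently of MediaWiki, here is a minimal Python sketch that walks an Ogg file page by page and reports the first offset where the page structure breaks. It is not the OggHandler code: the function name `first_invalid_page_offset` is made up for illustration, and it checks only the "OggS" capture pattern, the version byte and the declared page length, not the per-page CRC. So it can confirm a structural break, but it cannot prove a file is clean.

```python
import struct
from typing import BinaryIO, Optional

# Fixed 27-byte Ogg page header: capture pattern, version, header type,
# granule position, stream serial, page sequence, CRC, segment count.
OGG_HEADER = struct.Struct("<4sBBqIIIB")


def first_invalid_page_offset(stream: BinaryIO) -> Optional[int]:
    """Return the offset of the first structurally broken page, or None."""
    offset = 0
    while True:
        header = stream.read(OGG_HEADER.size)
        if not header:
            return None  # reached end of file exactly on a page boundary
        if len(header) < OGG_HEADER.size:
            return offset  # truncated header
        capture, version, _, _, _, _, _, nsegs = OGG_HEADER.unpack(header)
        if capture != b"OggS" or version != 0:
            return offset  # no valid page starts here
        lacing = stream.read(nsegs)  # segment table, one length byte per segment
        if len(lacing) < nsegs:
            return offset
        body_len = sum(lacing)
        if len(stream.read(body_len)) < body_len:
            return offset  # truncated page body
        offset += OGG_HEADER.size + nsegs + body_len
```

Usage would be along the lines of `with open("Longer.ogv", "rb") as fh: print(first_invalid_page_offset(fh))`. If this reported the same offset as the message above, that would point at the file rather than at our metadata code; a player can still resynchronise on the next "OggS" marker, which would explain why local playback works.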
Comment 5 Tomasz W. Kozlowski 2011-10-25 16:05:52 UTC
I think that the files may be corrupt, as I've converted them once again using ffmpeg2theora and the md5sums of the files uploaded to the Commons and those I had on my local computer did not match.

I've uploaded the new files to a Wikimedia Poland toolserver at <http://tools.wikimedia.pl/~odder/videos/>. Brion -- can you please check on your local trunk install if they're OK? I am unfortunately not able to check them on my local MediaWiki installation...

Please let me know if those are correct and if possible, please upload them to Commons or let somebody else do it.

Thanks!
-- Tomasz (odder)
Comment 6 Sam Reed (reedy) 2011-10-25 17:13:04 UTC
(In reply to comment #5)
> I think that the files may be corrupt, as I've converted them once again using
> ffmpeg2theora and the md5sums of the files uploaded to the Commons and those I
> had on my local computer did not match.

Unless I'm mistaken, this is right. I'd highly doubt a transcoding project would generate the same files every time. Certainly, if there was any major difference in size, I'd be somewhat concerned.
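
A checksum is still useful for a different question: whether the copy on Commons is byte-for-byte the file that was produced by the original conversion, i.e. whether the upload itself damaged it. A minimal Python sketch of that comparison follows; the helper names `md5_of_file` and `same_content` are illustrative, and the files are read in chunks so that multi-GiB videos do not have to fit in memory.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

A mismatch between two separate encodes of the same .mp4 says nothing about corruption, as noted above; a mismatch between the encoder's output and the downloaded Commons copy of that same output would.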
Comment 7 Tomasz W. Kozlowski 2011-10-25 17:22:24 UTC
Well, in fact, there is a huge difference in size between the two pairs of files.

* <http://commons.wikimedia.org/wiki/File:Jan_Wr%C3%B3bel_-_Koniec_%C5%9Bwiata_nast%C4%85pi_%28wideo%29.ogv> is 1,42 GiB in size and
* <http://tools.wikimedia.pl/~odder/videos/Jan%20Wr%C3%B3bel%20-%20Koniec%20%C5%9Bwiata%20nast%C4%85pi%20%28wideo%29.ogv> is surprisingly 2,1 GiB.

while

* <http://commons.wikimedia.org/wiki/File:Tomasz_Ganicz_-_Z_pami%C4%99tnika_weterana_%28wideo%29.ogv> is 914,46 GiB in size and
* <http://tools.wikimedia.pl/~odder/videos/Tomasz%20Ganicz%20-%20Z%20pami%C4%99tnika%20weterana%20%28wideo%29.ogv> is just 847 MiB.

I can't tell what made such a big difference, especially between the two Jan Wróbel files, as I've been using the same command:

    ffmpeg2theora --nice 19 -a 10 -v 10 --optimize

to convert the original .mp4 files to the .ogv versions.
Comment 8 Tomasz W. Kozlowski 2011-10-25 17:25:58 UTC
Whoops, I've been copying the size of the second file over and over again -- it's of course just 914,46 MiB :-))
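
The unit correction can be cross-checked against the ffmpeg output in comment 4. This is a small Python sketch that assumes 'long.ogv' there is the Tomasz Ganicz file (the thread does not say so explicitly) and that ffmpeg's "kb/s" means 1000 bits per second; `size_from_bitrate` is a hypothetical helper name.

```python
def size_from_bitrate(duration_s: float, bitrate_kbps: float) -> float:
    """Approximate file size in bytes from duration and overall bitrate."""
    return duration_s * bitrate_kbps * 1000 / 8


# Values taken from comment 4: Duration 00:25:28.26, bitrate 5019 kb/s.
duration_s = 25 * 60 + 28.26
size_bytes = size_from_bitrate(duration_s, 5019)
size_mib = size_bytes / 2**20
size_gib = size_bytes / 2**30
print(f"{size_mib:.1f} MiB, {size_gib:.2f} GiB")
```

Under those assumptions the result is roughly 914 MiB, i.e. a little under 1 GiB, which agrees with the corrected figure.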
Comment 9 Brion Vibber 2011-10-25 21:45:05 UTC
Both of those new files come up with sensible metadata and thumbnails when I import them to my local MediaWiki trunk instance.

The originals may simply have been corrupt in the first place...
Comment 10 Tomasz W. Kozlowski 2011-10-25 22:17:38 UTC
Then maybe they might be uploaded to the Commons by Reedy and we'll close the bug as RESOLVED INVALID?
Comment 11 Sam Reed (reedy) 2011-10-25 22:39:48 UTC
Re-uploaded now. I'll close it :)
