Last modified: 2013-11-23 00:39:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T59359, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57359 - Systematic 504 Gateway Time-out on commons with categories containing big TIFF pictures
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Media storage
Version: wmf-deployment
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-11-21 17:41 UTC by Kelson [Emmanuel Engelhart]
Modified: 2013-11-23 00:39 UTC (History)

Description Kelson [Emmanuel Engelhart] 2013-11-21 17:41:06 UTC
For example with this category:
https://commons.wikimedia.org/wiki/Category:Media_contributed_by_Zentralbibliothek_Z%C3%BCrich_%28original_picture%29
Comment 1 Kelson [Emmanuel Engelhart] 2013-11-21 17:42:31 UTC
This was not the case a few weeks ago, so something has changed for the worse in the web server/proxy configuration or in the MW code base.
Comment 2 Andre Klapper 2013-11-21 21:00:20 UTC
Thanks for taking the time to report this!

Confirming.
Comment 4 Nemo 2013-11-21 21:45:43 UTC
Not even {{filepath}} works with that file, what's the URL to the original?
Comment 5 Bawolff (Brian Wolff) 2013-11-21 21:47:10 UTC
It has an img_metadata field of:

a:1:{s:6:"errors";a:1:{i:0;s:85:"tiffinfo command failed: '/usr/bin/tiffinfo' '/tmp/localcopy_2bbfcd346e5d-1.tif' 2>&1";}}

This would fail the isMetadataValid test for:

                if ( !isset( $metadata['TIFF_METADATA_VERSION'] ) ) {
                        return false;
                }

So presumably, PagedTiffHandler tries to re-extract the metadata on every request, which is probably hanging.
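
To make the failure mode concrete, here is a minimal Python model of the check quoted above. It is only a sketch and not PagedTiffHandler's code: the function name `is_metadata_valid`, the dict standing in for the unserialized img_metadata value, and the `healthy` sample record (including its version value) are assumptions for illustration.

```python
# Hypothetical model of the quoted version check; not the extension's code.


def is_metadata_valid(metadata):
    """Mirror the quoted check: metadata without a version key is invalid."""
    if not isinstance(metadata, dict):
        return False
    return "TIFF_METADATA_VERSION" in metadata


# The stored value for the broken file, written out as a dict.
broken = {
    "errors": [
        "tiffinfo command failed: '/usr/bin/tiffinfo' "
        "'/tmp/localcopy_2bbfcd346e5d-1.tif' 2>&1"
    ]
}

# Assumed shape of a healthy record; the version value is made up.
healthy = {"TIFF_METADATA_VERSION": "example", "page_count": 1}

print(is_metadata_valid(broken))   # False
print(is_metadata_valid(healthy))  # True
```

Because the stored error record never carries the version key, a check of this shape reports it as invalid on every request, which is consistent with the repeated re-extraction suspected above.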
Comment 6 Bawolff (Brian Wolff) 2013-11-21 21:50:00 UTC
(In reply to comment #4)
> Not even {{filepath}} works with that file, what's the URL to the original?

bawolff@Bawolff-L:/var/www/w/extensions/PagedTiffHandler$ echo -n Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif | md5sum
fa1ecb93ed05e8902d3b69a97d726207 

So that would make:

https://upload.wikimedia.org/wikipedia/commons/f/fa/Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif
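
The same derivation can be sketched in Python: hash the UTF-8 file name with MD5, then use the first hex digit and the first two hex digits as directory names. The helper name `upload_url` is hypothetical, the base URL is copied from the link above, and replacing spaces with underscores is an assumption about how names are stored.

```python
import hashlib
from urllib.parse import quote

# Base taken from the URL above.
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"


def upload_url(filename, base=UPLOAD_BASE):
    """Build <base>/<h[0]>/<h[0:2]>/<name> from the MD5 of the UTF-8 name."""
    name = filename.replace(" ", "_")  # assumption: spaces are stored as underscores
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # quote() percent-encodes non-ASCII characters such as "ü".
    return f"{base}/{digest[0]}/{digest[:2]}/{quote(name)}"


# MD5("abc") is 900150983cd24fb0d6963f7d28e17f72, hence the 9/90 prefix.
print(upload_url("abc"))
# https://upload.wikimedia.org/wikipedia/commons/9/90/abc

print(upload_url(
    "Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif"
))
```

The second call should land in the same directory as the URL above only if the name is encoded byte for byte as in the shell command.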
Comment 7 Aaron Schulz 2013-11-21 22:03:43 UTC
I was looking at this lately (due to temp files filling up /tmp). Importing that file into MediaWiki hangs on my own computer too:

[01:22:17] <AaronSchulz>	 read(8, "identify: Memory allocation fail"..., 8192) = 100
[01:22:19] <AaronSchulz>	 wait4(30506, 0x7fff5e7000d4, WNOHANG|WSTOPPED, NULL) = 0
[01:22:20] <AaronSchulz>	 select(12, [8 11], [], [], NULL
[01:22:30] <AaronSchulz>	 ...and stuck
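
The trace ends in a select() call with a NULL timeout, so the parent appears to wait on the child's pipes with no upper bound. As a general illustration of bounding such a wait (not a description of how MediaWiki runs shell commands), here is a Python sketch; `run_with_timeout` is a hypothetical helper and the sleeping child is a stand-in for a hung converter.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return (returncode, output), or None if it exceeds timeout_s."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising.
        return None
    return done.returncode, done.stdout


# Stand-in for a hanging converter: a child that sleeps longer than the limit.
sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(sleeper, timeout_s=1))  # None
```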
Comment 8 Gerrit Notification Bot 2013-11-21 22:20:45 UTC
Change 96897 had a related patch set uploaded by Brian Wolff:
Do not repetitively extract metadata of broken tiff files.

https://gerrit.wikimedia.org/r/96897
Comment 9 Nemo 2013-11-21 22:34:41 UTC
FYI, https://gerrit.wikimedia.org/r/#/c/29913/ : but it would be too simple if it was just that. :)

If I issue tiffinfo on the file, I get 700 MB worth of an endless repetition of 0x81,0xff,0x81,0xff,0x81,0xff etc. I hope this is not legit even for such a crazy format as TIFF?

$ /usr/bin/time -v tiffinfo Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif | wc
TIFFReadDirectory: Warning, Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
TIFFReadDirectory: Warning, Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif: unknown field with tag 37724 (0x935c) encountered.
        Command being timed: "tiffinfo Zentralbibliothek_Zürich_-_Heinrich_Bullingers_Westerhemd_-_000012135.tif"
        User time (seconds): 16.02
        System time (seconds): 1.13
        Percent of CPU this job got: 52%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.72
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 1646320
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 1
        Minor (reclaiming a frame) page faults: 103068
        Voluntary context switches: 811
        Involuntary context switches: 172460
        Swaps: 0
        File system inputs: 274552
        File system outputs: 8
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
    250     264 699801833
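
The last line is the wc result: 250 lines, 264 words and 699801833 bytes, i.e. roughly 700 MB of output. One way to avoid buffering that much is to read the child's output in chunks and stop once a cap is exceeded. The Python sketch below assumes a hypothetical helper `run_with_output_cap` and uses a Python child process as a stand-in for tiffinfo.

```python
import subprocess
import sys


def run_with_output_cap(cmd, max_bytes, chunk_size=65536):
    """Run cmd, read stdout in chunks, and kill it once max_bytes is exceeded.

    Returns (bytes_seen, truncated).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    seen = 0
    truncated = False
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:          # EOF: the child closed its stdout
                break
            seen += len(chunk)
            if seen > max_bytes:   # too much output: stop reading and kill
                truncated = True
                proc.kill()
                break
    finally:
        proc.stdout.close()
        proc.wait()
    return seen, truncated


# Stand-in for a command that floods stdout.
noisy = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]
seen, truncated = run_with_output_cap(noisy, max_bytes=1000)
print(truncated)  # True
```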
Comment 10 Bawolff (Brian Wolff) 2013-11-21 22:39:11 UTC
(In reply to comment #8)
> Change 96897 had a related patch set uploaded by Brian Wolff:
> Do not repetitively extract metadata of broken tiff files.
> 
> https://gerrit.wikimedia.org/r/96897

This patch only stops us from getting bogged down re-extracting metadata on every request when the extraction will ultimately fail anyway (which is what we already do for most other formats). We should still figure out what's going on here, separately from this patch.
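
The idea of not re-extracting for broken files can be sketched in Python as follows. This is an illustration under assumptions, not the content of change 96897: `needs_extraction`, `MetadataCache`, `failing_extractor` and the version marker are all hypothetical names.

```python
# Hypothetical sketch of "do not re-extract for broken files".

METADATA_VERSION = "example"  # made-up version marker


def needs_extraction(metadata):
    """Return True only when re-running the extractor could help."""
    if not isinstance(metadata, dict):
        return True   # nothing stored yet
    if "errors" in metadata:
        return False  # extraction already failed once; keep the recorded failure
    return metadata.get("TIFF_METADATA_VERSION") != METADATA_VERSION


class MetadataCache:
    """Stores one metadata record per file and counts extractor runs."""

    def __init__(self, extractor):
        self.extractor = extractor
        self.store = {}
        self.extract_calls = 0

    def get(self, name):
        if needs_extraction(self.store.get(name)):
            self.extract_calls += 1
            self.store[name] = self.extractor(name)
        return self.store[name]


def failing_extractor(name):
    # Stand-in for a tiffinfo run that fails.
    return {"errors": [f"tiffinfo command failed for {name}"]}


cache = MetadataCache(failing_extractor)
for _ in range(5):  # five page views of the same broken file
    cache.get("broken.tif")
print(cache.extract_calls)  # 1
```

With the error record treated as a final answer, only the first view pays for the failed extraction; later views reuse the stored failure.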
Comment 11 Nemo 2013-11-21 22:41:52 UTC
Note, time is not particularly reliable for memory, but top reports 500+ MB VIRT and about 300 RES. So I guess it's killed for that reason?
Comment 12 Gerrit Notification Bot 2013-11-21 22:44:15 UTC
Change 96897 merged by jenkins-bot:
Do not repetitively extract metadata of broken tiff files.

https://gerrit.wikimedia.org/r/96897
Comment 13 Kelson [Emmanuel Engelhart] 2013-11-22 12:26:02 UTC
The page https://commons.wikimedia.org/wiki/Category:Media_contributed_by_Zentralbibliothek_Z%C3%BCrich_%28original_picture%29 is available again. Therefore, the most critical aspect of this bug is IMO fixed. Thank you.
Comment 14 Aaron Schulz 2013-11-23 00:39:38 UTC
With the above patch, new uploads may still hit a delay or timeouts due to a libtiff bug for some files, though views of them will be fast. Categories and pages using them will also be fast now. ?action=purge will still be slow for affected file description pages.
