Last modified: 2011-08-08 19:12:19 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T32201, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 30201 - API requests to commons frequently return 502
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Sam Reed (reedy)
Reported: 2011-08-03 16:12 UTC by Daniel Schwen
Modified: 2011-08-08 19:12 UTC (History)
CC List: 9 users


Description Daniel Schwen 2011-08-03 16:12:50 UTC
Starting today, I've been seeing this in my bot logs:

HTTPError: 502 Bad Gateway
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'.
Maybe the server is down. Retrying in 1 minutes...

This morning there were so many errors that one of my bots (commons:user:QICbot) hit the retry limit and died mid-run. It is currently rerunning, but with errors on approximately every 5th request.
Comment 1 Daniel Schwen 2011-08-03 16:20:57 UTC
Less frequently, I also get:

HTTPError: 504 Gateway Time-out
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'.
Maybe the server is down. Retrying in 1 minutes...

Also, it seems the errors only hit me when doing a page.put(), not during a page.get().
Comment 2 Rainer Rillke @commons.wikimedia 2011-08-03 16:49:03 UTC
Using the API from JavaScript (JSON), I randomly get server error 504 and, less frequently, 502.

502 on action=edit
504 on both action=edit and action=query&prop=imageinfo|info|revisions|categories
Comment 3 Sam Reed (reedy) 2011-08-03 21:09:10 UTC
I wonder if this has any relation (or vice versa) to the slow uploads that have been noticed...
Comment 4 Saibo 2011-08-04 00:09:33 UTC
A user reported "Tried with commonist tool and I got the following message: could not upload (requirement failed: unexpected response: HTTP/1.0 502 Bad Gateway)." at http://commons.wikimedia.org/wiki/Commons:Prototype_upload_wizard_feedback#Error_unknown 

And, of course, the upload wizard also has strange upload errors (see the link).
Comment 5 Rainer Rillke @commons.wikimedia 2011-08-04 10:08:29 UTC
And he is not the only one...

http://commons.wikimedia.org/wiki/Commons:Forum#unexpected_response:_HTTP.2F1.0_502_Bad_Gateway

Next time I will try to find out whether the HTML error output contains any details.
Comment 6 Rainer Rillke @commons.wikimedia 2011-08-04 13:48:07 UTC
For the 504-error:

Request: POST http://commons.wikimedia.org/w/api.php, from 91.198.174.40 via sq34.wikimedia.org (squid/2.7.STABLE9) to ()
Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 04 Aug 2011 13:44:18 GMT

POST:
action=query&prop=imageinfo%7Cinfo%7Crevisions%7Ccategories&rvprop=timestamp%7Ccontent&intoken=&iiprop=url%7Csize&iiurlwidth=120&iiurlheight=120&titles=File%3AWikipedia.tamil.path.svg&clprop=hidden&cllimit=25&format=json

Response-Header
Server	squid/2.7.STABLE9
Date	Thu, 04 Aug 2011 13:44:18 GMT
Content-Type	text/html
Content-Length	3003
X-Squid-Error	ERR_CANNOT_FORWARD 0
X-Cache	MISS from sq34.wikimedia.org, MISS from knsq30.knams.wikimedia.org, MISS from amssq37.esams.wikimedia.org
X-Cache-Lookup	MISS from sq34.wikimedia.org:3128, MISS from knsq30.knams.wikimedia.org:3128, MISS from amssq37.esams.wikimedia.org:80
Connection	close

Request-Header
Host	commons.wikimedia.org
User-Agent	Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0
Accept	application/json, text/javascript, */*
Accept-Language	de
Accept-Encoding	gzip, deflate
Accept-Charset	ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection	keep-alive
Content-Type	application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With	XMLHttpRequest
Referer	http://commons.wikimedia.org/wiki/Commons_talk:Tools/Commonist
Content-Length	220
Cookie	centralauth_User=Rillke; centralauth_Session=xxx; commonswiki_session=xxx; dismissSiteNotice=2.38; popTz=0; vector-nav-p-tb=true
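
To make the failing 504 request easier to read, the URL-encoded POST body above can be decoded into its individual API parameters. The following is a minimal sketch using only the Python standard library; the helper name decode_api_body is hypothetical and not part of any bot framework mentioned in this report.

```python
from urllib.parse import parse_qs

# POST body copied verbatim from the 504 report above.
body = ("action=query&prop=imageinfo%7Cinfo%7Crevisions%7Ccategories"
        "&rvprop=timestamp%7Ccontent&intoken=&iiprop=url%7Csize"
        "&iiurlwidth=120&iiurlheight=120"
        "&titles=File%3AWikipedia.tamil.path.svg"
        "&clprop=hidden&cllimit=25&format=json")


def decode_api_body(raw):
    """Map each parameter name to its list of '|'-separated values.

    Hypothetical helper for illustration: percent-escapes are decoded by
    parse_qs, and empty parameters (such as intoken=) are kept.
    """
    params = parse_qs(raw, keep_blank_values=True)
    return {name: values[0].split("|") for name, values in params.items()}


decoded = decode_api_body(body)
for name, values in decoded.items():
    print(name, "=", values)
```

Decoded this way, the request is a single action=query asking for four prop modules on one file title.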
Comment 7 Rainer Rillke @commons.wikimedia 2011-08-04 15:09:39 UTC
For the 502-error:

Request: POST http://commons.wikimedia.org/w/api.php, from 77.184.171.69 via amssq32.esams.wikimedia.org (squid/2.7.STABLE9) to 91.198.174.40 (91.198.174.40)
Error: ERR_READ_ERROR, errno (104) Connection reset by peer at Thu, 04 Aug 2011 15:06:59 GMT 

action=query&list=logevents&leprop=title%7Ctype%7Ctimestamp&letype=upload&leuser=Rillke&lelimit=50&lestart=2011-06-03T15%3A02%3A31Z&format=json

Response-Header
Server	squid/2.7.STABLE9
Date	Thu, 04 Aug 2011 15:06:59 GMT
Content-Type	text/html
Content-Length	3054
X-Squid-Error	ERR_READ_ERROR 104
X-Cache	MISS from amssq32.esams.wikimedia.org
X-Cache-Lookup	MISS from amssq32.esams.wikimedia.org:80
Connection	close

Request-Header
Host	commons.wikimedia.org
User-Agent	Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0
Accept	application/json, text/javascript, */*
Accept-Language	de
Accept-Encoding	gzip, deflate
Accept-Charset	ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection	keep-alive
Content-Type	application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With	XMLHttpRequest
Referer	http://commons.wikimedia.org/wiki/User:Rillke/AjaxMassDelete.js
Content-Length	143
Cookie	centralauth_User=Rillke; centralauth_Session=xxx; commonswiki_session=xxx; dismissSiteNotice=2.38; popTz=0; vector-nav-p-tb=true
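
The X-Squid-Error and X-Cache response headers in the two reports above carry the most useful diagnostic detail: the squid error name, the errno, and the chain of caches that handled the request. A rough sketch for pulling these apart is shown here. Both function names are hypothetical, and the header layout is assumed only from the examples quoted in this bug.

```python
def parse_squid_error(value):
    """Split an X-Squid-Error value like 'ERR_READ_ERROR 104' into (name, errno)."""
    name, _, code = value.strip().partition(" ")
    return name, int(code)


def parse_x_cache(value):
    """Split an X-Cache value into (result, host) pairs, in the order listed."""
    hops = []
    for entry in value.split(","):
        result, _, host = entry.strip().partition(" from ")
        hops.append((result, host))
    return hops


# Header values copied from the 502 and 504 reports above.
print(parse_squid_error("ERR_READ_ERROR 104"))
print(parse_x_cache("MISS from sq34.wikimedia.org, "
                    "MISS from knsq30.knams.wikimedia.org, "
                    "MISS from amssq37.esams.wikimedia.org"))
```

On Linux, errno 104 is ECONNRESET, which matches the "Connection reset by peer" text in the 502 error page.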
Comment 8 Sam Reed (reedy) 2011-08-04 15:29:14 UTC
It would seem that all of you are Europe based.

Do we know if anyone who is elsewhere (i.e. would be hitting PMTPA) is experiencing the issue?
Comment 9 Neil Kandalgaonkar 2011-08-04 22:24:06 UTC
Reedy: I've tried to replicate this here several times and I can't. Hard to prove a negative though.
Comment 10 Sam Reed (reedy) 2011-08-04 22:28:14 UTC
(In reply to comment #9)
> Reedy: I've tried to replicate this here several times and I can't. Hard to
> prove a negative though.

Indeed. Watching the squid logs, there seem to be some here and there.

Some of the mentioned machines (in Tampa) have recently been upgraded.

I'm not sure whether it's a coincidence that the errors started after these upgrades; it may well be.

I have logged an RT ticket and CC'd Peter: http://rt.wikimedia.org/Ticket/Display.html?id=1263
Comment 11 Mark A. Hershberger 2011-08-05 11:11:40 UTC
Lowering priority since this has moved to Ops.
Comment 12 Sam Reed (reedy) 2011-08-05 12:06:35 UTC
Ops have made a couple of changes in the last few hours.

Can anyone who has been able to reproduce this, test again and find out if it's still happening?

Thanks!
Comment 13 Ra Boe 2011-08-05 14:15:06 UTC
Yes, it is still happening!
Comment 14 Ra Boe 2011-08-05 14:46:35 UTC
Uploading one photo works, but more than one does not.
Comment 15 Daniel Schwen 2011-08-05 14:51:26 UTC
Yep, I can confirm that my bots are still getting plenty of 502s on page.put() in pywikipediabot. Doesn't look like anything has changed yet.
Comment 16 Rainer Rillke @commons.wikimedia 2011-08-05 15:05:24 UTC
Yup, using the API with JavaScript returns 502 and 504 errors on roughly every 5th request.
Comment 17 Sam Reed (reedy) 2011-08-05 15:57:09 UTC
How about now? (Unfortunately we're having trouble reproducing it, so asking is the simplest way.) It seems one of the API Apache application servers was badly out of sync and is in the process of being fixed, but it won't receive traffic at the moment.
Comment 18 Daniel Schwen 2011-08-05 16:08:38 UTC
Do none of the admins have a Toolserver account?! Or any account on a European computer?
Looks better now. I'll keep testing.
Comment 19 Sam Reed (reedy) 2011-08-05 16:09:14 UTC
(In reply to comment #18)
> Does none of the admins have toolserver account?! Or any account on a european
> computer? 
> Looks better now. I'll keep testing.

We have both. I tried numerous requests yesterday with AWB, and encountered no errors.
Comment 20 Daniel Schwen 2011-08-05 16:12:44 UTC
Nope, sorry. 502 is back. The test before must have been a fluke (it was a short bot run that went through without errors).
Comment 21 Daniel Schwen 2011-08-05 16:18:41 UTC
The frequency of 502 errors is down to about 1 in 10 requests failing. Looks
like something is improving after all. 
Can anyone else confirm this?
Comment 22 prolineserver 2011-08-05 16:30:50 UTC
No, still no improvement for me.
Comment 23 Neil Kandalgaonkar 2011-08-05 17:00:06 UTC
Side note, I have changed the following messages or pages on Commons to point to Commons:Upload until the issues are resolved.

MediaWiki:Sitenotice
Commons:Upload - commented out {{UploadWizard}} template
MediaWiki:Upload-url
MediaWiki:Upload-url/en
Comment 24 Rainer Rillke @commons.wikimedia 2011-08-05 19:13:07 UTC
Still not better: two new entries (bug reports) at

http://commons.wikimedia.org/wiki/MediaWiki_talk:AjaxQuickDelete.js
Comment 25 Asher Feldman 2011-08-05 21:06:11 UTC
I was able to track down a few examples in the kennisnet udplog stream.  

knsq23.knams.wikimedia.org 3654410 2011-08-05T21:00:08.774 6676 213.221.6.148 TCP_MISS/502 3330 POST http://uk.wikipedia.org/w/api.php CARP/91.198.174.40 text/html - - PythonWikipediaBot/1.0

knsq23.knams.wikimedia.org 3272304 2011-08-05T20:43:08.138 4681 217.187.129.179 TCP_MISS/502 3432 POST http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=submit CARP/91.198.174.40 text/html http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=edit&section=56 - Mozilla/5.0%20(Windows%20NT%206.1;%20WOW64;%20rv:5.0)%20Gecko/20100101%20Firefox/5.0

All POSTs, all getting hashed via CARP to the backend squid on knsq30. A drive is failing on knsq30 (91.198.174.40) and there are possibly other problems: load is many times higher than on all other squids. I removed it from the frontend.conf and, since deploying, have not seen any more 502s.
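
The pattern described above (failing requests all routed to the same CARP peer) can be spotted by tallying errors per backend. The following is a minimal sketch, assuming the whitespace-separated field layout seen in the two sample log lines in this comment, with the squid result/status in the sixth field and the peer in the tenth. The function name count_errors_by_peer is hypothetical.

```python
from collections import Counter


def count_errors_by_peer(lines, status="502"):
    """Count log lines with the given HTTP status, grouped by upstream peer.

    Field positions are an assumption taken from the sample lines in this
    comment: fields[5] is e.g. 'TCP_MISS/502', fields[9] is e.g.
    'CARP/91.198.174.40'.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        _result, _, code = fields[5].partition("/")
        if code == status:
            counts[fields[9]] += 1
    return counts


# The two log lines quoted in this comment.
sample = [
    "knsq23.knams.wikimedia.org 3654410 2011-08-05T21:00:08.774 6676 "
    "213.221.6.148 TCP_MISS/502 3330 POST http://uk.wikipedia.org/w/api.php "
    "CARP/91.198.174.40 text/html - - PythonWikipediaBot/1.0",
    "knsq23.knams.wikimedia.org 3272304 2011-08-05T20:43:08.138 4681 "
    "217.187.129.179 TCP_MISS/502 3432 POST "
    "http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=submit "
    "CARP/91.198.174.40 text/html "
    "http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=edit&section=56 "
    "- Mozilla/5.0%20(Windows%20NT%206.1;%20WOW64;%20rv:5.0)%20Gecko/20100101%20Firefox/5.0",
]
errors_by_peer = count_errors_by_peer(sample)
print(errors_by_peer.most_common())
```

A single peer dominating the tally, as 91.198.174.40 does in these samples, points at one backend rather than at the API servers in general.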
Comment 26 prolineserver 2011-08-06 05:25:16 UTC
Now I don't get the 502 anymore, but instead timeouts:

Uploading file to commons:commons via API....
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or
 your connection is down. Retrying in 1 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or
 your connection is down. Retrying in 2 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or
 your connection is down. Retrying in 4 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or
 your connection is down. Retrying in 8 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or
 your connection is down. Retrying in 16 minutes...
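
The log above shows the wait doubling after every failure (1, 2, 4, 8, 16 minutes). A rough sketch of that retry pattern with the Python standard library is given here. It is not pywikipediabot's actual implementation: the names backoff_delays and fetch_with_retry, the set of retryable status codes, and the injectable opener and sleep parameters are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request

# Gateway errors reported in this bug; treated as retryable (an assumption).
RETRYABLE = {502, 504}


def backoff_delays(first=60, factor=2, attempts=5):
    """Return waits in seconds that double each time: 60, 120, 240, ..."""
    return [first * factor ** i for i in range(attempts)]


def fetch_with_retry(url, data=None, delays=None,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Try once immediately, then once more after each delay.

    Re-raises the last error when all attempts fail. HTTP errors outside
    RETRYABLE are raised immediately.
    """
    delays = backoff_delays() if delays is None else delays
    last_error = None
    for delay in [0] + list(delays):
        if delay:
            sleep(delay)
        try:
            with opener(url, data) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code not in RETRYABLE:
                raise
            last_error = err
        except (urllib.error.URLError, TimeoutError) as err:
            # Covers connection failures and timeouts such as
            # '<urlopen error timed out>' in the log above.
            last_error = err
    raise last_error
```

One caveat: a POST that fails with a 502 or a timeout may still have been processed by the server, so blindly retrying an edit or upload can produce duplicates. Checking the page or file state before retrying is safer.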
Comment 27 Neil Kandalgaonkar 2011-08-08 19:12:19 UTC
Made Special:UploadWizard the default uploader again on Commons.

(In reply to comment #23)
> Side note, I have changed the following messages or pages on Commons to point
> to Commons:Upload until the issues are resolved.
> 
> MediaWiki:Sitenotice
> Commons:Upload - commented out {{UploadWizard}} template
> MediaWiki:Upload-url
> MediaWiki:Upload-url/en

Switched these back to use UploadWizard.
