Last modified: 2011-08-08 19:12:19 UTC
Starting today I've been seeing this in my bot logs:
HTTPError: 502 Bad Gateway
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...
This morning there were so many errors that one of my bots (commons:user:QICbot) hit the retry limit and died mid-run. It is currently rerunning, but with errors on approximately every 5th request.
Less frequently I do get:
HTTPError: 504 Gateway Time-out
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...
Also, it seems the errors only hit me when doing a page.put(), not during a page.get().
Using the API from JavaScript (JSON), I randomly get server error 504 and, less frequently, 502.
502 on action=edit
504 on both action=edit and action=query&prop=imageinfo|info|revisions|categories
I wonder if this has any relation (or vice versa) to the slow uploads that have been noticed...
A user reported "Tried with commonist tool and I got the following message: could not upload (requirement failed: unexpected response: HTTP/1.0 502 Bad Gateway)." at http://commons.wikimedia.org/wiki/Commons:Prototype_upload_wizard_feedback#Error_unknown And, of course, the upload wizard also has strange upload errors (see the link).
And he is not the only one:
http://commons.wikimedia.org/wiki/Commons:Forum#unexpected_response:_HTTP.2F1.0_502_Bad_Gateway
Next time I will try to find out whether the HTML error output contains any details.
For the 504 error:
Request: POST http://commons.wikimedia.org/w/api.php, from 91.198.174.40 via sq34.wikimedia.org (squid/2.7.STABLE9) to ()
Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 04 Aug 2011 13:44:18 GMT
POST: action=query&prop=imageinfo%7Cinfo%7Crevisions%7Ccategories&rvprop=timestamp%7Ccontent&intoken=&iiprop=url%7Csize&iiurlwidth=120&iiurlheight=120&titles=File%3AWikipedia.tamil.path.svg&clprop=hidden&cllimit=25&format=json
Response-Header
  Server: squid/2.7.STABLE9
  Date: Thu, 04 Aug 2011 13:44:18 GMT
  Content-Type: text/html
  Content-Length: 3003
  X-Squid-Error: ERR_CANNOT_FORWARD 0
  X-Cache: MISS from sq34.wikimedia.org, MISS from knsq30.knams.wikimedia.org, MISS from amssq37.esams.wikimedia.org
  X-Cache-Lookup: MISS from sq34.wikimedia.org:3128, MISS from knsq30.knams.wikimedia.org:3128, MISS from amssq37.esams.wikimedia.org:80
  Connection: close
Request-Header
  Host: commons.wikimedia.org
  User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0
  Accept: application/json, text/javascript, */*
  Accept-Language: de
  Accept-Encoding: gzip, deflate
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
  Connection: keep-alive
  Content-Type: application/x-www-form-urlencoded; charset=UTF-8
  X-Requested-With: XMLHttpRequest
  Referer: http://commons.wikimedia.org/wiki/Commons_talk:Tools/Commonist
  Content-Length: 220
  Cookie: centralauth_User=Rillke; centralauth_Session=xxx; commonswiki_session=xxx; dismissSiteNotice=2.38; popTz=0; vector-nav-p-tb=true
For the 502 error:
Request: POST http://commons.wikimedia.org/w/api.php, from 77.184.171.69 via amssq32.esams.wikimedia.org (squid/2.7.STABLE9) to 91.198.174.40 (91.198.174.40)
Error: ERR_READ_ERROR, errno (104) Connection reset by peer at Thu, 04 Aug 2011 15:06:59 GMT
POST: action=query&list=logevents&leprop=title%7Ctype%7Ctimestamp&letype=upload&leuser=Rillke&lelimit=50&lestart=2011-06-03T15%3A02%3A31Z&format=json
Response-Header
  Server: squid/2.7.STABLE9
  Date: Thu, 04 Aug 2011 15:06:59 GMT
  Content-Type: text/html
  Content-Length: 3054
  X-Squid-Error: ERR_READ_ERROR 104
  X-Cache: MISS from amssq32.esams.wikimedia.org
  X-Cache-Lookup: MISS from amssq32.esams.wikimedia.org:80
  Connection: close
Request-Header
  Host: commons.wikimedia.org
  User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0
  Accept: application/json, text/javascript, */*
  Accept-Language: de
  Accept-Encoding: gzip, deflate
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
  Connection: keep-alive
  Content-Type: application/x-www-form-urlencoded; charset=UTF-8
  X-Requested-With: XMLHttpRequest
  Referer: http://commons.wikimedia.org/wiki/User:Rillke/AjaxMassDelete.js
  Content-Length: 143
  Cookie: centralauth_User=Rillke; centralauth_Session=xxx; commonswiki_session=xxx; dismissSiteNotice=2.38; popTz=0; vector-nav-p-tb=true
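When comparing error dumps like the two above, it may help to extract which squids a response passed through from its X-Cache header. The following is only a rough sketch: the function name squid_hops is made up for illustration, and it assumes every entry has the "RESULT from host" shape seen in these two dumps.

```python
def squid_hops(x_cache):
    """Split an X-Cache header value into (result, host) pairs.

    Assumes each comma-separated entry looks like
    "MISS from sq34.wikimedia.org", as in the dumps in this report.
    Entries with any other shape are skipped.
    """
    hops = []
    for part in x_cache.split(","):
        fields = part.split()
        if len(fields) == 3 and fields[1] == "from":
            hops.append((fields[0], fields[2]))
    return hops


# X-Cache value copied from the 504 response in this report.
header_504 = (
    "MISS from sq34.wikimedia.org, "
    "MISS from knsq30.knams.wikimedia.org, "
    "MISS from amssq37.esams.wikimedia.org"
)

for result, host in squid_hops(header_504):
    print(result, host)
```

Collecting these pairs over many failed requests would show whether one host turns up in most of them.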
It would seem that all of you are Europe-based. Do we know if anyone who is elsewhere (i.e. would be hitting PMTPA) is experiencing the issue?
Reedy: I've tried to replicate this here several times and I can't. Hard to prove a negative though.
(In reply to comment #9)
> Reedy: I've tried to replicate this here several times and I can't. Hard to
> prove a negative though.
Indeed, watching the squid logs there seem to be some here and there. Some of the mentioned machines (in Tampa) have been recently upgraded. I'm not sure if it's coincidental that these upgrades have happened and the errors have started; it could quite well be, so I have logged an RT ticket and CC'd Peter:
http://rt.wikimedia.org/Ticket/Display.html?id=1263
Lowering priority since this has moved to Ops.
Ops have made a couple of changes in the last few hours. Can anyone who has been able to reproduce this, test again and find out if it's still happening? Thanks!
Yes, it is still happening!!
One photo works, but more than one does not.
Yep, I can confirm that my bots are still getting plenty of 502s on page.put() in pywikipediabot. Doesn't look like anything has changed yet.
Yup, using the API with JavaScript throws error 502 and 504 on every 5th request.
How about now? (Unfortunately we're having issues reproducing it, so asking is the simplest way.) It seems one of the API Apache application servers was very out of sync and is in the process of being fixed, but it won't be hit at the moment.
Does none of the admins have a toolserver account?! Or any account on a European computer? Looks better now. I'll keep testing.
(In reply to comment #18)
> Does none of the admins have a toolserver account?! Or any account on a European
> computer?
> Looks better now. I'll keep testing.
We have both. I tried numerous requests yesterday with AWB, and encountered no errors.
Nope, sorry. 502 is back. The test before must have been a fluke (it was a short bot run that went through without errors).
The frequency of 502 errors is down to about 1 in 10 requests failing. Looks like something is improving after all. Can anyone else confirm this?
No, still no improvement for me.
Side note, I have changed the following messages or pages on Commons to point to Commons:Upload until the issues are resolved.
MediaWiki:Sitenotice
Commons:Upload - commented out {{UploadWizard}} template
MediaWiki:Upload-url
MediaWiki:Upload-url/en
Still not better: 2 new entries (bug reports) in http://commons.wikimedia.org/wiki/MediaWiki_talk:AjaxQuickDelete.js
I was able to track down a few examples in the kennisnet udplog stream.
knsq23.knams.wikimedia.org 3654410 2011-08-05T21:00:08.774 6676 213.221.6.148 TCP_MISS/502 3330 POST http://uk.wikipedia.org/w/api.php CARP/91.198.174.40 text/html - - PythonWikipediaBot/1.0
knsq23.knams.wikimedia.org 3272304 2011-08-05T20:43:08.138 4681 217.187.129.179 TCP_MISS/502 3432 POST http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=submit CARP/91.198.174.40 text/html http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=edit&section=56 - Mozilla/5.0%20(Windows%20NT%206.1;%20WOW64;%20rv:5.0)%20Gecko/20100101%20Firefox/5.0
All POSTs, all getting hashed via CARP to the backend squid on knsq30. A drive is failing on knsq30 (91.198.174.40) and there are possibly other problems: its load is many times higher than that of all other squids. I removed it from the frontend.conf and, since deploying, have not seen any more 502s.
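For anyone who wants to repeat this kind of check, here is a rough sketch of tallying gateway errors per backend peer from log lines like the two quoted above. The field positions (status in the 6th whitespace-separated field, peer in the 10th) are only inferred from those two sample lines, and the function name gateway_errors_by_peer is made up for illustration.

```python
from collections import Counter


def gateway_errors_by_peer(lines):
    """Count 502/504 responses per backend peer.

    Field positions are an assumption taken from the two sample
    lines in this report: fields[5] is like "TCP_MISS/502" and
    fields[9] is like "CARP/91.198.174.40".
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # not a complete log line
        status = fields[5].partition("/")[2]  # "TCP_MISS/502" -> "502"
        if status in ("502", "504"):
            counts[fields[9]] += 1
    return counts


# The two log lines quoted in this report.
sample = [
    "knsq23.knams.wikimedia.org 3654410 2011-08-05T21:00:08.774 6676 "
    "213.221.6.148 TCP_MISS/502 3330 POST http://uk.wikipedia.org/w/api.php "
    "CARP/91.198.174.40 text/html - - PythonWikipediaBot/1.0",
    "knsq23.knams.wikimedia.org 3272304 2011-08-05T20:43:08.138 4681 "
    "217.187.129.179 TCP_MISS/502 3432 POST "
    "http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=submit "
    "CARP/91.198.174.40 text/html "
    "http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Xqt&action=edit&section=56 "
    "- Mozilla/5.0%20(Windows%20NT%206.1;%20WOW64;%20rv:5.0)%20Gecko/20100101%20Firefox/5.0",
]

for peer, count in gateway_errors_by_peer(sample).most_common():
    print(peer, count)
```

If one peer accounts for nearly all of the errors over a larger sample, that points at a single backend rather than at the API servers in general.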
Now I don't get the 502 anymore, but instead timeouts:
Uploading file to commons:commons via API....
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 1 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 2 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 4 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 8 minutes...
<urlopen error timed out>
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 16 minutes...
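The retry delays in that log double each time (1, 2, 4, 8, 16 minutes). For bot authors who want similar behaviour in their own scripts, a minimal sketch of such a doubling schedule follows. This is not pywikipediabot's actual code: the names backoff_minutes and call_with_retries, the 30-minute cap, and the default of 5 retries are all assumptions made for illustration.

```python
import time


def backoff_minutes(retries, first=1, factor=2, cap=30):
    """Yield one delay (in minutes) per retry, doubling up to a cap.

    The cap of 30 minutes is a hypothetical choice, not something
    taken from the log above.
    """
    delay = first
    for _ in range(retries):
        yield min(delay, cap)
        delay *= factor


def call_with_retries(request, retries=5, sleep=time.sleep):
    """Call request(); on OSError, wait and try again.

    Timeouts and URL errors from the standard library are OSError
    subclasses in Python 3. After the last retry fails, the error
    is re-raised so the caller can decide what to do.
    """
    delays = list(backoff_minutes(retries))
    for attempt in range(retries + 1):
        try:
            return request()
        except OSError:
            if attempt == retries:
                raise
            sleep(delays[attempt] * 60)


print(list(backoff_minutes(5)))
```

Passing the sleep function in as a parameter keeps the wrapper easy to try out without actually waiting for minutes.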
Made Special:UploadWizard the default uploader again on Commons (see comment #23).
(In reply to comment #23)
> Side note, I have changed the following messages or pages on Commons to point
> to Commons:Upload until the issues are resolved.
>
> MediaWiki:Sitenotice
> Commons:Upload - commented out {{UploadWizard}} template
> MediaWiki:Upload-url
> MediaWiki:Upload-url/en
Switched these back to use UploadWizard.