Last modified: 2014-07-25 15:01:05 UTC
I've been seeing this in recent overnight runs of the browser tests. The VisualEditor build fails with a modal dialog that says "Error loading data from server: readonly. The wiki is currently in read-only mode. Would you like to retry?" Here is an example from the overnight run of Sunday 18 May: https://wmf.ci.cloudbees.com/job/VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome/512/testReport/(root)/VisualEditor/Edit_with_strings__outline_example_____Editing_with_%C3%84%C3%8B%C3%8F%C3%96%C3%9C___Editing_with_%C3%84%C3%8B%C3%8F%C3%96%C3%9C___/ I can't think of any reason why beta labs would be in read-only mode late on a Sunday (PDT). I suspect this may also be the cause of the occasional failures in other builds that report less information, for example the "too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout)" error that we see in the MobileFrontend builds: https://wmf.ci.cloudbees.com/job/MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox/571/testReport/(root)/Check%20UI%20components/Check_existence_of_important_UI_components_on_other_pages_/
I think this is the first time I've seen the wiki read-only during the day (PDT): https://wmf.ci.cloudbees.com/job/VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox/4/testReport/(root)/Switching%20between%20wikitext%20and%20Visual%20Editor%20modes/Switch_editing_modes_via_toolbar/
If I recall correctly, this is something that can happen when things go sideways with the database. Not sure if that's what's going on here, but may be worth looking into.
Who could investigate this?
On one Sauce Labs failure, the test was POSTing to "http://en.wikipedia.beta.wmflabs.org/wiki/User:Selenium_user/firefox?vehidebetadialog=true&veaction=edit" and got the message: "The wiki is currently in read-only mode. Would you like to retry?" That message comes from ApiBase::dieReadOnly(). That method seems to be called only when wfReadOnly() is true, which is legacy code that lets us create a file on the cluster to disable edits entirely. There is close to a 0% chance it is being triggered that way unless something messes with $wgReadOnly. So most probably the i18n message is being reused by another code path.
FWIW VisualEditor doesn't know about <readonlytext> – it's just passing on what it gets from the API.
Right, and we only know about the unexpected read-only status because VisualEditor seems to be the only thing that displays that error in a JavaScript confirm modal dialog. It might manifest in other ways that we would not notice if not for the modal dialog that stops the test.
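One way to detect the read-only state without relying on the VisualEditor dialog would be to ask the API directly. The following is only a sketch, under the assumption that the siteinfo query (action=query&meta=siteinfo&siprop=general) includes a "readonly" key and a "readonlyreason" string in its "general" section while the wiki is read-only. The function name and both sample responses are hypothetical and were not captured from beta labs.

```python
import json


def read_only_status(siteinfo_json):
    """Return (is_read_only, reason) from a siteinfo 'general' response body.

    Assumption: the API adds a "readonly" key (possibly with an empty
    string as its value) only while the wiki is read-only.
    """
    general = json.loads(siteinfo_json).get("query", {}).get("general", {})
    is_read_only = "readonly" in general
    reason = general.get("readonlyreason") if is_read_only else None
    return is_read_only, reason


# Hypothetical response bodies, for illustration only.
sample_locked = (
    '{"query": {"general": {"sitename": "Wikipedia", '
    '"readonly": "", "readonlyreason": "hypothetical reason"}}}'
)
sample_open = '{"query": {"general": {"sitename": "Wikipedia"}}}'

print(read_only_status(sample_locked))
print(read_only_status(sample_open))
```

A browser test could call something like this before editing and record the reason string, which would tell us which code path set the read-only state.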
Created attachment 15558 (Screenshot). I have reproduced this issue today on Betalabs; attaching the screenshot.
Antoine, would these messages be relevant? They do not seem to happen at any particular interval but they might be correlated to the time at which Rummana saw the problem.
@deployment-bastion:/data/project/logs$ tail -f dberror.log
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'
Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)
Tue Jun 3 17:50:48 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'
Tue Jun 3 17:50:48 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)
Tue Jun 3 19:20:48 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'
Tue Jun 3 19:20:48 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)
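To check whether these errors line up with the test failures, the log entries could be bucketed by minute and compared against the failure timestamps from the builds. This is a rough sketch that assumes every dberror.log line follows the layout shown above (weekday, month, day, time, timezone, year, host, wiki, message). The function names are made up for illustration, and the three sample lines are copied from the excerpt above.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Split one dberror.log line into (timestamp, host, wiki, message).

    Returns None for lines that do not match the assumed layout.
    """
    parts = line.split(None, 8)
    if len(parts) < 9:
        return None
    # parts: weekday, month, day, time, tz, year, host, wiki, message
    try:
        stamp = datetime.strptime(
            " ".join(parts[1:4] + [parts[5]]), "%b %d %H:%M:%S %Y"
        )
    except ValueError:
        return None
    return stamp, parts[6], parts[7], parts[8]


def errors_per_minute(lines):
    """Count parsed log entries per minute (timestamps are taken as UTC)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line.strip())
        if parsed:
            counts[parsed[0].replace(second=0)] += 1
    return counts


sample = [
    "Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)",
    "Tue Jun 3 17:17:09 UTC 2014 deployment-apache01 testwiki Error connecting to 10.68.17.94: :real_connect(): (42000/1049): Unknown database 'testwikidatawiki'",
    "Tue Jun 3 17:50:48 UTC 2014 deployment-apache01 testwiki Connection error: No working slave server: Unknown error (10.68.17.94)",
]

counts = errors_per_minute(sample)
for minute, count in sorted(counts.items()):
    print(minute.isoformat(), count)
```

Comparing the resulting minutes with the build failure times (converted to UTC) would show whether the "No working slave server" errors coincide with the read-only dialogs.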
I saw this just now also.
Adding Sean Pringle. This seems to be getting worse. I'd like to either update the db less often or else make it less disruptive.
Also see https://github.com/wikimedia/operations-mediawiki-config/commit/38990c671fd3b8d15f31a7c819e7bdd52ecef3ef
This seems to have been fixed by https://gerrit.wikimedia.org/r/#/c/149052/ Thanks Sam!