Last modified: 2014-03-20 13:13:35 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T63319, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 61319 - Sudden reversion to old version of page ("lastrevid" != "revid")
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: wmf-deployment
Hardware: All All
Importance: High major
Target Milestone: ---
Assigned To: Sean Pringle
Reported: 2014-02-13 16:26 UTC by Maggie Dennis
Modified: 2014-03-20 13:13 UTC (History)

Description Maggie Dennis 2014-02-13 16:26:29 UTC
Using Chrome on Windows 7, I noticed twice today that ANI on English Wikipedia was not displaying current threads, but was displaying threads three days old. Refreshing did not make a difference, although occasionally it would display the current page.

The issue evidently struck others as well, as a user who tried to add a new section wound up adding it to the old material instead of the current page: https://en.wikipedia.org/w/index.php?title=Wikipedia%3AAdministrators%27_noticeboard%2FIncidents&diff=595313330&oldid=595312917

This seems to have happened again, here:
https://en.wikipedia.org/w/index.php?title=Wikipedia%3AAdministrators%27_noticeboard%2FIncidents&diff=595314104&oldid=595313971

On the English Wikipedia admins' IRC channel, an admin noted that every time he tried to load it, the page was showing a post from 21:53, 10 February 2014 (UTC) at the very bottom.

When I communicated with an editor about the issue here - https://en.wikipedia.org/w/index.php?title=User_talk:NE_Ent&oldid=595314919 - I noticed that another editor had been impacted in the section above.
Comment 1 Andre Klapper 2014-02-13 19:10:52 UTC
(In reply to Maggie Dennis from comment #0)
> On Chrome Windows 7, I noted twice today that ANI on English Wikipedia was
> not displaying current threads, but was displaying threads three days old.
> Refreshing did not make a difference, although occasionally it would display
> the current page.

Did somebody try purging ([[WP:Purge]])?

Is this really a "reversion" in the sense of reverting changes, or maybe just an old version being delivered and displayed for some people (caching issues)?
Comment 2 Maggie Dennis 2014-02-13 21:14:18 UTC
Oh, yes. People have tried purging repeatedly. The old version is delivered and displayed erratically - sometimes I am seeing the current version and other times the old. If you look at the history, you can see that it is still happening - https://en.wikipedia.org/w/index.php?title=Wikipedia:Administrators%27_noticeboard/Incidents&action=history

There is also discussion at Village Pump/Technical, although I don't know if it will help:
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Discussions_disappearing_and_reappearing

Coren speculated earlier that it might be related to "new section", and when you look at the history of ANI there does seem to be something to that. 

However, when it happened to me just a few minutes ago (https://en.wikipedia.org/w/index.php?title=Wikipedia%3AAdministrators%27_noticeboard%2FIncidents&diff=595354506&oldid=595351438) I had intended to edit the last section only. I can't be sure I did, because I wiped out the pre-built edit summary. But I was aware that the problem might be related to this and intended to avoid it, anyway.
Comment 3 Jesús Martínez Novo (Ciencia Al Poder) 2014-02-13 21:30:17 UTC
An hour ago there was a problem with the esams cluster, similar to what happened in bug 54647 and was then tracked in bug 56545. The problem seems to be the same: the cluster fails and we get cached pages, which is better than having no pages at all.
Comment 4 Brad Jorsch 2014-02-14 01:01:45 UTC
This is not good.

API query: https://en.wikipedia.org/w/api.php?action=query&prop=info|revisions&rvlimit=1&format=jsonfm&pageids=2535910&servedby=1

 {
     "query-continue": {
         "revisions": {
             "rvcontinue": 595381322
         }
     },
     "servedby": "mw1192",
     "query": {
         "pages": {
             "2535910": {
                 "pageid": 2535910,
                 "ns": 4,
                 "title": "Wikipedia:Reference desk/Science",
                 "contentmodel": "wikitext",
                 "pagelanguage": "en",
                 "touched": "2014-02-14T00:41:03Z",
                 "lastrevid": 595381347,
                 "counter": "",
                 "length": 112194,
                 "revisions": [
                     {
                         "revid": 595381347,
                         "parentid": 595381322,
                         "minor": "",
                         "user": "SineBot",
                         "timestamp": "2014-02-14T00:41:03Z",
                         "comment": "Signing comment by [[Special:Contributions/68.41.73.11|68.41.73.11]] - \"/* Freezing point? */ new section\""
                     }
                 ]
             }
         }
     }
 }

The "lastrevid" field and the "revid" field in revisions should be the same. I suspect that some of the slave DBs are somehow screwed up and haven't gotten the page_latest field updated.
Comment 5 Brad Jorsch 2014-02-14 01:02:43 UTC
Oops, pasted the wrong copy.

 {
     "query-continue": {
         "revisions": {
             "rvcontinue": 595381322
         }
     },
     "servedby": "mw1205",
     "query": {
         "pages": {
             "2535910": {
                 "pageid": 2535910,
                 "ns": 4,
                 "title": "Wikipedia:Reference desk/Science",
                 "contentmodel": "wikitext",
                 "pagelanguage": "en",
                 "touched": "2014-02-14T00:41:03Z",
                 "lastrevid": 594888322,
                 "counter": "",
                 "length": 80791,
                 "revisions": [
                     {
                         "revid": 595381347,
                         "parentid": 595381322,
                         "minor": "",
                         "user": "SineBot",
                         "timestamp": "2014-02-14T00:41:03Z",
                         "comment": "Signing comment by [[Special:Contributions/68.41.73.11|68.41.73.11]] - \"/* Freezing point? */ new section\""
                     }
                 ]
             }
         }
     }
 }
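
The mismatch in this response can be checked mechanically. The following is a minimal Python sketch, assuming a response with the same shape as the JSON above (a "query.pages" map whose entries carry "lastrevid" and a "revisions" list requested with rvlimit=1). The function name find_revid_mismatches is hypothetical and not part of MediaWiki.

```python
import json


def find_revid_mismatches(response):
    """Return (pageid, lastrevid, revid) for pages whose two fields disagree.

    Assumes the response shape shown in this report: each page entry has
    "lastrevid" and a "revisions" list whose first item is the newest revision.
    """
    mismatches = []
    for page in response.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions") or []
        if not revisions or "lastrevid" not in page:
            continue  # nothing to compare for this page
        newest_revid = revisions[0]["revid"]
        if page["lastrevid"] != newest_revid:
            mismatches.append((page["pageid"], page["lastrevid"], newest_revid))
    return mismatches


# Trimmed copy of the response served by mw1205 above.
sample = json.loads("""
{
  "servedby": "mw1205",
  "query": {"pages": {"2535910": {
    "pageid": 2535910,
    "lastrevid": 594888322,
    "revisions": [{"revid": 595381347, "parentid": 595381322}]
  }}}
}
""")

print(find_revid_mismatches(sample))
```

For the mw1205 response this reports page 2535910 with "lastrevid" 594888322 against "revid" 595381347. A response in which both fields agree yields an empty list.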
Comment 6 Brad Jorsch 2014-02-14 02:16:02 UTC
More data:

 anomie@terbium:/usr/local/apache/common-local$ for db in 'db1055' 'db1043' 'db1037' 'db1049' 'db1051' 'db1056'; do echo $db; echo -e 'select page_latest from page where page_id=2535910;' | mwscript sql.php --wiki=enwiki --slave=$db; done
 db1055
 stdClass Object
 (
     [page_latest] => 595381347
 )
 db1043
 stdClass Object
 (
     [page_latest] => 595381347
 )
 db1037
 stdClass Object
 (
     [page_latest] => 595381347
 )
 db1049
 stdClass Object
 (
     [page_latest] => 595381347
 )
 db1051
 stdClass Object
 (
     [page_latest] => 595381347
 )
 db1056
 stdClass Object
 (
     [page_latest] => 594888322
 )

So db1056 seems out of sync somehow.
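
The comparison above can be sketched in Python as a simple majority check over the per-host values. This is an illustration only: the function name find_outlier_replicas is hypothetical, and treating the most common value as the reference is an assumption that shows which hosts disagree, not which value is correct.

```python
from collections import Counter


def find_outlier_replicas(page_latest_by_db):
    """Return replicas whose page_latest differs from the most common value.

    Majority vote is an assumption made here for illustration; it only
    identifies which hosts disagree with the rest.
    """
    if not page_latest_by_db:
        return {}
    counts = Counter(page_latest_by_db.values())
    majority_value, _ = counts.most_common(1)[0]
    return {
        db: value
        for db, value in page_latest_by_db.items()
        if value != majority_value
    }


# Values copied from the sql.php output above.
observed = {
    "db1055": 595381347,
    "db1043": 595381347,
    "db1037": 595381347,
    "db1049": 595381347,
    "db1051": 595381347,
    "db1056": 594888322,
}

print(find_outlier_replicas(observed))
```

With the values from the output above, only db1056 is returned, with page_latest 594888322.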
Comment 7 Gerrit Notification Bot 2014-02-14 02:32:31 UTC
Change 113322 had a related patch set uploaded by Springle:
depol db1056 for pt-table-sync checks bug 61319

https://gerrit.wikimedia.org/r/113322
Comment 8 Gerrit Notification Bot 2014-02-14 02:33:01 UTC
Change 113322 merged by jenkins-bot:
depol db1056 for pt-table-sync checks bug 61319

https://gerrit.wikimedia.org/r/113322
Comment 9 Sean Pringle 2014-02-14 02:41:50 UTC
db1056 has been depooled for a sync check, and the remaining slaves will get the same treatment in rotation, just in case.

db1056 was demoted from master a couple weeks ago, backed up and then eventually rebuilt from another unpooled s1 slave, db1050. It's possible the original problem lies on that box.
Comment 10 Andre Klapper 2014-02-25 15:47:14 UTC
Sean: Anything left to do / investigate here or can this be closed as FIXED?
Comment 11 Andre Klapper 2014-03-20 11:47:10 UTC
Sean: Anything left to do / investigate here or can this be closed as FIXED?
