Last modified: 2012-04-23 18:16:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T36981, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 34981 - Rerun populateParentId from some IDs (English Wikipedia)
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on:
Blocks: 16660 29782
Reported: 2012-03-05 00:53 UTC by Jarry1250
Modified: 2012-04-23 18:16 UTC

Attachments
Screenshot of the issue (345.51 KB, image/png)
2012-03-12 01:30 UTC, Krinkle

Description Jarry1250 2012-03-05 00:53:53 UTC
Per my rather long and rambling commentary on bug #34922, the remaining NULLs in the WMF revision table are really beginning to cause problems.

It seems that the script simply didn't work properly the first time, since the NULLs peter out in the morning of 8 April 2008, the same day the script was committed to SVN (r32937).

For reasons not yet explained, the NULLs only start appearing around 01:30 UTC on 4 October 2007.

Some revisions in that range have 'rev_parent_id's, but not very many.

Therefore I would suggest that the script be run again to cover this period (give or take, revision IDs 162147000 to 204172300).
Comment 1 Jarry1250 2012-03-05 00:54:38 UTC
Adding as a blocker for bug #34922, since it seems to be required for a full fix.
Comment 2 Jarry1250 2012-03-05 00:56:37 UTC
And remove again, since as Bawolff rightly says, MediaWiki is fixed, it's Wikimedia at fault.
Comment 3 Jarry1250 2012-03-05 14:08:37 UTC
(In reply to comment #0)
> Therefore I would suggest that the script be run again to cover this period
> (give or take, revision IDs 162147000 to 204172300).

Eurgh, those are en.wp revids of course, I'll have to establish whether this appears on other wikis at a later date. I suspect it probably does.
Comment 4 Jarry1250 2012-03-06 13:57:39 UTC
(In reply to comment #3)
> 
> Eurgh, those are en.wp revids of course, I'll have to establish whether this
> appears on other wikis at a later date. I suspect it probably does.

Nope, doesn't appear to (I just tested Commons, de.wp and fr.wp). So just the English Wikipedia then.
Comment 5 Krinkle 2012-03-12 01:30:59 UTC
Created attachment 10218 [details]
Screenshot of the issue

The attached screenshot, taken from [1], shows that edits before 14:27, 2 November 2007 and after 11:53, 27 March 2008 up until today have their sizes calculated properly. The ones in between don't, and fall back to the revision's total size (no color and no +/- sign; it's not a wrongly calculated difference, it just shows the total size of the page at that time).


[1] https://en.wikipedia.org/w/index.php?title=Special:Contributions/Dpmuk&dir=prev
Comment 6 Bawolff (Brian Wolff) 2012-03-12 01:35:14 UTC
(In reply to comment #5)
> Created attachment 10218 [details]
> Screenshot of the issue
> 
> The attached screenshot taken from [1] shows that edits before 14:27, 2
> November 2007 and after 11:53, 27 March 2008 and up until today have the sizes
> calculated properly. The ones in the middle don't and fall back to a revision
> total size (no color, and no +/- sign. It's not a wrongly calculated difference
> size, it just shows the total size of the page at that time).
> 
> 
> [1]
> https://en.wikipedia.org/w/index.php?title=Special:Contributions/Dpmuk&dir=prev

Note: if you're commenting/complaining about the fallback behaviour, the fallback of just "shows the total size of the page at that time" was introduced by me in r112995 and is discussed further at bug 34922. The choice was between doing that and showing nothing at all. I'm not sure which is better.
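
For readers following along, the display rule described above can be sketched roughly as follows. This is only an illustration in Python, not the MediaWiki code from r112995; the function name `format_size` and the exact label strings are hypothetical.

```python
from typing import Optional


def format_size(rev_len: int, parent_len: Optional[int]) -> str:
    """Return the size label for one contributions entry (hypothetical helper)."""
    if parent_len is None:
        # Parent unknown (e.g. rev_parent_id is NULL): fall back to the
        # plain page size, with no sign and no colour.
        return f"({rev_len} bytes)"
    diff = rev_len - parent_len
    if diff == 0:
        return "(0)"
    # Known parent: show a signed difference such as (+1646) or (-50).
    return f"({diff:+d})"
```

With a known parent the label carries a sign; without one it is just the page size, which is why the affected rows stand out in the screenshot.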
Comment 7 Daniel Money 2012-03-13 03:38:54 UTC
No, I wasn't commenting about the fallback behaviour - that makes sense and is quite obvious, as the text isn't bold, isn't coloured and doesn't include a + or -. At the time of filing the bug, however, it was showing as a diff, i.e. the edit at 14:27, 2 November 2007 was showing (+1646) in green and bold. Presumably something fixed in bug #34922 solved that problem.
Comment 8 Jarry1250 2012-03-13 08:31:17 UTC
(In reply to comment #7)
> At the time of filing the bug however it was showing as a diff, i.e. the
> edit at 14:27, 2 November 2007 was showing (+1646) in green and bold. 
> Presumably something fixed in bug #34922 solved that problem.

Well, Bawolff put in a fix that isolated those revisions giving bad diff values, and replaced them with page-sizes, hence the resolution of that bug. Of course, we'd actually quite like them as diff values, hence this bug.
Comment 9 Sam Reed (reedy) 2012-04-17 15:47:26 UTC
I've hacked up the script to work between the start and end you suggested, also adding a condition that rev_parent_id is NULL (I think I'll put that into vcs) to further reduce the number of rows that have to be read, checked and updated.

Running in a screen session as me on fenari
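
A rough sketch of what such a range-limited run might look like, written in Python against SQLite purely for illustration. It is not the actual maintenance script: the function name `populate_parent_ids`, the batch size of 200, the rule for picking the parent (previous revision of the same page by timestamp, then by ID) and the use of 0 for a page's first revision are all assumptions made for this example.

```python
import sqlite3


def populate_parent_ids(conn: sqlite3.Connection, start: int, end: int,
                        batch: int = 200) -> int:
    """Fill NULL rev_parent_id values for rev_id in [start, end], batch by batch."""
    changed = 0
    for lo in range(start, end + 1, batch):
        hi = min(lo + batch - 1, end)
        # Only read rows in this batch that still lack a parent.
        rows = conn.execute(
            "SELECT rev_id, rev_page, rev_timestamp FROM revision "
            "WHERE rev_id BETWEEN ? AND ? AND rev_parent_id IS NULL",
            (lo, hi),
        ).fetchall()
        for rev_id, page, ts in rows:
            # Assumed rule: the parent is the latest earlier revision of the same page.
            parent = conn.execute(
                "SELECT rev_id FROM revision WHERE rev_page = ? "
                "AND (rev_timestamp < ? OR (rev_timestamp = ? AND rev_id < ?)) "
                "ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 1",
                (page, ts, ts, rev_id),
            ).fetchone()
            # Assumption: a page's first revision gets parent 0.
            conn.execute(
                "UPDATE revision SET rev_parent_id = ? WHERE rev_id = ?",
                (parent[0] if parent else 0, rev_id),
            )
            changed += 1
        conn.commit()
    return changed
```

The filter is spelled `IS NULL` here because in SQL a comparison written as `= NULL` never matches any row.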
Comment 10 Jarry1250 2012-04-18 13:17:41 UTC
(In reply to comment #9)
> I've hacked up the script to work between the start and end you suggested, also
> adding in a condition of where rev_parent_id = null (I think I'll put that into
> vcs) to further reduce the number of rows read to be checked and updated
> 
> Running in a screen session as me on fenari

Any luck with this? Or does the script need fixing in some other way?
Comment 11 Sam Reed (reedy) 2012-04-18 13:20:26 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > I've hacked up the script to work between the start and end you suggested, also
> > adding in a condition of where rev_parent_id = null (I think I'll put that into
> > vcs) to further reduce the number of rows read to be checked and updated
> > 
> > Running in a screen session as me on fenari
> 
> Any luck with this? Or does the script need fixing in some other way?

...doing rev_id from 169280200 to 169280399

It's not going to be quick ;)
Comment 12 Jarry1250 2012-04-18 13:30:24 UTC
...Ah. 

So that's ~23% done then in ~22 hours. On that basis, give or take, it's going to take another 3 days, which doesn't sound unreasonable.

Good good :)
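
For anyone wanting to redo this kind of estimate, here is a minimal sketch in Python. It assumes progress is linear in revision ID across the suggested range, which may not be how the figure above was arrived at, so its result can differ from the ~23% quoted. The function name `estimate_progress` is hypothetical.

```python
def estimate_progress(start_id: int, end_id: int, current_id: int,
                      elapsed_hours: float) -> tuple[float, float]:
    """Return (fraction done, estimated hours remaining), assuming linear progress by rev_id."""
    fraction = (current_id - start_id) / (end_id - start_id)
    total_hours = elapsed_hours / fraction
    return fraction, total_hours - elapsed_hours


# Range from the description, position from comment 11, ~22 hours elapsed.
fraction, remaining = estimate_progress(162147000, 204172300, 169280200, 22.0)
print(f"{fraction:.1%} done, about {remaining / 24:.1f} days remaining")
```

Rows are not evenly spread over the ID range (only those with a NULL parent need work), so any such estimate is approximate.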
Comment 13 Sam Reed (reedy) 2012-04-19 11:55:07 UTC
...doing rev_id from 176812800 to 176812999
Comment 14 Sam Reed (reedy) 2012-04-20 10:23:28 UTC
...doing rev_id from 185014000 to 185014199
Comment 15 Jarry1250 2012-04-20 18:37:54 UTC
Another two days then, give or take.
Comment 16 Jarry1250 2012-04-22 15:42:39 UTC
Seems finished, from what I can tell?
Comment 17 Sam Reed (reedy) 2012-04-23 18:16:58 UTC
rev_parent_id population complete ... 37262590 rows [34817563 changed]

Yup, seems to be
