Last modified: 2014-09-23 23:53:20 UTC
Created attachment 5453 [details]
list-authors patch

When generating current dumps, dumpBackup should provide an option to include the authors of each page. The above patch adds this by doing the following:
*Add a --list-authors option as an interface to set WikiExporter::list_authors
*Remove the double call to outputPageStream() when $list_authors was requested (it would always fail with an invalid fetchObject call on a non-object).
*Remove the restriction so $list_authors can be used on allPages()
*Fix the corresponding SQL query in do_list_authors() accordingly
*Prefix $fname with the class name in do_list_authors
*Prettify the <contributors> tree output (indentation, new lines) and use the friendlier writeContributor()
*XmlDumpWriter::writeContributor is now static

PS: Is it safe to CamelCase $list_authors and do_list_authors(), or are they expected to be used from outside and so should be kept as they are for backwards compatibility?
Created attachment 5467 [details]
Improved list-authors patch

Improves the previous patch with some class refactoring.
*Removed the evil WikiExporter member $author_list.
*Changed the do_list_authors parameters, so it is also CamelCased now
*Moved the XML generation to XmlDumpWriter
*Documented writeUpload()
*Updated $fname of XmlDumpWriter::writeLogItem()
*Added a RELEASE-NOTES entry
Created attachment 5468 [details]
Improved list-authors patch 2

Bugfix: The result set must be freed before doing the listAuthors query.
Is this query efficient or safe to do? Tomasz, can you take a peek and see if this is something that would make sense for future dumps? Is it desirable, are there any issues with generation, etc.?
Also note Bug 18169: $wgExportAllowListContributors and related unused code should be removed
Adding Ariel to the cc list in hopes of getting review for Ángel González's patch and perhaps some guidance for any necessary revisions.
I doubt we would use this option on the dumps we generate; we already provide full history dumps, which include the names of all contributors to each page. And the query is going to be pretty expensive for allPages on our larger wikis. But that's not to say other sites wouldn't find it useful.

A quick grep of extensions and core in trunk shows that do_list_authors and list_authors are only used in Export.php and Special:Export.php, not in any extensions, so camel-case to your heart's content.

While you're in there... Note that we sometimes have a revision in there with revision id = 0 for a valid user name (because sometimes errors creep in). That means we could see a user name in the list twice; it would be nice to fix that.

The patch needs to be updated for trunk, and then I'd like to look at it again.
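
To illustrate the duplicate-name issue mentioned above, here is a minimal sketch in Python (not the patch's PHP, and not MediaWiki code). It assumes the contributor rows arrive as (id, user name) pairs, where an id of 0 marks the erroneous entry. The row shape, the function name dedupe_contributors and the sample data are all hypothetical. Rows that share a user name are collapsed, and a non-zero id is preferred over 0.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row shape: (id, user name). An id of 0 stands for the
# erroneous entry that can accompany an otherwise valid user name.
Contributor = Tuple[int, str]


def dedupe_contributors(rows: Iterable[Contributor]) -> List[Contributor]:
    """Collapse rows that share a user name, preferring a non-zero id."""
    best: Dict[str, int] = {}
    for row_id, name in rows:
        # Keep the first id seen for a name, but let a non-zero id replace a 0.
        if name not in best or (best[name] == 0 and row_id != 0):
            best[name] = row_id
    # dicts preserve insertion order, so names stay in first-seen order.
    return [(row_id, name) for name, row_id in best.items()]


rows = [(0, "Alice"), (42, "Alice"), (7, "Bob"), (0, "192.0.2.1")]
print(dedupe_contributors(rows))
```

In the real exporter the same effect might be cheaper to get in the SQL query itself (for example by grouping on the user name), but whether that is practical depends on the query the patch ends up using.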