Last modified: 2014-07-02 09:20:45 UTC
Consider the following snippet of an XML dump created using Special:Export:

    <mediawiki ...>
      ...
      <page>
        <title>Abcde</title>
        <ns>0</ns>
        <id>27</id>
        <redirect title="Fghij"/>
        <revision>
          <id>111</id>
          <timestamp>2014-05-14T10:27:10Z</timestamp>
          ...

During import, the XML is parsed in WikiImporter::handlePage(). For all tags directly inside <page> (title, ns, id, ...), the value stored in the $pageInfo array is the node content ("Abcde", "0", "27" for the tags above). However, since <redirect> is an empty tag, its value in $pageInfo is always an empty string (""). The actual information is stored in the title attribute instead.

As a result, hooks that access the $pageInfo array (e.g. ImportHandlePageXMLTag) cannot read the redirect title, since it is not correctly parsed.

I will submit a fix on Gerrit and post the link here.
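To illustrate the difference between node content and attribute values described above, here is a minimal sketch in Python. It is not the MediaWiki PHP code and not the Gerrit patch. The function `parse_page_info` and its `attribute_tags` parameter are hypothetical names, made up for this illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the <page> element from the report (revision omitted).
PAGE_XML = """<page>
  <title>Abcde</title>
  <ns>0</ns>
  <id>27</id>
  <redirect title="Fghij"/>
</page>"""


def parse_page_info(page_xml, attribute_tags=None):
    """Collect the direct children of <page> into a dict.

    attribute_tags (hypothetical) maps a tag name to the attribute that
    carries its value. Tags not listed fall back to their node content.
    """
    attribute_tags = attribute_tags or {}
    info = {}
    for child in ET.fromstring(page_xml):
        if child.tag in attribute_tags:
            # Value lives in an attribute, e.g. <redirect title="..."/>.
            info[child.tag] = child.get(attribute_tags[child.tag], "")
        else:
            # Node content; an empty element has no text, so store "".
            info[child.tag] = child.text or ""
    return info


# Reading node content only reproduces the reported behaviour.
naive = parse_page_info(PAGE_XML)
# Reading the title attribute for <redirect> recovers the target.
fixed = parse_page_info(PAGE_XML, {"redirect": "title"})
```

With node content only, `naive["redirect"]` is an empty string. Once the `title` attribute is read for `redirect`, `fixed["redirect"]` holds the redirect target, while `title`, `ns` and `id` are unchanged.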
Here's my proposed fix: https://gerrit.wikimedia.org/r/134079
Change 134079 had a related patch set uploaded by TTO: Correctly parse 'redirect' XML tag during Special:Import. https://gerrit.wikimedia.org/r/134079
Change 134079 merged by jenkins-bot: Correctly parse 'redirect' XML tag during Special:Import. https://gerrit.wikimedia.org/r/134079