Last modified: 2014-10-28 09:43:17 UTC
Currently, ContentHandler::exportTransform may modify page content on the fly when generating XML dumps. However, we are currently not re-calculating the SHA1 hash for the revision, so the hash is inconsistent with the transformed content. The XML dump should contain sha1 checksums that are correct for the text in the dump, even if that differs from the raw contents of the database.
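To illustrate the intended behaviour, here is a rough sketch in Python (not the actual MediaWiki PHP code). The names `base36_sha1` and `export_revision` are hypothetical, and the sketch assumes the stored hash uses the same encoding as rev_sha1: a SHA1 digest written in base 36 and left-padded to 31 characters. The hash is only recomputed when the transform actually changed the text.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_sha1(text: str) -> str:
    """SHA1 of the UTF-8 text, in base 36, zero-padded to 31 characters.

    Assumption: this mirrors the encoding used for rev_sha1.
    """
    value = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = BASE36_DIGITS[rem] + out
    return out.rjust(31, "0")


def export_revision(raw_text: str, stored_sha1: str, transform):
    """Return (text, sha1) as they would be written to the dump.

    `transform` stands in for the export transform; it is a plain
    callable here, purely for illustration.
    """
    exported = transform(raw_text)
    if exported == raw_text:
        # Nothing changed, so the stored hash is still correct.
        return exported, stored_sha1
    # The text in the dump differs from the database: re-calculate.
    return exported, base36_sha1(exported)
```

With an identity transform the stored hash is passed through unchanged; with any transform that alters the text, the hash written to the dump matches the dumped text rather than the database row.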
Change 168587 had a related patch set uploaded by Daniel Kinzler: Re-caclulate SHA1 after applying exportTransform https://gerrit.wikimedia.org/r/168587
In my opinion, the sha1 should be the same as rev_sha1, which is also what the API outputs. Maybe add a new hash as an explicit checksum, but the sha1 in the output should be the same as in the API.