Last modified: 2014-11-05 17:44:28 UTC
Jackmcbarn discovered that abuse of the Cite extension in combination with mw.text.unstrip can allow for cross-invoke communication. The general idea is that data can be set by processing a <ref> tag with an otherwise-unused group, and then retrieved later by processing <references> for that group and parsing the HTML. Possible fixes: 1. Remove mw.text.unstrip. Disadvantage: This is something that was requested relatively frequently by the community to deal with <nowiki> tags, and removing it would likely break various modules and cause many complaints. 2. Create a blacklist or whitelist of strip tags for mw.text.unstrip, and use it to disallow "references". Disadvantage: It's a blacklist/whitelist, that has to be maintained somehow. 3. Adjust Cite to not include the HTML in the references strip tag, instead put some token that gets replaced in the ParserAfterParse hook. Disadvantage: Requires Cite to do something unusual because of Scribunto. 4. Rewrite Cite entirely like Gabriel Wicke wants (see comments on Gerrit change #99792), so it basically reparses the whole page in one of the post-parse hooks to handle <ref> and <references>. Disadvantage: It adds an extra pass to the parser (if not a whole extra parser bolted on), and probably won't interact well with other extensions. Of these, #3 seems the least bad to me. But maybe someone else has a better idea.
If I had to order your possible fixes from least bad to worst, I would order them as 2 -> 3 -> 4 -> 1. I can't see any (black|white)list from needing a lot of maintaining, so it is likely the best of the options presented. 1 is obviously the worst as it has the potential of breaking lots of modules dependent on it.
Here's my thoughts on the ways to fix it: 1. I personally am not opposed to this, but I know others are and it would lead to drama. Maybe as a compromise, nowiki could be unstripped while leaving other things alone (we already have code to do this). 2. Seems like a terrible hack. I'm totally against this. 3, 4. I'm equally okay with either of these, with an exception (see below). Overall opinion: 4 = 3 > 1 > 2 The other thing to note is that I found this bug while investigating bug #61248, and I think that method 3 or 4 could potentially solve both of these at once (and there's a chance one of them could solve bug #25417, in which case I'd definitely prefer that one, killing 3 birds with one stone).
Note that with Parsoid's implementation of Cite, neither this bug nor either of the others I linked above occur.
Change 171290 had a related patch set uploaded by Anomie: Add mw.text.unstripNoWiki, mw.text.killMarkers, fix mw.text.unstrip https://gerrit.wikimedia.org/r/171290