Last modified: 2013-08-15 21:00:54 UTC
When a line contains a single comment, the line is correctly ignored. When a line contains two comments, it is not ignored but is interpreted as a blank line. See this page for an example that illustrates the issue: http://en.wikipedia.org/wiki/User:Patrick87/comments It's not a big problem, and there should be only a few cases where one actually writes two separate comments on a single line; however, formatting shouldn't change depending on whether there are one or two comments on the line.
Another test case:

*a
<!-- x -->
*b
<!-- x --> <!-- y --> <!-- z -->
*c

The PHP parser treats 'a' and 'b' as part of the same list, but item 'c' is treated as a completely different list. There are other examples of this sort in the parserTests. It's becoming a source of diffs between PHP and Parsoid.
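To make the expected behaviour concrete, here is a minimal Python sketch. It is not the MediaWiki preprocessor code; the function name and the regex are assumptions made for illustration. It drops any line that consists solely of one or more comments surrounded by spaces, so a line with three comments is handled the same way as a line with one.

```python
import re

# Hypothetical pattern (not taken from MediaWiki): a line made up only of
# one or more HTML comments, with nothing but spaces around and between them.
# The tempered dot keeps a single comment from swallowing a "-->" terminator.
COMMENT_ONLY_LINE = re.compile(r"^ *(?:<!--(?:(?!-->).)*--> *)+$")


def strip_comment_only_lines(wikitext: str) -> str:
    """Remove lines that contain nothing but comments and spaces."""
    kept = [
        line
        for line in wikitext.split("\n")
        if not COMMENT_ONLY_LINE.match(line)
    ]
    return "\n".join(kept)
```

With the test case above as input, `strip_comment_only_lines("*a\n<!-- x -->\n*b\n<!-- x --> <!-- y --> <!-- z -->\n*c")` returns `"*a\n*b\n*c"`, so all three items stay adjacent and can form a single list. A line such as `<!-- x --> text` is left alone because it contains more than comments.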
Change 77988 had a related patch set uploaded by Cscott: Preprocessor: Don't treat a line containing multiple comments as a blank line. https://gerrit.wikimedia.org/r/77988
Change 78248 had a related patch set uploaded by Cscott: Add '-m' option to dumpGrepper; add patterns for bug 41756. https://gerrit.wikimedia.org/r/78248
Change 78248 merged by jenkins-bot: Add '-m' option to dumpGrepper; add patterns for bug 41756. https://gerrit.wikimedia.org/r/78248
subbu notes that Parsoid accepts both tabs and spaces surrounding the comments, whereas PHP accepts only spaces. Is it worth tweaking my patch to allow PHP to accept tabs as well? I don't think it will make much, if any, difference to content, but it would be nice to converge the parsers.
I've grepped through the 20130708 enwiki dump to see how many pages this change would affect. I found only 414 affected pages in the article namespace -- I put the full list at http://en.wikipedia.org/wiki/User:Cscott/bug41756 There are an additional 1,913 pages in the File:, Wikimedia:, or Portal: namespaces which have lines with more than one space-separated comment. These appear to be mostly bot-generated and mostly harmless. I've put this list on the above page as well.
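For anyone who wants to run a similar check on a single page's wikitext, a rough Python sketch follows. This is not dumpGrepper and does not reproduce the patterns added in change 78248; the function name and the regex are assumptions chosen for illustration. It reports lines that consist only of two or more comments separated by spaces.

```python
import re

# Hypothetical patterns, not the ones used by dumpGrepper.
# COMMENT matches one HTML comment without running past its "-->".
COMMENT = r"<!--(?:(?!-->).)*-->"
# A line holding at least two comments, separated by one or more spaces,
# with only optional spaces before the first and after the last.
MULTI_COMMENT_LINE = re.compile(rf"^ *{COMMENT}(?: +{COMMENT})+ *$")


def find_multi_comment_lines(wikitext: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for multi-comment-only lines."""
    return [
        (number, line)
        for number, line in enumerate(wikitext.split("\n"), start=1)
        if MULTI_COMMENT_LINE.match(line)
    ]
```

Lines with a single comment are deliberately not reported, since those were already ignored before the fix and are unaffected by the change.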
Change 77988 merged by jenkins-bot: Preprocessor: Don't treat a line containing multiple comments as a blank line. https://gerrit.wikimedia.org/r/77988
Verified fixed in beta and test.