Last modified: 2011-08-04 22:21:36 UTC
There's a small error in line '/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2', in file http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/parser/Parser.php?view=markup $2 (or \2 respectively) is not defined here, because zero-width look-ahead assertions (?=...) are non-capturing patterns. I suggest one of the following solutions '/(.) (\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2', '/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 ', '/(?<=.) (?=\\?|:|;|!|%|\\302\\273)/' => ' ', I don't know which of those is the best. I guess the first one is the slowest. Btw. for uniformity you could modify the lines '/(\\302\\253) /' => '\\1 ', '/ (!\s*important)/' => ' \\1', #Beware of CSS magic word !important, bug #11874. analogously.
1. Oh, &nbsp; is parsed here in bugzilla..., sorry I did not know that. 2. and additional to that you could do a little speed-up by grouping the singe-char alternatives by char classes. So the possible solutions would be '/(.) ([?:;!%]|\\302\\273)/' => '\\1&nbsp;\\2', or '/(.) (?=[?:;!%]|\\302\\273)/' => '\\1&nbsp;', or '/(?<=.) (?=[?:;!%]|\\302\\273)/' => '&nbsp;',
Oops, sorry again, is not parsed, it was just a copy&paste error. Third try: A solution would be '/(.) ([?:;!%]|\\302\\273)/' => '\\1 \\2', or '/(.) (?=[?:;!%]|\\302\\273)/' => '\\1 ', or '/(?<=.) (?=[?:;!%]|\\302\\273)/' => ' ',
Created attachment 8886 [details] simple patch that applies the suggested change Attached patch is incredibly simple and removes the redundant \\2. I checked the rest of Parser.php, there are no other incidences of this bug. This changes no behaviour whatsoever. I tested it anyway and all parsertests still pass.
Fixed in r93925.