Last modified: 2013-04-24 14:14:37 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T47669, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45669 - GeSHi uses a highly recursive regex for number highlighting
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: SyntaxHighlight (GeSHi)
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: code-update-regression, upstream
Duplicates: 29677 39498 45953 46753 47026
Depends on:
Blocks:
Reported: 2013-03-03 18:34 UTC by Robert Rohde
Modified: 2013-04-24 14:14 UTC

Description Robert Rohde 2013-03-03 18:34:53 UTC
The page Module:Convertdata, which holds data intended to be used by Module:Convert, recently crossed 200 kB.  Since then, the Module page appears to truncate it, displaying only a small fraction of the content when viewed as a reader.

Compare:

Truncated revision is displayed:
http://en.wikipedia.org/w/index.php?title=Module:Convertdata&oldid=541459412

Much longer source for that revision:
http://en.wikipedia.org/w/index.php?title=Module:Convertdata&action=edit&oldid=541459412

Prior revision, showing full module:
http://en.wikipedia.org/w/index.php?title=Module:Convertdata&direction=prev&oldid=541459412

Is this truncation necessary?  If it is technically necessary for some reason, then I would suggest that there should at least be some message warning users about the truncation.  If it is not necessary, then it should be fixed to show the whole page.

Also, I would note that the truncation is rather strange: it isn't a straight truncation. Instead, a large block in the middle was removed, leaving parts of the beginning and the end.  Perhaps this is related to the syntax highlighting being unhappy with very large pages for some reason?
Comment 1 Brad Jorsch 2013-03-04 00:34:03 UTC
This seems to be due to Gerrit change #49985, which also explains why it suddenly screwed up upon crossing 200K.

For number highlighting, GeSHi uses a regex that includes "(?!(?:<DOT>|(?>[^\<]))+>)". If the text contains too long a run in which nothing is highlighted other than numbers, this can easily exceed the PCRE recursion limit (which is currently set very low on WMF wikis, see bug 36839 for a similar issue), causing GeSHi to lose the entire chunk.

Possible fixes include changing that regex (defined on geshi/geshi.php line 2135) to "(?!(?:<DOT>|(?>[^\<]+))+>)", which is much less likely to hit the recursion limit, or disabling number highlighting along with string highlighting.
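
The difference between the two patterns can be illustrated with a rough sketch. It assumes that the cost that matters is the number of times the outer "(?:...)+" group repeats, with each repetition taken to add one level of recursion inside PCRE. The code below is only a model in Python: the names PER_CHAR, PER_RUN and outer_repetitions are hypothetical, the atomic groups are dropped for portability, and nothing here reproduces PCRE's real internals or GeSHi's full regex.

```python
import re

# Simplified, non-atomic stand-ins for the alternatives inside the two
# lookaheads quoted above. Assumption: only the repetition count of the
# outer (?:...)+ group is modelled, not PCRE's actual matching engine.
PER_CHAR = re.compile(r"<DOT>|[^<]")   # original: one character per repetition
PER_RUN = re.compile(r"<DOT>|[^<]+")   # proposed: a whole run per repetition


def outer_repetitions(alternative, text):
    """Count how many times the outer group repeats while scanning text
    from the start, stopping where neither alternative matches."""
    pos = 0
    count = 0
    while True:
        m = alternative.match(text, pos)
        if m is None:
            break
        pos = m.end()  # every match consumes at least one character
        count += 1
    return count


if __name__ == "__main__":
    # A long run of plain numbers with a single <DOT> marker in the middle.
    sample = "1, 2, 3, " * 1000 + "<DOT>" + "4, 5, " * 1000
    print("per-character repetitions:", outer_repetitions(PER_CHAR, sample))
    print("per-run repetitions:", outer_repetitions(PER_RUN, sample))
```

In this model the per-character form repeats once for every character of the run, while the per-run form repeats once per stretch of text between markers, so its depth grows with the number of markers rather than with the length of the page.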
Comment 2 Brad Jorsch 2013-04-01 13:34:22 UTC
*** Bug 46753 has been marked as a duplicate of this bug. ***
Comment 3 Brad Jorsch 2013-04-08 23:48:07 UTC
*** Bug 47026 has been marked as a duplicate of this bug. ***
Comment 4 Brad Jorsch 2013-04-09 15:57:48 UTC
Bug filed upstream at https://sourceforge.net/p/geshi/bugs/223/
Comment 5 Gerrit Notification Bot 2013-04-09 15:59:19 UTC
Related URL: https://gerrit.wikimedia.org/r/58306 (Gerrit Change I27203c767d1d3f2f0999b1b1d8a06e8cf68c19ed)
Comment 6 Gerrit Notification Bot 2013-04-09 15:59:23 UTC
Related URL: https://gerrit.wikimedia.org/r/58306 (Gerrit Change I27203c767d1d3f2f0999b1b1d8a06e8cf68c19ed)
Comment 7 Brad Jorsch 2013-04-09 16:00:49 UTC
*** Bug 39498 has been marked as a duplicate of this bug. ***
Comment 8 Brad Jorsch 2013-04-09 16:00:59 UTC
*** Bug 45953 has been marked as a duplicate of this bug. ***
Comment 9 Michael M. 2013-04-10 08:27:59 UTC
*** Bug 29677 has been marked as a duplicate of this bug. ***
Comment 10 Gerrit Notification Bot 2013-04-24 06:16:33 UTC
https://gerrit.wikimedia.org/r/58306 (Gerrit Change I27203c767d1d3f2f0999b1b1d8a06e8cf68c19ed) | change APPROVED and MERGED [by Tim Starling]
Comment 11 Brad Jorsch 2013-04-24 14:14:37 UTC
Change merged. Note the fix should be deployed on WMF wikis with 1.22wmf3; see https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap for the schedule.
