Last modified: 2014-11-03 02:43:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T71384, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 69384 - extract_templates_and_params parser bugs loading w:en:Main_Page with mwparserfromhell
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: textlib.py
Version: core-(2.0)
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Keywords: upstream
Depends on:
Blocks:
Reported: 2014-08-11 05:40 UTC by John Mark Vandenberg
Modified: 2014-11-03 02:43 UTC (History)
CC List: 1 user


Description John Mark Vandenberg 2014-08-11 05:40:38 UTC
Calling extract_templates_and_params with 'use_mwparserfromhell' enabled on the English Wikipedia Main Page returns many 'results' that are not templates (parser functions, magic words with arguments, and names that still contain nested template markup).

For example:
$ PYTHONPATH="." python -c "import pywikibot; pywikibot.config.use_mwparserfromhell=True; print pywikibot.extract_templates_and_params(pywikibot.Page(pywikibot.Site('en', 'wikipedia'), 'Main Page').text)"

produces:

[(u'NUMBEROFARTICLES', {}), (u'#if:{{Main Page banner}}', {u'1': u'\n<table id="mp-banner" style="width: 100%; margin:4px 0 0 0; background:none; border-spacing: 0px;">\n<tr><td class="MainPageBG" style="padding:2px 8px; background-color:#fffaf5; border:1px solid #f2e0ce; color:#000; font-size:100%;">{{Main Page banner}}\n</td></tr>\n</table>\n'}), (u'Main Page banner', {}), (u'Main Page banner', {}), (u"#ifexpr:{{formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}|R}}>150", {u'1': u"From today's featured article", u'2': u'Featured article <span style="font-size:85%; font-weight:normal;">(Check back later for today\'s.)</span>'}), (u"formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}", {u'1': u'R'}), (u"PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"#ifexpr:{{formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}|R}}>150", {u'1': u"{{Wikipedia:Today's featured article/{{#time:F j, Y}}}}", u'2': u"{{Wikipedia:Today's featured article/{{#time:F j, Y|-1 day}}}}"}), (u"formatnum:{{PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}}}", {u'1': u'R'}), (u"PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured article/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured article/{{#time:F j, Y|-1 day}}", {}), (u'#time:F j, Y', {u'1': u'-1 day'}), (u'Did you know', {}), (u'In the news', {}), (u'Wikipedia:Selected anniversaries/{{#time:F j}}', {}), (u'#time:F j', {}), (u'#switch:{{CURRENTDAYNAME}}', {u'1': u'Monday', u'2': u'', u'Friday': u'\n<table id="mp-middle" style="width:100%; margin:4px 0 0 0; background:none; border-spacing: 0px;">\n<tr>\n<td class="MainPageBG" style="width:100%; border:1px solid #f2cedd; background:#fff5fa; vertical-align:top; color:#000;">\n<table id="mp-center" style="width:100%; vertical-align:top; background:#fff5fa; 
color:#000;">\n<tr>\n<td style="padding:2px;"><h2 id="mp-tfl-h2" style="margin:3px; background:#f2cedd; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #bfa3af; text-align:left; color:#000; padding:0.2em 0.4em">From today\'s featured list</h2></td>\n</tr><tr>\n<td style="color:#000;"><div id="mp-tfl" style="padding:2px 5px;">{{#ifexist:Wikipedia:Today\'s featured list/{{#time:F j, Y}}|{{Wikipedia:Today\'s featured list/{{#time:F j, Y}}}}|{{TFLempty}}}}</div></td>\n</tr>\n</table>\n</td>\n</tr>\n</table>'}), (u'CURRENTDAYNAME', {}), (u"#ifexist:Wikipedia:Today's featured list/{{#time:F j, Y}}", {u'1': u"{{Wikipedia:Today's featured list/{{#time:F j, Y}}}}", u'2': u'{{TFLempty}}'}), (u'#time:F j, Y', {}), (u"Wikipedia:Today's featured list/{{#time:F j, Y}}", {}), (u'#time:F j, Y', {}), (u'TFLempty', {}), (u'#ifexist:Template:POTD protected/{{#time:Y-m-d}}', {u'1': u"Today's featured picture ", u'2': u' Featured picture&ensp;<span style="font-size:85%; font-weight:normal;">(Check back later for today\'s.)</span>'}), (u'#time:Y-m-d', {}), (u'#ifexist:Template:POTD protected/{{#time:Y-m-d}}', {u'1': u'{{POTD protected/{{#time:Y-m-d}}}}', u'2': u'{{POTD protected/{{#time:Y-m-d|-1 day}}}}'}), (u'#time:Y-m-d', {}), (u'POTD protected/{{#time:Y-m-d}}', {}), (u'#time:Y-m-d', {}), (u'POTD protected/{{#time:Y-m-d|-1 day}}', {}), (u'#time:Y-m-d', {u'1': u'-1 day'}), (u'Other areas of Wikipedia', {}), (u"Wikipedia's sister projects", {}), (u'Wikipedia languages', {}), (u'Main Page interwikis', {}), (u'noexternallanglinks', {})]

Compare with the output when 'use_mwparserfromhell' is disabled:

$ PYTHONPATH="." python -c "import pywikibot; pywikibot.config.use_mwparserfromhell=False; print repr(pywikibot.extract_templates_and_params(pywikibot.Page(pywikibot.Site('en', 'wikipedia'), 'Main Page').text))"
[(u'NUMBEROFARTICLES', {}), (u'Main Page banner', {}), (u'Did you know', {}), (u'In the news', {}), (u'CURRENTDAYNAME', {}), (u'TFLempty', {}), (u'Other areas of Wikipedia', {}), (u"Wikipedia's sister projects", {}), (u'Wikipedia languages', {}), (u'Main Page interwikis', {}), (u'noexternallanglinks', {})]
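
The difference between the two outputs above can be illustrated with a small post-filter. This is only a sketch of a heuristic, not the actual pywikibot fix: the helper names `is_plausible_template_name` and `filter_template_results` are hypothetical, and the rule (drop names that start with '#' or still contain braces or pipes) is an assumption derived from comparing the two lists in this report.

```python
def is_plausible_template_name(name):
    """Heuristic (assumption): reject parser functions and unexpanded markup."""
    name = name.strip()
    if not name or name.startswith('#'):
        # '#if:', '#time:', '#switch:' and friends are parser functions.
        return False
    # Names that still contain braces or pipes are nested wikitext, not titles.
    return not any(token in name for token in ('{', '}', '|'))


def filter_template_results(results):
    """Keep only (name, params) pairs whose name passes the heuristic."""
    return [(name, params) for name, params in results
            if is_plausible_template_name(name)]


# A few entries copied from the mwparserfromhell output in this report.
sample = [
    (u'NUMBEROFARTICLES', {}),
    (u'#time:F j, Y', {}),
    (u"PAGESIZE:Wikipedia:Today's featured article/{{#time:F j, Y}}", {}),
    (u'POTD protected/{{#time:Y-m-d}}', {}),
    (u'Did you know', {}),
    (u'TFLempty', {}),
]

filtered = filter_template_results(sample)
```

Magic words without arguments, such as NUMBEROFARTICLES, still pass this filter, which matches the output shown with 'use_mwparserfromhell' disabled.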
Comment 1 John Mark Vandenberg 2014-08-11 07:21:21 UTC
Note that Page.botMayEdit() uses this method via Page.templatesWithParams() to look for {{nobots}}, and needs to catch an exception when it tries to instantiate a Link using these invalid 'template' names.

See the comment in:

https://git.wikimedia.org/blobdiff/pywikibot%2Fcore.git/7e3772cae04f95cb55b223a198fb6350f73b0639/pywikibot%2Fpage.py
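
The exception handling described in this comment could look roughly like the sketch below. It is self-contained and does not use pywikibot: `InvalidTitleError` and `make_link` are hypothetical stand-ins for the error and the Link construction mentioned above, and `has_nobots` only checks for the presence of {{nobots}}, which is simpler than what Page.botMayEdit() actually does.

```python
class InvalidTitleError(ValueError):
    """Hypothetical stand-in for the error raised for an invalid title."""


def make_link(title):
    """Hypothetical stand-in for building a Link; rejects illegal characters."""
    if any(ch in title for ch in '#<>[]{}|'):
        raise InvalidTitleError(title)
    return title.strip().replace('_', ' ')


def has_nobots(templates):
    """Return True if any (name, params) pair names the nobots template."""
    for name, _params in templates:
        try:
            link = make_link(name)
        except InvalidTitleError:
            # Parser functions and other non-template names are skipped
            # instead of aborting the whole check.
            continue
        if link.lower() == 'nobots':
            return True
    return False
```

With this pattern, an entry such as '#time:F j, Y' is ignored rather than causing the check to fail.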
