Last modified: 2014-07-25 12:19:46 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T57272, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55272 - redirectRegex throws type error
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: General
Version: compat-(1.0)
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Depends on:
Blocks:
Reported: 2013-10-05 04:54 UTC by Kunal Mehta (Legoktm)
Modified: 2014-07-25 12:19 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:54:48 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1201/
Reported by: dnessett
Created on: 2010-06-24 16:31:01
Subject: redirectRegex throws type error
Assigned to: xqt
Original description:
Running MW 1.13.2, the following command throws a type error:

$ python add_text.py -cat:Pages_with_too_many_expensive_parser_function_calls -text:" " -summary:"Test edit:Category jog for [[:Category:Pages with too many expensive parser function calls|Pages with too many expensive parser function calls]]"

The result is:

Getting [[Category:Pages with too many expensive parser function calls]]...
Loading 2009 White House Forum on Health Reform/Related Articles...
Do you want to accept these changes? ([y]es, [N]o, [a]ll) a
Updating page [[2009 White House Forum on Health Reform/Related Articles]] via API
Loading 2010 United Kingdom general election/Related Articles...
Traceback (most recent call last):
  File "add_text.py", line 417, in <module>
    main()
  File "add_text.py", line 413, in main
    create=talkPage)
  File "add_text.py", line 201, in add_text
    text = page.get()
  File "/usr/local/src/python/pywikipedia/local_sites/wikipedia.py", line 619, in get
    self._contents = self._getEditPage(get_redirect = get_redirect, throttle = throttle, sysop = sysop)
  File "/usr/local/src/python/pywikipedia/local_sites/wikipedia.py", line 727, in _getEditPage
    m = self.site().redirectRegex().match(pagetext)
  File "/usr/local/src/python/pywikipedia/local_sites/wikipedia.py", line 6644, in redirectRegex
    pattern = r'(?:' + '|'.join(keywords) + ')'
TypeError

version.py output is:

$ python version.py
Pywikipedia [http] trunk/pywikipedia (r8311, 2010/06/22, 13:20:10)
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)]
config-settings:
use_api = True
use_api_login = True

This error occurs due to the following bug in the code. At line 6642 is the following code fragment:

try:
    keywords = self.getmagicwords('redirect')
    pattern = r'(?:' + '|'.join(keywords) + ')'
except KeyError:
    # no localized keyword for redirects
    pattern = r'#%s' % default

getmagicwords is a one-line method that simply calls siteinfo (line 5480) with the key 'magicwords'. At line 5518, siteinfo calls getData to obtain the site data. When looking for magic words, the method executes "for entry in data[key]" at line 5527. Certain versions of MW do not return magic words as part of the site data, so data[key] returns a null result. Eventually, this leads to the KeyError exception at line 5538.

The bug arises because siteinfo catches the KeyError exception and returns None. By the time the call unwinds back to line 6643, the except clause for KeyError at line 6645 can never fire, because siteinfo has already caught the KeyError.

Consequently, the statement at line 6644 executes. This raises a TypeError because the argument passed to .join() is None.
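
The failure mode described in this analysis can be reproduced in isolation. The sketch below is a simplified model, not the pywikipedia source: the function names mirror the ones in the report, but their bodies, the `data` dictionary, and the `default` value are assumptions made for illustration.

```python
# Simplified, hypothetical stand-ins for the methods named in the report.
# They only model the control flow; they are not the real wikipedia.py code.

def siteinfo(data, key):
    """Look up a key, swallowing KeyError and returning None instead."""
    try:
        return data[key]
    except KeyError:
        return None


def getmagicwords(data, word):
    """One-line wrapper, as in the report's description."""
    return siteinfo(data, word)


def redirect_pattern(data, default='REDIRECT'):
    try:
        keywords = getmagicwords(data, 'redirect')
        # keywords is None here, so join() raises TypeError, not KeyError,
        # and the except clause below is never reached.
        pattern = r'(?:' + '|'.join(keywords) + ')'
    except KeyError:
        # no localized keyword for redirects
        pattern = r'#%s' % default
    return pattern


try:
    redirect_pattern({})  # site data that contains no magic words
    outcome = 'no error'
except TypeError:
    outcome = 'TypeError'

print(outcome)
```

Because the inner function converts the missing key into a None return value, the outer except KeyError clause has nothing to catch, and the error surfaces one statement later as a TypeError.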
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:54:50 UTC
Thanks a lot for analyzing it and these details. I'll fix it tomorrow.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:54:51 UTC
- **assigned_to**: nobody --> xqt
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:54:53 UTC
fixed in r8329
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 04:54:55 UTC
- **status**: open --> closed
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 04:54:57 UTC
- **status**: closed --> open
Comment 6 Kunal Mehta (Legoktm) 2013-10-05 04:54:59 UTC
The bug fix in r8329 doesn't correct the problem, perhaps because I misanalyzed it. In fact, the try ... except block in siteinfo accomplishes nothing, since the KeyError occurs outside its scope. So what really happens is that the exception occurs and propagates. However, the value returned on an exception is None, so it propagates through getmagicwords to redirectRegex. For some reason I don't understand, it is not caught by the except clause there before the pattern statement executes (causing the type error).

The solution (which I have tested) is to put a try ... except block in getmagicwords and return None when a KeyError occurs. This consumes the KeyError exception and allows the change in r8333 to redirectRegex to work properly. In addition, it makes no sense to have the try ... except block in siteinfo, since neither of its two return statements can raise a KeyError.
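
A minimal sketch of this approach follows. It is not the attached patch: the nested `data['magicwords']` layout, the function signatures, and the explicit None check in the caller are assumptions, since the actual change made to redirectRegex in r8333 is not shown in this report.

```python
# Hypothetical sketch of the proposed fix; names mirror the report,
# but the bodies and the data layout are assumed for illustration.

def getmagicwords(data, word):
    """Return the aliases for a magic word, or None if the site lacks them."""
    try:
        return data['magicwords'][word]
    except KeyError:
        # Consume the KeyError here so callers only have to handle None.
        return None


def redirect_pattern(data, default='REDIRECT'):
    keywords = getmagicwords(data, 'redirect')
    if keywords is None:
        # no localized keyword for redirects
        return r'#%s' % default
    return r'(?:' + '|'.join(keywords) + ')'


# Older site data without magic words falls back to the default pattern.
fallback = redirect_pattern({})

# Site data with magic words yields an alternation of the aliases.
localized = redirect_pattern(
    {'magicwords': {'redirect': ['#REDIRECT', '#WEITERLEITUNG']}}
)
```

Handling the missing key in one place and signalling it with None keeps the caller free of exception handling, at the cost of requiring every caller to check for None before using the result.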

I will attach a patch against r8333 that fixes the problem.
Comment 7 Kunal Mehta (Legoktm) 2013-10-05 04:55:01 UTC
patch against r8333 to fix the bug
Comment 8 Amir Ladsgroup 2014-07-25 12:19:46 UTC
The patch has not been applied yet, so the bug is presumably still reproducible (I haven't checked, though). Since it's a compat bug, I'm marking it as low priority.
