Last modified: 2014-07-20 11:01:37 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T57318, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55318 - UnicodeEncodeError in weblinkchecker.py
Status: RESOLVED DUPLICATE of bug 55145
Product: Pywikibot
Classification: Unclassified
Component: weblinkchecker.py
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Depends on:
Blocks:
Reported: 2013-10-05 05:04 UTC by Kunal Mehta (Legoktm)
Modified: 2014-07-20 11:01 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 05:04:43 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/789/
Reported by: wikishizhao
Created on: 2008-09-03 14:56:27
Subject: weblinkchecker.py error
Original description:
see: 

Exception in thread 中華民國國旗 - http://law.moj.gov.tw/Scripts/Query1A.asp?no=1D0020020&K1=國旗:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "weblinkchecker.py", line 504, in run
    linkChecker = LinkChecker(self.url, HTTPignore = self.HTTPignore)
  File "weblinkchecker.py", line 302, in __init__
    self.changeUrl(url)
  File "weblinkchecker.py", line 357, in changeUrl
    self.query = unicode(urllib.quote(self.query.encode(encoding), '=&'))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 17-18: ordinal not in range(256)
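
The traceback above shows the query string being encoded with 'latin-1', which cannot represent the characters 國旗. A minimal sketch of one way to avoid the exception follows. It is written for Python 3 (the report itself is from Python 2.5), and the helper name quote_query and the choice of UTF-8 as a fallback are assumptions for illustration, not the fix that was applied in Pywikibot.

```python
from urllib.parse import quote


def quote_query(query: str, encoding: str = "latin-1") -> str:
    """Percent-encode a URL query string, keeping '=' and '&' literal.

    Hypothetical helper: tries the given encoding first and falls back
    to UTF-8 when the text cannot be represented in it.
    """
    try:
        raw = query.encode(encoding)
    except UnicodeEncodeError:
        # Assumption: UTF-8 is an acceptable fallback for this server.
        raw = query.encode("utf-8")
    # quote() accepts bytes and percent-encodes everything outside the
    # unreserved set plus the characters listed in `safe`.
    return quote(raw, safe="=&")


print(quote_query("no=1D0020020&K1=國旗"))
```

Whether the remote server actually interprets the UTF-8 bytes as intended depends on the server, so the fallback only guarantees that the link checker does not crash.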
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 05:04:45 UTC
Logged In: YES 
user_id=1853316
Originator: YES

and:

Exception while processing URL http://sat2.hp.infoseek.co.jp/taiwan/cts.jpg in page 中華電視公司
Exception in thread 中華電視公司 - http://sat2.hp.infoseek.co.jp/taiwan/cts.jpg:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "weblinkchecker.py", line 506, in run
    ok, message = linkChecker.check()
  File "weblinkchecker.py", line 437, in check
    msg = error[1]
IndexError: tuple index out of range
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 05:04:47 UTC
I am getting the same error (tuple index out of range) reported by wikishizhao. It occurs for many URLs.

Platform:

Pywikipedia [http] trunk/pywikipedia (r5880, Sep 07 2008, 21:16:02)
Python 2.4.3 (#1, May 24 2008, 13:57:05)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)]

Example:

Exception while processing URL http://core3.bsn.endeca.com/wine45/controller.jsp?N=0 in page Advanced gallery vendor blog
Exception in thread Advanced gallery vendor blog - http://core3.bsn.endeca.com/wine45/controller.jsp?N=0:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "weblinkchecker.py", line 506, in run
    ok, message = linkChecker.check()
  File "weblinkchecker.py", line 437, in check
    msg = error[1]
IndexError: tuple index out of range
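
The IndexError in both tracebacks comes from `msg = error[1]`, which assumes the caught error always carries at least two arguments. A minimal sketch of a defensive alternative follows. It is written for Python 3, where exception arguments are read through `.args`; the helper name error_message is hypothetical, and the type of `error` in weblinkchecker.py is not shown in the report, so treating it as an exception object is an assumption.

```python
def error_message(error: BaseException) -> str:
    """Return a readable message from an exception without indexing blindly.

    Hypothetical helper: prefers the second argument (the message in an
    (errno, message) pair), then the first, then the exception type name.
    """
    args = error.args
    if len(args) > 1:
        return str(args[1])
    if args:
        return str(args[0])
    return type(error).__name__


print(error_message(OSError(111, "Connection refused")))
print(error_message(OSError("timed out")))
```

With this approach an error raised with a single argument, or with none, produces a message instead of a second exception inside the checker thread.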
Comment 3 Ricordisamoa 2014-07-20 11:01:37 UTC

*** This bug has been marked as a duplicate of bug 55145 ***
