Last modified: 2014-09-18 12:50:40 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T57018, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55018 - standardize_notes.py encoding
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: Other scripts
Version: compat-(1.0)
Hardware: All All
Importance: Unprioritized enhancement
Target Milestone: ---
Assigned To: Pywikipedia bugs
Depends on:
Blocks:
Reported: 2013-10-05 04:04 UTC by Kunal Mehta (Legoktm)
Modified: 2014-09-18 12:50 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:04:43 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/327/
Reported by: n-fran
Created on: 2013-01-25 14:38:46
Subject: standardize_notes.py encoding
Original description:
If I add text containing Russian letters to the script, I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)

To avoid this error, I think these lines (or something similar) need to be added to the bot's code:

# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

After that, my bot started working. Thanks.
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:04:45 UTC
I cannot follow what you mean by "add to the script". Do you want to modify the script, or enter Russian characters on the command line?

What is the complete error you got?

Did you set transliteration_target and console_encoding in your user-config.py?

reload(sys) after import sys does not matter, since it just reloads the same module.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:04:47 UTC
Sorry, my English, particularly when it comes to technical terms, may be poor. I meant that I put Russian characters into the file standardize_notes.py. For example, I changed '\n== Notes ==\n' to '\n== Примечания ==\n' (line 987), and then this error appeared:

http://pastebin.ru/yzh2CdvX 

When I added the lines shown above to the beginning of the file, the problem disappeared. Thank you.
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:04:49 UTC
My user-config.py contains these lines:

console_encoding = 'cp1251'
transliteration_target = console_encoding

but there are still a lot of encoding problems. Thank you.
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 04:04:51 UTC
In Python 2.x there are two kinds of strings:
Byte (ASCII) strings are written like "This is an ASCII string"
Unicode strings are written like u"This is a unicode string"

Just write a u before the string in line 987 (and remove that reload/encoding stuff):
new_text = new_text + u'\n== Notes ==\n'    # set to standard name
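
A minimal sketch of why the u prefix matters, under stated assumptions: the variable names are illustrative and not taken from standardize_notes.py, and the ASCII decode that Python 2 performs implicitly when a byte string meets a unicode string is written out explicitly here so the snippet behaves the same on Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch only: names are illustrative, not from standardize_notes.py.
# The coding line above is what allows a Python 2 source file to contain
# non-ASCII literals at all (PEP 263).

# A unicode literal, as suggested above.
heading = u'\n== Примечания ==\n'

# The same text as UTF-8 bytes, which is what an unprefixed literal holds
# in a Python 2 source file saved as UTF-8.
heading_bytes = heading.encode('utf-8')

# Python 2 decodes byte strings as ASCII when they are mixed with unicode
# strings. Doing that decode explicitly raises the same kind of error.
try:
    heading_bytes.decode('ascii')
    error_message = None
except UnicodeDecodeError as err:
    error_message = str(err)

# With unicode on both sides, concatenation needs no implicit decode.
new_text = u'Article text.'
new_text = new_text + heading

print(error_message)
```

The point of the sketch is that the error comes from the byte string being decoded as ASCII, not from the console settings, so keeping every text literal as unicode avoids it without changing the interpreter's default encoding.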

But OK, this part should be localized.
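
One possible shape for such localization, as a rough sketch only: NOTES_HEADINGS and notes_heading are hypothetical names invented for this illustration and are not part of Pywikibot, and the fallback to English is an assumption.

```python
# -*- coding: utf-8 -*-
# Hypothetical names throughout: NOTES_HEADINGS and notes_heading are
# illustrations, not part of Pywikibot.

# Section titles keyed by language code, all stored as unicode.
NOTES_HEADINGS = {
    'en': u'Notes',
    'ru': u'Примечания',
}


def notes_heading(lang_code):
    """Return the section heading for lang_code, falling back to English."""
    title = NOTES_HEADINGS.get(lang_code, NOTES_HEADINGS['en'])
    return u'\n== %s ==\n' % title


new_text = u'Article text.'
new_text = new_text + notes_heading('ru')
```

With a table like this, nobody would need to edit the string literal in the script itself to get a Russian heading.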
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 04:04:52 UTC
- **priority**: 5 --> 3
