Last modified: 2014-09-18 12:50:40 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/327/

Reported by: n-fran

Created on: 2013-01-25 14:38:46

Subject: standardize_notes.py encoding

Original description: If I add text containing Russian letters to the script, I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range\(128\)

To avoid this error, I think these lines, or something like them, need to be added to the bot's code:

\# -\*- coding: utf-8 -\*-

import sys

reload\(sys\)

sys.setdefaultencoding\('utf-8'\)

After that, my bot started working. Thanks.
I cannot follow what you mean by "add to the script". Do you want to modify the script, or enter Russian characters on the command line? What is the complete error you got? Did you set transliteration\_target and console\_encoding in your user-config.py?

reload\(sys\) after import sys does not matter since it just reloads the same module.
Sorry, my English, particularly when it comes to technical terms, may be poor. I meant that I was putting Russian characters into the file standardize\_notes.py. For example, I changed '\n== Notes ==\n' to '\n== Примечания ==\n' \(line 987\), and then this error appeared: http://pastebin.ru/yzh2CdvX When I added the lines mentioned above at the beginning of the file, the problem disappeared. Thank you.
My user-config.py contains the lines console\_encoding = 'cp1251' and transliteration\_target = console\_encoding, but there are still a lot of encoding problems. Thank you.
In Python 2.x there are two kinds of strings: ASCII strings are written like "This is an ASCII string", and unicode strings are written like u"This is a unicode string". Just write a u before that string in line 987 \(and remove that reload/encoding stuff\):

new\_text = new\_text + u'\n== Notes ==\n' \# set to standard name

But OK, this part should be localized.
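To illustrate the suggestion above, here is a minimal sketch. It is not taken from standardize\_notes.py: the names `page_text`, `heading_bytes` and `implicit_concat` are made up for the example. It assumes the script file is saved as UTF-8, and it spells out as an explicit ASCII decode what Python 2 does implicitly when a plain string is added to a unicode string, so it also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only; names below are hypothetical, not from the bot.

# The bytes a plain (non-u) Python 2 literal would hold in a UTF-8 source file.
heading_bytes = u'\n== Примечания ==\n'.encode('utf-8')

# Stand-in for page text, which the bot handles as a unicode string.
page_text = u'Текст статьи'


def implicit_concat(text, raw):
    """Mimic Python 2's `unicode + str`: the byte string is decoded as ASCII first."""
    return text + raw.decode('ascii')


try:
    implicit_concat(page_text, heading_bytes)
    failed = False
except UnicodeDecodeError:
    # Same class of error as in the report: a non-ASCII byte cannot be decoded.
    failed = True

# The suggested fix: make the heading a unicode literal, so no decode is needed.
heading = u'\n== Примечания ==\n'
new_text = page_text + heading
```

With the u prefix, both operands are already unicode, so the concatenation does not depend on the default encoding and the reload/setdefaultencoding lines are not needed for it.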
- **priority**: 5 --> 3