Last modified: 2014-11-09 19:38:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T74047, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 72047 - archivebot problems on cswiki
Status: RESOLVED FIXED
Product: Pywikibot
Classification: Unclassified
Component: Cosmetic changes
Version: core-(2.0)
Hardware: All All
Importance: Unprioritized major
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2014-10-14 18:45 UTC by JAn Dudík
Modified: 2014-11-09 19:38 UTC (History)

Description JAn Dudík 2014-10-14 18:45:42 UTC
Some errors prevent certain pages from being archived correctly.



I:\py\rewrite>pwb.py archivebot archive -lang:cs
...

1) Incorrect month name, although the string "06." appears on the page only in URLs:

Processing [[cs:Diskuse s wikipedistou:JAn Dudík]]
incorrect month name "06." in page in site wikipedia:cs
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JAn Dudík
]]
ERROR: KeyError:
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 614, in main
    archiver = PageArchiver(pg, a, salt, force)
  File "I:\py\rewrite\scripts\archivebot.py", line 383, in __init__
    self.page = DiscussionPage(page, self)
  File "I:\py\rewrite\scripts\archivebot.py", line 293, in __init__
    self.load_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 321, in load_page
    cur_thread.feed_line(line)
  File "I:\py\rewrite\scripts\archivebot.py", line 238, in feed_line
    timestamp = self.ts.timestripper(line)
  File "I:\py\rewrite\pywikibot\textlib.py", line 1321, in timestripper
    raise KeyError
KeyError


2) AttributeError: 'NoneType' object has no attribute 'group'
Processing [[cs:Diskuse s wikipedistou:JeremySil]]
19 Threads found on [[cs:Diskuse s wikipedistou:JeremySil]]
Looking for: {{archivace}} in [[cs:Diskuse s wikipedistou:JeremySil]]
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JeremySil
]]
ERROR: AttributeError: 'NoneType' object has no attribute 'group'
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
    whys = self.analyze_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 453, in analyze_page
    max_arch_size = str2size(self.get_attr('maxarchivesize'))
  File "I:\py\rewrite\scripts\archivebot.py", line 173, in str2size
    val, unit = (int(r.group(1)), r.group(2))
AttributeError: 'NoneType' object has no attribute 'group'



3) When the archive is under a different path, the bot fails:

Processing [[cs:Wikipedie:Byrokraté/Nástěnka]]
29 Threads found on [[cs:Wikipedie:Byrokraté/Nástěnka]]
Looking for: {{archivace}} in [[cs:Wikipedie:Byrokraté/Nástěnka]]
Processing 29 threads
ERROR: Error occured while processing page [[cs:Wikipedie:Byrokraté/Nástěnka]]
ERROR: ArchiveSecurityError: Archive page [[cs:Wikipedie:Byrokraté/Archiv1]] doe
s not start with page title (Wikipedie:Byrokraté/Nástěnka)!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
    whys = self.analyze_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 481, in analyze_page
    if self.feed_archive(archive, t, max_arch_size, params):
  File "I:\py\rewrite\scripts\archivebot.py", line 447, in feed_archive
    % (archive, self.page.title()))
ArchiveSecurityError: Archive page [[cs:Wikipedie:ByrokratĂ?/Archiv1]] does not
start with page title (Wikipedie:ByrokratĂ?/NástÄ?nka)!
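
The error message suggests a plain prefix comparison between the archive title and the title of the page being archived. A minimal sketch of such a check follows; the function name is hypothetical, and the real archivebot logic may include further conditions.

```python
# -*- coding: utf-8 -*-


def archive_is_under_page(archive_title, page_title):
    """Return True if the archive title begins with the page title.

    Hypothetical helper; it only mirrors the wording of the error
    message and is not taken from archivebot.py.
    """
    return archive_title.startswith(page_title)


# An archive that is a subpage of the archived page passes the check.
print(archive_is_under_page(u'Wikipedie:Pod lípou/Archiv 2014/02',
                            u'Wikipedie:Pod lípou'))

# A sibling page, as in the report above, does not.
print(archive_is_under_page(u'Wikipedie:Byrokraté/Archiv1',
                            u'Wikipedie:Byrokraté/Nástěnka'))
```

Under this assumption, [[cs:Wikipedie:Byrokraté/Archiv1]] is rejected because it is a sibling of the archived page rather than a subpage of it.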

4) Unknown interwiki prefixes c: and outreach:
Processing [[cs:Wikipedie:Nástěnka správců]]
52 Threads found on [[cs:Wikipedie:Nástěnka správců]]
Looking for: {{archivace}} in [[cs:Wikipedie:Nástěnka správců]]
Processing 52 threads
127 Threads found on [[cs:Wikipedie:Nástěnka správců/Archiv58]]
Archiving 23 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Nástěnka správců]]
ERROR: SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 516, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 985, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 993, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1040, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 157, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4153, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4069, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wikipedia:c
s, and the interwiki prefix c is not supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou (návrhy)]]
12 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (návrhy)]]
Processing 12 threads
14 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)/Archiv 2014-01]]
Archiving 2 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (návrhy)]]
ERROR: SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edit
ion/text is not a local page on wikipedia:cs, and the interwiki prefix outreach
is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 516, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 985, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 993, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1040, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 157, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4153, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4069, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edition/tex
t is not a local page on wikipedia:cs, and the interwiki prefix outreach is not
supported by PyWikiBot!
Comment 1 JAn Dudík 2014-10-14 18:55:00 UTC
d: is also an unknown interwiki prefix.
Comment 2 Mpaa 2014-10-14 19:09:31 UTC
I think you should split this into several bugs.
Comment 3 Fabian 2014-10-14 19:14:15 UTC
Hmmm that is strange. Can you test it without an apicache? For me all three prefixes are known:

>>> import pywikibot
>>> s = pywikibot.Site("cs", "wikipedia")
>>> s.interwiki("c")
Site("commons", "commons")
>>> s.interwiki("d")
Site("wikidata", "wikidata")
>>> s.interwiki("outreach")
Site("outreach", "outreach")
Comment 4 Mpaa 2014-10-14 19:59:03 UTC
Regarding point 1)

In textlib.py, timestripper(self, line) should first strip from the line everything that cannot contain a valid date (comments, links, external links, etc.).

External links in particular can be deceptive; I saw some like this on cswiki: [https://lists.wikimedia.org/pipermail/mobile-l/2014-August/007927.html]

The best way to reuse the existing textlib functions with minimal code duplication is still to be determined. Perhaps some regexes could become global constants?
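
As a rough illustration of the idea for point 1, the sketch below removes HTML comments, bracketed external links and bare URLs from a line before any date matching is attempted. The regexes and the name `strip_non_date_items` are assumptions made for this example, not existing textlib code.

```python
import re

# Hypothetical patterns; textlib may already provide better ones.
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)
BRACKETED_LINK_RE = re.compile(r'\[(?:https?:)?//[^\]\s]+[^\]]*\]')
BARE_URL_RE = re.compile(r'(?:https?:)?//\S+')


def strip_non_date_items(line):
    """Return the line without comments and external links."""
    for pattern in (COMMENT_RE, BRACKETED_LINK_RE, BARE_URL_RE):
        line = pattern.sub('', line)
    return line


line = ('See [https://lists.wikimedia.org/pipermail/mobile-l/'
        '2014-August/007927.html] <!-- 06. -->14. 10. 2014, 18:45 (UTC)')
print(strip_non_date_items(line))
```

Only the text that remains after this step would then be passed on to the date matching, so date-like fragments inside URLs can no longer be mistaken for a month or a day.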
Comment 5 Amir Ladsgroup 2014-10-14 20:27:13 UTC
Maybe I can fix points 1, 2 and 3, but the other issues are not something I can look into.
Comment 6 Mpaa 2014-10-14 20:51:18 UTC
Point 2)
I think the template is just misused: the maxarchivesize field is empty.
See https://cs.wikipedia.org/w/index.php?title=Diskuse_s_wikipedistou:JeremySil&oldid=11741042
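
The traceback in point 2 shows `.group()` being called on a failed regex match. A minimal sketch of a more defensive parser is given below, assuming a simplified size syntax; `parse_size`, its regex and the unit letters are hypothetical stand-ins and not the actual str2size code.

```python
import re

# Assumed shorthand: digits followed by an optional unit letter.
SIZE_RE = re.compile(r'^\s*(\d+)\s*([BKMT]?)\s*$', re.IGNORECASE)


def parse_size(text):
    """Parse a shorthand such as '200K' into a (value, unit) tuple."""
    match = SIZE_RE.match(text or '')
    if match is None:
        # Without this check, match.group() raises the AttributeError
        # reported in point 2.
        raise ValueError('invalid size value: %r' % (text,))
    return int(match.group(1)), match.group(2).upper() or 'B'


print(parse_size('200K'))
try:
    parse_size('')  # an empty maxarchivesize field
except ValueError as error:
    print(error)
```

An explicit ValueError that names the offending value would make a misused template easier to diagnose than an AttributeError on None.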
Comment 7 Fabian 2014-10-15 11:51:08 UTC
Okay even parsing a link does work as expected:

>>> s = pywikibot.Site("cs", "wikipedia")
>>> pywikibot.Link(":outreach:site", s)
pywikibot.page.Link('Site', Site("outreach", "outreach"))
>>> pywikibot.Link(":c:site", s)
pywikibot.page.Link('Site', Site("commons", "commons"))
>>> pywikibot.Link(":d:site", s)
pywikibot.page.Link('Site', Site("wikidata", "wikidata"))
Comment 8 Mpaa 2014-10-18 21:49:40 UTC
Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/

Please report if errors are still present after this is approved.
Comment 9 JAn Dudík 2014-10-19 21:24:44 UTC
(In reply to Mpaa from comment #8)
> Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/
> 
> Please report if errors are still present after this is approved.


The prefixes c:, outreach: and d: are still problematic.
Here are the errors from the log:


Processing [[cs:Diskuse s wikipedistou:JeremySil]]
19 Threads found on [[cs:Diskuse s wikipedistou:JeremySil]]
Looking for: {{archivace}} in [[cs:Diskuse s wikipedistou:JeremySil]]
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JeremySil
]]
ERROR: ValueError: invalid literal for int() with base 10: ''
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
    whys = self.analyze_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 454, in analyze_page
    arch_counter = int(self.get_attr('counter', '1'))
ValueError: invalid literal for int() with base 10: ''


Processing [[cs:Wikipedie:Nástěnka správců]]
55 Threads found on [[cs:Wikipedie:Nástěnka správců]]
Looking for: {{archivace}} in [[cs:Wikipedie:Nástěnka správců]]
Processing 55 threads
127 Threads found on [[cs:Wikipedie:Nástěnka správců/Archiv58]]
Archiving 29 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Nástěnka správců]]
ERROR: SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 982, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wikipedia:c
s, and the interwiki prefix c is not supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou]]
32 Threads found on [[cs:Wikipedie:Pod lípou]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou]]
Processing 32 threads
63 Threads found on [[cs:Wikipedie:Pod lípou/Archiv 2014/02]]
13 Threads found on [[cs:Wikipedie:Pod lípou/Archiv 2014/03]]
Archiving 22 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou]]
ERROR: SiteDefinitionError: :c:Category:2014 events in Brno is not a local page
on wikipedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 982, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :c:Category:2014 events in Brno is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!


Processing [[cs:Wikipedie:Pod lípou (návrhy)]]
15 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (návrhy)]]
Processing 15 threads
14 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)/Archiv 2014-01]]
Archiving 5 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (návrhy)]]
ERROR: SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edit
ion/text is not a local page on wikipedia:cs, and the interwiki prefix outreach
is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 982, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edition/tex
t is not a local page on wikipedia:cs, and the interwiki prefix outreach is not
supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou (technika)]]
61 Threads found on [[cs:Wikipedie:Pod lípou (technika)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (technika)]]
Processing 61 threads
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (technika)]]

ERROR: IsRedirectPage: Page [[cs:Wikipedie:Pod lípou (technika)/Archiv 2014-01]]
 is a redirect page.
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
    whys = self.analyze_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 481, in analyze_page
    if self.feed_archive(archive, t, max_arch_size, params):
  File "I:\py\rewrite\scripts\archivebot.py", line 449, in feed_archive
    self.archives[title] = DiscussionPage(archive, self, params)
  File "I:\py\rewrite\scripts\archivebot.py", line 293, in __init__
    self.load_page()
  File "I:\py\rewrite\scripts\archivebot.py", line 308, in load_page
    lines = self.get().split('\n')
  File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 332, in get
    self._getInternals(sysop)
  File "I:\py\rewrite\pywikibot\page.py", line 364, in _getInternals
    raise self._getexception
IsRedirectPage: Page [[cs:Wikipedie:Pod lĂ­pou (technika)/Archiv 2014-01]] is a
redirect page.

Processing [[cs:Wikipedie:Potřebuji pomoc]]
44 Threads found on [[cs:Wikipedie:Potřebuji pomoc]]
Looking for: {{archivace}} in [[cs:Wikipedie:Potřebuji pomoc]]
Processing 44 threads
7 Threads found on [[cs:Wikipedie:Potřebuji pomoc/Archiv14]]
Archiving 20 thread(s).


ERROR: Error occured while processing page [[cs:Wikipedie:Potřebuji pomoc]]
ERROR: SiteDefinitionError: d:Q252189 is not a local page on wikipedia:cs, and t
he interwiki prefix d is not supported by PyWikiBot!
Traceback (most recent call last):
  File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
    archiver.run()
  File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
    self.archives[a].update(comment)
  File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
    self.save(summary)
  File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "I:\py\rewrite\pywikibot\page.py", line 982, in save
    **kwargs)
  File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
    comment = self._cosmetic_changes_hook(comment) or comment
  File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
    self.text = ccToolkit.change(old)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
    new_text = self._change(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
    text = self.safe_execute(method, text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
    result = method(text)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
    'startspace'])
  File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
    replacement = new(match)
  File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
    namespace = page.namespace()
  File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
    return self._link.namespace
  File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: d:Q252189 is not a local page on wikipedia:cs, and the inte
rwiki prefix d is not supported by PyWikiBot!
Comment 10 Fabian 2014-10-19 21:52:07 UTC
Can you execute "python pwb.py shell", then run "import pywikibot" and "s = pywikibot.Site('cs', 'wikipedia')" in the Python console? Then try one of these three:

* pywikibot.Link(":outreach:site", s)
* pywikibot.Link(":c:site", s)
* pywikibot.Link(":d:site", s)

I personally have no problem executing this:

$ python pwb.py shell
Welcome to the Pywikibot interactive shell!
>>> import pywikibot
>>> s = pywikibot.Site("cs", "wikipedia")
>>> pywikibot.Link(":outreach:site", s)
pywikibot.page.Link('Site', Site("outreach", "outreach"))

You could also call "s.interwiki('outreach')" to see what it is returning.
Comment 11 Mpaa 2014-10-19 22:07:30 UTC
Processing [[cs:Diskuse s wikipedistou:JeremySil]]
...
    arch_counter = int(self.get_attr('counter', '1'))
ValueError: invalid literal for int() with base 10: ''

The counter parameter is malformed: it is empty, while an int is expected.
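
The default of '1' does not help here, presumably because a default is only used when the parameter is absent, whereas a template containing `|counter=` supplies an empty string. A minimal sketch of a blank-tolerant lookup, assuming the template parameters are held in a plain dict; `int_attr` is a hypothetical helper, not part of archivebot.

```python
def int_attr(attributes, name, default):
    """Return attributes[name] as an int.

    Falls back to default when the parameter is missing or blank.
    """
    value = (attributes.get(name) or '').strip()
    return int(value) if value else default


# Parameters as they might be read from a misused template.
params = {'counter': '', 'maxarchivesize': '200K'}
print(int_attr(params, 'counter', 1))
print(int_attr({'counter': ' 58 '}, 'counter', 1))
```

Values that are neither blank nor numeric still raise ValueError in this sketch, which seems appropriate for a genuinely malformed template.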
Comment 12 JAn Dudík 2014-10-20 14:16:57 UTC
(In reply to Fabian from comment #10)
This might be the problem:

I:\py\rewrite>pwb.py shell
Welcome to the Pywikibot interactive shell!
>>> import pywikibot
>>> s = pywikibot.Site('cs', 'wikipedia')
>>> pywikibot.Link(":outreach:site", s)

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "I:\py\rewrite\pywikibot\page.py", line 3975, in __repr__
    return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
  File "I:\py\rewrite\pywikibot\page.py", line 4151, in title
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :outreach:site is not a local page on wikipedia:cs, and the
 interwiki prefix outreach is not supported by PyWikiBot!
>>>

I:\py\rewrite>pwb.py version
Pywikibot: pywikibot-core (74beb9e, s5354, 2014/10/20, 13:36:31, OUTDATED)
Release version: 2.0b2
Python: 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)]
unicode test: ok
httplib2 version: 0.8
Comment 13 Fabian 2014-10-20 14:36:39 UTC
Okay maybe the families are not loaded properly? Have you changed the "installed" families? With the following code line (in the shell like before) it'll list all families:

print("\n".join("{}: {}".format(*i) for i in sorted(pywikibot.config2.family_files.items())))

The default installation should contain at least outreach, wikidata and commons.
Comment 14 JAn Dudík 2014-10-20 17:49:31 UTC
(In reply to Fabian from comment #13)
> Okay maybe the families are not loaded properly? Have you changed the
> "installed" families? With the following code line (in the shell like
> before) it'll list all families:
> 
> print("\n".join("{}: {}".format(*i) for i in
> sorted(pywikibot.config2.family_files.items())))
> 
> The default installation should contain at least outreach, wikidata and
> commons.

I have all the files that are in the nightly build; I didn't change anything.

I:\py\rewrite>pwb.py shell
Welcome to the Pywikibot interactive shell!
>>> import pywikibot
>>> s = pywikibot.Site('cs', 'wikipedia')
>>> pywikibot.Link(":outreach:site", s)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "I:\py\rewrite\pywikibot\page.py", line 3975, in __repr__
    return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
  File "I:\py\rewrite\pywikibot\page.py", line 4151, in title
    self.parse()
  File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
    self._text, self._site, prefix))
SiteDefinitionError: :outreach:site is not a local page on wikipedia:cs, and the
 interwiki prefix outreach is not supported by PyWikiBot!
>>> print("\n".join("{}: {}".format(*i) for i in sorted(pywikibot.config2.family_files.items())))
anarchopedia: I:\py\rewrite\pywikibot\families\anarchopedia_family.py
battlestarwiki: I:\py\rewrite\pywikibot\families\battlestarwiki_family.py
commons: I:\py\rewrite\pywikibot\families\commons_family.py
fon: I:\py\rewrite\pywikibot\families\fon_family.py
gentoo: I:\py\rewrite\pywikibot\families\gentoo_family.py
i18n: I:\py\rewrite\pywikibot\families\i18n_family.py
incubator: I:\py\rewrite\pywikibot\families\incubator_family.py
lockwiki: I:\py\rewrite\pywikibot\families\lockwiki_family.py
lyricwiki: I:\py\rewrite\pywikibot\families\lyricwiki_family.py
mediawiki: I:\py\rewrite\pywikibot\families\mediawiki_family.py
meta: I:\py\rewrite\pywikibot\families\meta_family.py
oldwikivoyage: I:\py\rewrite\pywikibot\families\oldwikivoyage_family.py
omegawiki: I:\py\rewrite\pywikibot\families\omegawiki_family.py
osm: I:\py\rewrite\pywikibot\families\osm_family.py
outreach: I:\py\rewrite\pywikibot\families\outreach_family.py
southernapproach: I:\py\rewrite\pywikibot\families\southernapproach_family.py
species: I:\py\rewrite\pywikibot\families\species_family.py
strategy: I:\py\rewrite\pywikibot\families\strategy_family.py
test: I:\py\rewrite\pywikibot\families\test_family.py
vikidia: I:\py\rewrite\pywikibot\families\vikidia_family.py
wikia: I:\py\rewrite\pywikibot\families\wikia_family.py
wikibooks: I:\py\rewrite\pywikibot\families\wikibooks_family.py
wikidata: I:\py\rewrite\pywikibot\families\wikidata_family.py
wikimedia: I:\py\rewrite\pywikibot\families\wikimedia_family.py
wikinews: I:\py\rewrite\pywikibot\families\wikinews_family.py
wikipedia: I:\py\rewrite\pywikibot\families\wikipedia_family.py
wikiquote: I:\py\rewrite\pywikibot\families\wikiquote_family.py
wikisource: I:\py\rewrite\pywikibot\families\wikisource_family.py
wikitech: I:\py\rewrite\pywikibot\families\wikitech_family.py
wikiversity: I:\py\rewrite\pywikibot\families\wikiversity_family.py
wikivoyage: I:\py\rewrite\pywikibot\families\wikivoyage_family.py
wiktionary: I:\py\rewrite\pywikibot\families\wiktionary_family.py
wowwiki: I:\py\rewrite\pywikibot\families\wowwiki_family.py
>>>

The file
i:\py\rewrite\pywikibot\families\outreach_family.py
is the only one in the whole core package that contains 'outreach'.

But I don't understand why archivebot needs to know about interwiki links; it should only take part of the text and move it to another page.
Comment 15 Mpaa 2014-10-20 20:57:21 UTC
Works for me.

Pywikibot: pywikibot-core.git (bc34a4a, g4328, 2014/10/20, 18:56:56, OUTDATED)
Release version: 2.0b2
Python: 2.7.6 (default, Mar 22 2014, 22:59:38)
unicode test: ok
httplib2 version: 0.9

The only difference I can spot is the httplib2 version. I have no idea whether that could explain it.
It might be worthwhile to update it and retry.
Comment 16 JAn Dudík 2014-11-04 09:46:50 UTC
I found that the problem is somewhere in cosmetic_changes.py.
After disabling it, the bot archives correctly.
Comment 17 Mpaa 2014-11-04 22:23:42 UTC
(In reply to Mpaa from comment #8)
> Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/
> 
> Please report if errors are still present after this is approved.

BTW, patch https://gerrit.wikimedia.org/r/#/c/167406/ is Merged.
Comment 18 Fabian 2014-11-04 22:35:48 UTC
Hmmm, about the interwiki problem: it doesn't really matter why it also parses interwiki links (*); if it didn't, we would just be hiding a potential bug.

Could you execute the following line in shell:

pywikibot.Site('cs', 'wikipedia').interwiki('outreach')

This returns for me 'Site("outreach", "outreach")'.

*: There is no reliable way to know whether a link is an interwiki link without looking at the prefix. The script just goes the extra mile and also assigns a Site object to it. That could be skipped, but then the script itself would have to do so by catching the exception.
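
A minimal, self-contained sketch of what "catching the exception" could look like in a script. The exception class is a stand-in named after the one in the tracebacks above; `KNOWN_PREFIXES`, `resolve_prefix` and `clean_link` are hypothetical and do not reflect the actual cosmetic_changes.py code.

```python
class SiteDefinitionError(Exception):
    """Stand-in for the pywikibot exception of the same name."""


# Illustrative only; real prefixes come from the wiki's interwiki map.
KNOWN_PREFIXES = {'c': 'commons', 'd': 'wikidata'}


def resolve_prefix(prefix):
    """Hypothetical lookup that fails for unknown prefixes."""
    try:
        return KNOWN_PREFIXES[prefix]
    except KeyError:
        raise SiteDefinitionError('prefix %s is not supported' % prefix)


def clean_link(link_text):
    """Return a tidied link, or the original text if it cannot be resolved."""
    prefix, _, title = link_text.lstrip(':').partition(':')
    try:
        site = resolve_prefix(prefix)
    except SiteDefinitionError:
        return link_text  # leave unknown interwiki links untouched
    return '%s:%s' % (site, title.strip())


print(clean_link(':c:User:Example'))
print(clean_link(':outreach:Some page'))
```

With this pattern a single unresolvable link no longer aborts the whole save, although, as noted above, it would also hide the underlying lookup bug.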
Comment 19 John Mark Vandenberg 2014-11-09 19:38:24 UTC
It looks like this is fixed.
