Last modified: 2013-10-05 05:01:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T57307, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55307 - Section headers with templates are not correctly recognised
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: interwiki.py
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 05:01 UTC by Kunal Mehta (Legoktm)
Modified: 2013-10-05 05:01 UTC (History)
Description Kunal Mehta (Legoktm) 2013-10-05 05:01:39 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/914/
Reported by: Anonymous user
Created on: 2009-04-20 16:13:14
Subject: Section headers with templates are not correctly recognised
Original description:
Sometimes the bot removes a valid interwiki link that leads to an anchor in another article.
See http://cs.wikipedia.org/w/index.php?title=Platnost_(pr%C3%A1vo)&action=history
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 05:01:41 UTC
valhallasw@dorthonion:~/src/pywikipedia/trunk$ python interwiki.py cs:Platnost_%28právo%29
Getting 1 page from wikipedia:cs...
[[cs:Platnost (právo)]]: [[cs:Platnost (právo)]] gives new interwiki [[de:Gültigkeit#Gültigkeit im Recht]]
Getting 1 page from wikipedia:de...
NOTE: [[de:Gültigkeit#Gültigkeit im Recht]] does not exist. Skipping.
======Post-processing [[cs:Platnost (právo)]]======
Updating links on page [[cs:Platnost (právo)]].
Changes to be made: Robot: Removing [[de:Gültigkeit#Gültigkeit im Recht]]
- [[de:Gültigkeit#Gültigkeit im Recht]]

ERROR: Found incorrect link to de in [[cs:Platnost (právo)]]
Submit? ([y]es, [n]o, open in [b]rowser, [g]ive up, [a]lways)


The link tries to refer to the section header
=== {{Anker|Rechtsgültig}}Gültigkeit im Recht ===

Using #G.C3.BCltigkeit_im_Recht instead does not help.

Removing {{Anker|...}} from the header does work, provided the #Gültigkeit im Recht form of the link is used.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 05:01:43 UTC
- **labels**:  --> interwiki
- **milestone**:  --> confirmed
- **priority**: 5 --> 6
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 05:01:44 UTC
- **summary**: Removing of interwiki to anchor --> Section headers with templates are not correctly recognised
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 05:01:46 UTC
Based on the wikitext alone, it is hard to determine whether the section title is present (it would take an evil regexp). We could combine the wikitext check with a fallback to the API with action=parse, e.g.
http://de.wikipedia.org/w/api.php?action=parse&text={{:G%C3%BCltigkeit}}

The rewrite doesn't raise SectionErrors at all.

Options:
- strip the check
- try to get the regexp working
- keep a simple regexp with an API fallback

Questions:
- how to implement this in the rewrite?
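
To make the third option concrete, here is a minimal sketch of a simple regexp with an API fallback. It is not the pywikibot implementation: the function names and the `api_sections` callable are hypothetical, and the pattern is a deliberately naive one that only matches headers whose wikitext is exactly the section title.

```python
import re
from typing import Callable, Iterable


def header_in_wikitext(wikitext: str, section: str) -> bool:
    """Cheap check: is there a heading line whose text is exactly `section`?"""
    # Hypothetical naive pattern: equal runs of '=' around the literal title.
    pattern = re.compile(
        r"^(={1,6})[ \t]*" + re.escape(section) + r"[ \t]*\1[ \t]*$",
        re.MULTILINE,
    )
    return pattern.search(wikitext) is not None


def section_exists(
    wikitext: str,
    section: str,
    api_sections: Callable[[], Iterable[str]],
) -> bool:
    """Try the regexp first; only call the (assumed) API lookup on a miss."""
    if header_in_wikitext(wikitext, section):
        return True
    # `api_sections` is an assumed callable returning the rendered section
    # titles, e.g. backed by an action=parse request.
    return section in set(api_sections())


# The header from this report defeats the naive pattern because of the template.
page_text = "== Intro ==\n=== {{Anker|Rechtsgültig}}Gültigkeit im Recht ===\n"
found = section_exists(page_text, "Gültigkeit im Recht",
                       lambda: ["Intro", "Gültigkeit im Recht"])
```

With this split, the extra API request is only paid for headers the regexp cannot see, such as the {{Anker|...}} case above.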
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 05:01:48 UTC
Long story short: it's impossible to do without another API query. We can use 

http://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=Help:Editing

to do this. We cannot do it using regexps because of template expansions.
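
As a rough sketch of how the result of such a query could be checked: the code below builds the request URL and looks for a section in an already-decoded JSON response. The helper names are hypothetical, `format=json` is added on the assumption that JSON output is wanted, and the sample response is hand-written to resemble the `sections` list (with its `line` and `anchor` fields), not captured from a live wiki. Legacy dot-encoded anchors such as G.C3.BCltigkeit_im_Recht are not decoded here.

```python
import html
import re
from urllib.parse import urlencode


def sections_query_url(api_url: str, page: str) -> str:
    """Build an action=parse&prop=sections request URL for `page`."""
    params = {"action": "parse", "prop": "sections", "page": page, "format": "json"}
    return api_url + "?" + urlencode(params)


def section_names(response: dict) -> set:
    """Collect rendered section titles and anchors from a decoded response."""
    names = set()
    for sec in response.get("parse", {}).get("sections", []):
        # `line` may contain HTML produced by templates; strip the tags.
        line = html.unescape(re.sub(r"<[^>]+>", "", sec.get("line", ""))).strip()
        if line:
            names.add(line)
        anchor = sec.get("anchor", "")
        if anchor:
            names.add(anchor.replace("_", " "))
    return names


def has_section(response: dict, section: str) -> bool:
    """True if `section` (spaces or underscores) appears in the response."""
    return section.replace("_", " ").strip() in section_names(response)


# Hand-written sample shaped like the API output (an assumption, not real data).
sample = {"parse": {"title": "Gültigkeit", "sections": [
    {"level": "3",
     "line": '<span id="Rechtsgültig"></span>Gültigkeit im Recht',
     "anchor": "Gültigkeit_im_Recht"},
]}}
ok = has_section(sample, "Gültigkeit im Recht")
```

Because the check runs on the rendered section list, a template inside the header no longer matters.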
