Last modified: 2013-10-05 05:01:50 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/914/
Reported by: Anonymous user
Created on: 2009-04-20 16:13:14
Subject: Section headers with templates are not correctly recognised
Original description: Sometimes the bot removes a valid interwiki link that leads to an anchor in another article. See http://cs.wikipedia.org/w/index.php?title=Platnost\_\(pr%C3%A1vo\)&action=history
valhallasw@dorthonion:~/src/pywikipedia/trunk$ python interwiki.py cs:Platnost\_%28právo%29
Getting 1 page from wikipedia:cs...
\[\[cs:Platnost \(právo\)\]\]: \[\[cs:Platnost \(právo\)\]\] gives new interwiki \[\[de:Gültigkeit\#Gültigkeit im Recht\]\]
Getting 1 page from wikipedia:de...
NOTE: \[\[de:Gültigkeit\#Gültigkeit im Recht\]\] does not exist. Skipping.
======Post-processing \[\[cs:Platnost \(právo\)\]\]======
Updating links on page \[\[cs:Platnost \(právo\)\]\].
Changes to be made: Robot: Removing \[\[de:Gültigkeit\#Gültigkeit im Recht\]\]
\- \[\[de:Gültigkeit\#Gültigkeit im Recht\]\]
ERROR: Found incorrect link to de in \[\[cs:Platnost \(právo\)\]\]
Submit? \(\[y\]es, \[n\]o, open in \[b\]rowser, \[g\]ive up, \[a\]lways\)

The link tries to refer to the heading === \{\{Anker|Rechtsg\xfcltig\}\}G\xfcltigkeit im Recht ===. Using \#G.C3.BCltigkeit\_im\_Recht instead does not help. Removing \{\{Anker|...\}\} does work, if the \#Gültigkeit im Recht version is used...
- **labels**: --> interwiki
- **milestone**: --> confirmed
- **priority**: 5 --> 6
- **summary**: Removing of interwiki to anchor --> Section headers with templates are not correctly recognised
Based on the wikitext alone, it's hard to determine whether the section title is there \(evil regexp\). We could combine the check with a fallback to the API with action=parse, i.e. http://de.wikipedia.org/w/api.php?action=parse&text=\{\{:G%C3%BCltigkeit\}\}

The rewrite doesn't raise SectionErrors at all.

Options:
- stripping the check
- try to get the regexp working
- keep a simple regexp with an API fallback

Questions:
- how to implement this in the rewrite?
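To illustrate why a plain regexp check misses the heading from the report, here is a minimal sketch. The function `has_section_regex` and its pattern are hypothetical and are not the check that interwiki.py actually uses; they only show the general failure mode when a template sits inside the heading.

```python
import re


def has_section_regex(wikitext, section):
    """Return True if a heading whose text is exactly `section` occurs.

    Hypothetical helper for illustration, not the pywikipedia code.
    """
    pattern = re.compile(
        r"^(={1,6})[ \t]*" + re.escape(section) + r"[ \t]*\1[ \t]*$",
        re.MULTILINE,
    )
    return pattern.search(wikitext) is not None


# A plain heading is found.
plain = "=== Gültigkeit im Recht ===\nText"
# The heading from the report carries a template in front of the title,
# so the literal match on the wikitext fails.
templated = "=== {{Anker|Rechtsgültig}}Gültigkeit im Recht ===\nText"

found_plain = has_section_regex(plain, "Gültigkeit im Recht")
found_templated = has_section_regex(templated, "Gültigkeit im Recht")
```

Loosening the pattern to skip over `{{...}}` would handle this one case, but it still could not account for headings that a template produces when expanded.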
Long story short: it's impossible to do without another API query. We can use http://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=Help:Editing to do this. We cannot do it using regexps because of template expansions.
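A rough sketch of the extra API query, using only the Python standard library. The helper names (`sections_url`, `has_section`, `fetch_sections`) and the User-Agent string are assumptions made for illustration, not existing pywikipedia functions. The sketch assumes the JSON response lists each section with a `line` field (heading text, possibly containing HTML) and an `anchor` field; the exact anchor encoding depends on the wiki's configuration, so the comparison below is deliberately simple.

```python
import json
import urllib.parse
import urllib.request


def sections_url(api_url, page):
    """Build the action=parse&prop=sections query URL for a page."""
    query = urllib.parse.urlencode(
        {"action": "parse", "prop": "sections", "page": page, "format": "json"}
    )
    return api_url + "?" + query


def has_section(parse_result, section):
    """Check a decoded API response for a section matching `section`.

    Compares against both the anchor and the heading line, treating
    underscores and spaces as equivalent.
    """
    wanted = section.replace("_", " ")
    for entry in parse_result.get("parse", {}).get("sections", []):
        for key in ("anchor", "line"):
            if entry.get(key, "").replace("_", " ") == wanted:
                return True
    return False


def fetch_sections(api_url, page):
    """Perform the HTTP request (needs network access)."""
    request = urllib.request.Request(
        sections_url(api_url, page),
        headers={"User-Agent": "section-check-sketch/0.1"},  # hypothetical UA
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hand-written sample shaped like an API response, for offline use.
sample = {
    "parse": {
        "title": "Gültigkeit",
        "sections": [
            {
                "level": "3",
                "line": '<span id="Rechtsgültig"></span>Gültigkeit im Recht',
                "anchor": "Gültigkeit_im_Recht",
            }
        ],
    }
}
```

With network access, `has_section(fetch_sections("http://de.wikipedia.org/w/api.php", "Gültigkeit"), "Gültigkeit im Recht")` would run the check against the live page; the `sample` dictionary allows the matching logic to be tried offline.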