Last modified: 2013-10-05 05:02:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57311, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 55311 - -weblink case sensitive


Summary:	-weblink case sensitive

Status:	NEW

Product:	Pywikibot
Classification:	Unclassified
Component:	General (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Unprioritized normal
Target Milestone:	---
Assigned To:	Pywikipedia bugs

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-10-05 05:02 UTC by Kunal Mehta (Legoktm)
Modified:	2013-10-05 05:02 UTC (History)
CC List:	0 users

See Also:	https://sourceforge.net/p/pywikipediabot/bugs/861
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Kunal Mehta (Legoktm) 2013-10-05 05:02:40 UTC

Originally from: http://sourceforge.net/p/pywikipediabot/bugs/861/
Reported by: platonides
Created on: 2009-02-18 15:49:13
Subject: -weblink case sensitive
Original description:
If you use -weblink with an uppercase parameter (or a lowercase one if the links are uppercase on the wiki), it treats the link as a page. E.g. for -weblink:*.COM and [u'http://www.example.com', u'Foo'] it outputs:
Page [[Http://www.example.com]] not found
No changes were necessary in [[Foo]]


Running r6366. Python 2.5.1

Comment 1 Kunal Mehta (Legoktm) 2013-10-05 05:02:42 UTC

Okay, understood.

Actually, the case of the -weblink parameter does not change the replacement behavior.
-weblink is, at all times, case insensitive:
Here, if [[Foo]] contains 'Http://www.example.com', MediaWiki will match it as a *.COM address and return it on Special:LinkSearch.

It's not -weblink's case that matters here; it is the case (in)sensitivity of the replacements used by replace.py.

To illustrate what I'm saying:
with http://fr.wikipedia.org/wiki/Utilisateur:NicDumZ/casetest containing "http://Case-Linky.COM":
python replace.py -weblink:"Case-Linky.COM" "case-linky.com" "case-linky.rs" treats the test page but doesn't change anything
python replace.py -weblink:"case-linky.com" "case-linky.com" "case-linky.rs" treats the test page but doesn't change anything either

The page is matched as containing a case-linky.com link, because MediaWiki treats links case-insensitively. But when PYWP tries to match replacements, it's case sensitive by default ;)

Add -nocase for case insensitivity:
python replace.py -weblink:"case-linky.com" -nocase "case-linky.com" "case-linky.rs" DOES make the changes =)
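
To make the difference concrete, here is a minimal sketch using Python's standard re module. It is not replace.py's actual code; the sample text and variable names are assumptions chosen to mirror the test page above. It only shows why a case-sensitive substitution leaves "http://Case-Linky.COM" untouched while a case-insensitive one rewrites it.

```python
import re

# Hypothetical page text mirroring the test page described above.
text = "See http://Case-Linky.COM for details."
old, new = "case-linky.com", "case-linky.rs"

# Default behaviour: the pattern is matched case-sensitively,
# so "Case-Linky.COM" is not found and the text is unchanged.
sensitive = re.sub(re.escape(old), new, text)

# With re.IGNORECASE (roughly what -nocase asks for),
# the differently-cased link is matched and replaced.
insensitive = re.sub(re.escape(old), new, text, flags=re.IGNORECASE)

print(sensitive)
print(insensitive)
```

The page is selected in both cases because the link search itself ignores case; only the substitution step differs.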


I'll close this bug as INVALID; re-open it if I misunderstood the issue =)

(But I think that you wanted the .yu top-level domain fixes to be case-insensitive, right? Well, this was not possible, even with a -nocase parameter added, because -nocase is ignored when -fix: is used. But since http://svn.wikimedia.org/viewvc/pywikipedia?view=rev&revision=6374 the yu-tld fixes are case-insensitive. =) )

Comment 2 Kunal Mehta (Legoktm) 2013-10-05 05:02:44 UTC

- **status**: open --> pending-invalid

Comment 3 Kunal Mehta (Legoktm) 2013-10-05 05:02:45 UTC

Comment: the fact that the -linksearch pagegenerator yields the link itself (see bug description, "Page [[Http://www.example.com]] not found") is a regex bug, and is not related to the current bug. I have a patch ready, will commit asap.

Comment 4 Kunal Mehta (Legoktm) 2013-10-05 05:02:47 UTC

That's not what I meant. It seems that by generalising, I wasn't clear enough.
In your example, run python replace.py -weblink:"Case-Linky.com" "case-linky.com" "case-linky.rs"
Note that now the case of the weblink parameter (Case-Linky.com) doesn't match the case of the real link (http://Case-Linky.COM).
You will get a wrong 'Page [[Http://Case-Linky.COM]] not found' message. It is detecting two pages, http://Case-Linky.COM and Utilisateur:NicDumZ/casetest, but only Utilisateur:NicDumZ/casetest is a page name.
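
As an illustration of the symptom only, the following sketch separates external links from page names in a list of candidates. It is an assumption-laden example written for Python 3, not the pagegenerator's real logic or its eventual fix; the helper name looks_like_external_link is hypothetical.

```python
from urllib.parse import urlsplit

def looks_like_external_link(candidate):
    """Return True if the candidate parses as an http(s) URL with a host.

    Hypothetical helper: urlsplit lowercases the scheme, so
    "Http://..." is recognised regardless of its case.
    """
    parts = urlsplit(candidate)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# The two "pages" reported in the comment above.
candidates = ["Http://Case-Linky.COM", "Utilisateur:NicDumZ/casetest"]

# Keep only entries that are not external links.
page_titles = [c for c in candidates if not looks_like_external_link(c)]
print(page_titles)
```

Checking both the scheme and the host matters here, because a namespaced title such as "Utilisateur:NicDumZ/casetest" also contains a colon.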

Comment 5 Kunal Mehta (Legoktm) 2013-10-05 05:02:49 UTC

- **status**: pending-invalid --> open-invalid

Comment 6 Kunal Mehta (Legoktm) 2013-10-05 05:02:51 UTC

I can find that bug, but from the description it seems this bug is a duplicate.
