Last modified: 2013-12-26 14:42:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57313, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55313 - -hintfile: option
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: interwiki.py
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 05:03 UTC by Kunal Mehta (Legoktm)
Modified: 2013-12-26 14:42 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 05:03:13 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/836/
Reported by: Anonymous user
Created on: 2009-01-23 23:29:52
Subject: -hintfile: option
Assigned to: purodha
Original description:
The newly introduced -hintfile: option is either not well documented or not working as expected.

It asks for a page to be checked (see below) while (according to [2284955] interwiki hints from file) it's supposed to read both a local page and a hint page from file. Please fix it. Thanks!

python interwiki.py -hintfile:
Please enter the hint filename: hints.txt
Which page to check:

Pywikipedia [http] trunk/pywikipedia (r6291, Jan 23 2009, 16:08:14)
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 05:03:15 UTC
Assigned to committer.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 05:03:17 UTC
- **assigned_to**: nobody --> purodha
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 05:03:19 UTC
Assigned to committer.
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 05:03:20 UTC
Assigned to committer.
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 05:03:22 UTC
Assigned to committer.
Comment 6 Kunal Mehta (Legoktm) 2013-10-05 05:03:24 UTC
What you want to have, in the above example, can be had with:

python interwiki.py  -v -hintfile: -file:
Pywikipediabot  (r6439 (wikipedia.py), Feb 24 2009, 21:48:26)
Python 2.5.2 (r252:60911, Jan  4 2009, 21:59:32) 
[GCC 4.3.2]
Please enter the hint filename: hints.txt
Please enter the local file name: local-page-title.txt

There is no documentation saying that -hintfile: overrides or alters
the processing of any other parameter (and in fact, it does not).

Be aware that it is hardly useful to have a file with several page titles given
via -file: when -hintfile: is being used, since hints would apply to each of those
pages, provoking interwiki conflicts.
Thus -hintfile: is likely more often used with a single page title on the command
line. That does not preclude, however, a single page title being read from a file
using -file:

If, and only if, the file given via -hintfile: has only unspecific hints, such as [[10:]]
or [[en:]] or [[latin:]], (or all specific hinted pages do not exist) then supplying a
list of pages via -file: would be likely free of conflicts.
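
The distinction between unspecific and specific hints can be sketched roughly as follows. This is only an illustration, not code from interwiki.py: the names HINT_RE, parse_hint, is_unspecific and safe_for_page_list are hypothetical, and the sketch assumes hints are written as [[prefix:Title]], where an empty title makes the hint unspecific. It does not cover the other case mentioned above (specific hinted pages that do not exist), since that would need a live wiki lookup.

```python
import re

# Hypothetical pattern: [[prefix:Title]] with an optional leading colon.
# The title part may be empty, as in [[en:]] or [[10:]].
HINT_RE = re.compile(r'\[\[:?([^:\]|]+):([^\]]*)\]\]')


def parse_hint(text):
    """Return (prefix, title) for a hint like '[[en:Foo]]', or None if no match."""
    match = HINT_RE.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def is_unspecific(text):
    """True when the hint names only a prefix (e.g. '[[en:]]') and no page title."""
    parsed = parse_hint(text)
    return parsed is not None and parsed[1] == ''


def safe_for_page_list(hints):
    """Heuristic from the paragraph above: combining the hints with a list of
    start pages is only likely to be conflict-free if every hint is unspecific."""
    return all(is_unspecific(hint) for hint in hints)
```

With these helpers, a hint list such as [[10:]] and [[en:]] would pass the check, while adding a single specific hint such as [[de:Bar]] would make it fail.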

There is a difference between hints and the page being processed. In properly
prepared cases, the outcome often does not depend on where the bot starts
processing or on which pages are then added because they were hinted. The paths
the bot follows while collecting links, however, can sometimes differ hugely.
We can have hintless processing, but we cannot have a bot run on hints alone,
without a starting page.

Maybe we should add some of these to the documentation? Is that what you
are asking for?
Comment 7 Kunal Mehta (Legoktm) 2013-10-05 05:03:26 UTC
No, it's not exactly what I asked for. In the original feature request #2284955 [http://sourceforge.net/tracker/index.php?func=detail&aid=2284955&group_id=93107&atid=603141], as far as I can see, the idea was to read both starting pages and hints from the same file, line by line, and to build a list of pages to be processed together with their relevant hints.

# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]

Working on a single page with the -hintfile: option doesn't seem to be that useful.
Comment 8 Kunal Mehta (Legoktm) 2013-10-05 05:03:27 UTC
I guess we need to combine "TextfilePageGenerator" from pagegenerators.py and "hintfile" from interwiki.py, so that both the page title and the hint are read, line by line, from the same hintfilename: the page title from the first pair of brackets [[]], and the hint from the second pair of brackets on the same line of the hint file. Is it possible to implement this, please?
Comment 9 Kunal Mehta (Legoktm) 2013-10-05 05:03:29 UTC
This simple code should work for this purpose:

f = codecs.open(hintfilename, 'r', config.textfile_encoding)
R = re.compile(ur'\[\[:?(.*?)\]\]\s+\[\[:?(.*)\]\]')
for line in R.findall(f.read()):
    pageTitle = line[0]
    hintTitle = line[1]

Just make a proper call to

yield wikipedia.Page(site, pageTitle)

and

hints.append(hintTitle)
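
For anyone wanting to try the idea outside the bot, here is a self-contained sketch in Python 3, deliberately free of any Pywikipedia calls so it runs on its own. The function name read_page_hint_pairs and the sample file content are hypothetical. The regular expression is the one from the snippet above, applied line by line. Wiring the pairs into interwiki.py (creating the page object and appending to the hint list) is left out, since that depends on the bot's internals.

```python
import io
import re

# Same pattern as in the snippet above: the first [[...]] is the page title,
# the second [[...]] on the same line is the hint. A leading colon is dropped.
PAIR_RE = re.compile(r'\[\[:?(.*?)\]\]\s+\[\[:?(.*)\]\]')


def read_page_hint_pairs(lines):
    """Yield (page_title, hint_title) for every line that contains a pair."""
    for line in lines:
        match = PAIR_RE.search(line)
        if match:
            yield match.group(1), match.group(2)


# Hypothetical hint file content, using the one-pair-per-line layout
# proposed in Comment 7.
sample = io.StringIO(
    "# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]\n"
    "a line without links is skipped\n"
    "# [[:xx:another_page]] [[:de:Deutsche_Seite]]\n"
)

pairs = list(read_page_hint_pairs(sample))
for page_title, hint_title in pairs:
    print(page_title, '<-', hint_title)
```

One caveat of the pattern as written: the second group is greedy, so a line with three or more links would put everything up to the last closing brackets into the hint. Making that group non-greedy, or allowing several hints per line, would be a separate design decision.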
Comment 10 Kunal Mehta (Legoktm) 2013-10-05 05:03:31 UTC
anyone out there to take care of this?
