Last modified: 2013-07-25 10:31:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T39455, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 37455 - Implement image tracking in the monuments database


Summary:	Implement image tracking in the monuments database

Status:	NEW

Product:	Wiki Loves Monuments
Classification:	Unclassified
Component:	Database
Version:	unspecified
Hardware:	All All

Importance:	Low enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2012-06-10 19:50 UTC by Maarten Dammers
Modified:	2013-07-25 10:31 UTC
CC List:	3 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Maarten Dammers 2012-06-10 19:50:58 UTC

The monuments database should contain a table with all the images at Commons which have a valid identifier.

The code to extract the identifiers is already available in the unused images bot (https://fisheye.toolserver.org/browse/erfgoed/erfgoedbot/unused_monument_images.py?hb=true).

The bot should get the valid template/tracker categories from the configuration (https://fisheye.toolserver.org/browse/erfgoed/erfgoedbot/monuments_config.py?hb=true).

It should then loop over these and, for each source, get all the images plus their metadata.
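
A minimal sketch of that loop, in Python like the existing bots. The `SOURCES` layout, the `tracker_category` key and the `fetch_members` callable are assumptions made for illustration; the real monuments_config.py structure and the way images are fetched from Commons may differ.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# Hypothetical config shape with placeholder category names (not real ones).
SOURCES: Dict[str, Dict[str, str]] = {
    "xx": {"tracker_category": "Example tracker category A"},
    "yy": {"tracker_category": "Example tracker category B"},
}

# A fetcher returns (img_name, monument_id) pairs for one tracker category.
Fetcher = Callable[[str], Iterable[Tuple[str, str]]]


def collect_images(
    sources: Dict[str, Dict[str, str]], fetch_members: Fetcher
) -> Iterator[Tuple[str, str, str]]:
    """Yield (country, monument_id, img_name) for every image of every source."""
    for country, cfg in sources.items():
        for img_name, monument_id in fetch_members(cfg["tracker_category"]):
            yield country, monument_id, img_name
```

Passing the fetcher in as a parameter keeps the loop testable without database or API access; any function that returns the pairs for a category name can be plugged in.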

Comment 1 AleXXw 2012-06-10 20:40:47 UTC

The database should at least contain:
* Filename
* Monument ID as given in the templates
* Uploader
* Upload date
* Whether {{Wiki Loves Monuments yyyy}} is present and, if so, the year
Optional:
* Coordinates
* Categories
* Image resolution
* File size

Comment 2 Maarten Dammers 2012-10-01 20:08:14 UTC

Probably best to start with a minimal implementation where wikitext parsing is not needed. Existing tools can be converted to make use of this table. Later this information can be extended.

Comment 3 Maarten Dammers 2012-12-01 21:42:58 UTC

Did a first implementation:

CREATE TABLE `image` (
  `country` varchar(10) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
  `id` varchar(25) NOT NULL DEFAULT '0',
  `img_name` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
  PRIMARY KEY (`country`,`id`,`img_name`),
  KEY `country_id` (`country`,`id`),
  KEY `img_name` (`img_name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

mysql> SELECT COUNT(*) FROM image;
+----------+
| COUNT(*) |
+----------+
|   878926 |
+----------+
1 row in set (0.00 sec)
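
To try the schema without a MySQL server, the table can be mirrored in SQLite from Python. This is a rough sketch, not the tool's actual code: SQLite types replace the MySQL column definitions, and the index name `idx_img_name`, the helper functions and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE image (
        country  TEXT NOT NULL DEFAULT '',
        id       TEXT NOT NULL DEFAULT '0',
        img_name TEXT NOT NULL DEFAULT '',
        PRIMARY KEY (country, id, img_name)
    )
    """
)
# Secondary index for lookups by file name, like KEY `img_name` above.
conn.execute("CREATE INDEX idx_img_name ON image (img_name)")

# Made-up sample data: one file tagged with two monument IDs, plus a duplicate.
rows = [
    ("xx", "1001", "Example bridge.jpg"),
    ("xx", "1002", "Example bridge.jpg"),
    ("xx", "1001", "Example bridge.jpg"),  # ignored: same primary key
]
conn.executemany("INSERT OR IGNORE INTO image VALUES (?, ?, ?)", rows)


def ids_for_image(img_name):
    """Return all monument IDs recorded for one file name."""
    cur = conn.execute(
        "SELECT id FROM image WHERE img_name = ? ORDER BY id", (img_name,)
    )
    return [row[0] for row in cur]


def images_for_monument(country, monument_id):
    """Return all file names recorded for one monument."""
    cur = conn.execute(
        "SELECT img_name FROM image WHERE country = ? AND id = ? ORDER BY img_name",
        (country, monument_id),
    )
    return [row[0] for row in cur]
```

Because the primary key spans all three columns, the schema itself allows one row per (monument, image) pair, so a file that shows several monuments can appear several times.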

I'm playing around with the API at http://toolserver.org/~multichill/monapi/api.php?action=images&country=pl&id=MA/A-1028&format=json&width=2000&limit=999999
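
Monument IDs such as MA/A-1028 contain characters that are safer to escape when used in a query string. A small sketch of building such a request URL in Python follows; the parameter names are taken from the URL above, while the helper name `images_url` and its defaults are assumptions, and the Toolserver endpoint may no longer be reachable.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the comment above.
BASE_URL = "http://toolserver.org/~multichill/monapi/api.php"


def images_url(country, monument_id, fmt="json", width=2000, limit=999999):
    """Build an action=images URL, percent-encoding the parameter values."""
    params = {
        "action": "images",
        "country": country,
        "id": monument_id,
        "format": fmt,
        "width": width,
        "limit": limit,
    }
    return BASE_URL + "?" + urlencode(params)


url = images_url("pl", "MA/A-1028")
```

`urlencode` escapes the slash in the ID as `%2F`, so the ID travels as a single parameter value.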

Comment 4 AleXXw 2012-12-02 11:29:53 UTC

Great start, thanks Maarten!

Besides the additional requested fields, I noticed some errors; most should be no problem for you ;)

* IDs are padded with zeros (28985 appears as 00028985)
* IDs are converted to uppercase (ArD-9-006 appears as ARD-9-006)
* Some IDs are the uppercased image name (e.g. 'WIGANDG 29.JPG')
* Pictures with more than one ID template appear only once in the table, but should appear once per template (e.g. http://commons.wikimedia.org/wiki/File:Murtalbahnbr%C3%BCcke_1.JPG)

Comment 5 Maarten Dammers 2012-12-02 20:18:50 UTC

I think I've already fixed this zero problem; I just haven't updated the database yet. The same goes for the lowercase/uppercase issue. I'm struggling a bit with how to do the padding on Commons. Currently IDs are padded with '0', but this causes problems for the USA: the NRHP uses the first two characters of an ID for the year, so all the nominations from 2000 (00xxxx) get their two leading zeros chopped off.
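
The padding problem can be reproduced in a few lines of Python. This is only a sketch: the function names, the padding width of 8 and the sample NRHP-style ID are assumptions, and the `min_width` workaround is one possible approach, not necessarily the fix that was applied.

```python
def pad_id(raw_id, width=8):
    """Left-pad an ID with zeros (assumed padding width)."""
    return raw_id.zfill(width)


def strip_padding(padded_id, min_width=0):
    """Strip leading zeros, but never shrink the ID below min_width characters."""
    return padded_id.lstrip("0").rjust(min_width, "0")


# An ordinary numeric ID survives the round trip.
plain = strip_padding(pad_id("28985"))

# A hypothetical ID whose own leading zeros carry meaning loses them.
broken = strip_padding("00001234")

# With a per-source minimum width, the meaningful zeros are restored.
kept = strip_padding("0000001234", min_width=8)
```

A naive strip cannot distinguish padding from zeros that belong to the ID, which is why some per-source knowledge (here a minimum width) is needed.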

I use the categorylinks table to get the IDs. Multiple templates adding the same category give just one entry, so that's what I'm using.
