Last modified: 2014-02-21 00:06:40 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T59351, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 57351 - To add some basic djvulibre calls to API


Summary:	To add some basic djvulibre calls to API

Status:	NEW

Product:	MediaWiki
Classification:	Unclassified
Component:	API
Version:	unspecified
Hardware:	All All

Importance:	Low enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

Blocks:	Wikisource

Reported:	2013-11-21 12:56 UTC by Alessandro Brollo
Modified:	2014-02-21 00:06 UTC
CC List:	5 users

Description Alessandro Brollo 2013-11-21 12:56:45 UTC

DjVu files, the main image+text multipage format used by Wikisource projects, have a very interesting text layer, but MediaWiki cannot access it (apart from a rough extraction of the plain text). Some new API actions, added to both the Commons and the Wikisource project APIs, would make it possible to retrieve the most interesting data from the whole file or from selected pages. Reading functions are safe, whereas writing functions can be destructive, even though they could be very useful to advanced users; so I think the first step should be to implement read-only functions only.

djvutxt (to read the structured text in a Lisp-like syntax) and djvutoxml (to extract the structured text as XML) would IMHO be the first two routines to implement.
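
To make the request more concrete, here is a minimal sketch in Python of the two read-only calls an API module might wrap. It is only an illustration, not MediaWiki code: the function names (djvutxt_argv, djvutoxml_argv, read_text_layer) are hypothetical, and it assumes the DjVuLibre command-line tools are installed and on the PATH. It relies on the --page and --detail options of djvutxt and on the --with-text and --without-anno options of djvutoxml; both tools write to standard output when no output file is given.

```python
import subprocess
from typing import List, Optional

# Detail levels accepted by djvutxt's --detail option.
DETAIL_LEVELS = {"page", "column", "region", "para", "line", "word", "char"}


def djvutxt_argv(path: str, page: Optional[int] = None,
                 detail: Optional[str] = None) -> List[str]:
    """Build a read-only djvutxt command line (hypothetical helper).

    Without `detail`, djvutxt prints plain text; with it, the output is
    the structured text layer as Lisp-like S-expressions.
    """
    argv = ["djvutxt"]
    if page is not None:
        if page < 1:
            raise ValueError("page numbers start at 1")
        argv.append(f"--page={page}")
    if detail is not None:
        if detail not in DETAIL_LEVELS:
            raise ValueError(f"unknown detail level: {detail!r}")
        argv.append(f"--detail={detail}")
    argv.append(path)
    return argv


def djvutoxml_argv(path: str) -> List[str]:
    """Build a read-only djvutoxml command line for the hidden text only."""
    return ["djvutoxml", "--with-text", "--without-anno", path]


def read_text_layer(argv: List[str]) -> str:
    """Run one of the commands above and return its standard output.

    The argument list is passed without a shell, so the file name is
    never interpreted as shell syntax.
    """
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, read_text_layer(djvutxt_argv("book.djvu", page=3, detail="word")) would return the word-level text layer of page 3, provided the (hypothetical) file book.djvu exists. Neither call modifies the file.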

Comment 1 Brion Vibber 2013-11-21 16:29:37 UTC

If possible, it would be best to have a common interface for multiple file types; PDF can also embed text, for instance.

What sort of data format are you envisioning, and what sort of uses?
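
One possible shape for such a common interface, sketched in Python under stated assumptions: a table keyed by file extension that maps each supported type to a read-only extraction command. All names here (EXTRACTORS, text_layer_argv) are hypothetical, and the choice of tools is an assumption rather than anything decided in this report: djvutxt from DjVuLibre for DjVu, and pdftotext from Poppler with its -bbox option for PDF. Note that the two tools emit different formats (S-expressions versus XHTML), so a real interface would still need to normalise the output.

```python
import os
from typing import Callable, Dict, List


def _djvu_argv(path: str) -> List[str]:
    # Word-level structured text from the DjVu hidden text layer.
    return ["djvutxt", "--detail=word", path]


def _pdf_argv(path: str) -> List[str]:
    # Word bounding boxes as XHTML; "-" sends the output to stdout.
    return ["pdftotext", "-bbox", path, "-"]


# Hypothetical registry: file extension -> command-line builder.
EXTRACTORS: Dict[str, Callable[[str], List[str]]] = {
    ".djvu": _djvu_argv,
    ".pdf": _pdf_argv,
}


def text_layer_argv(path: str) -> List[str]:
    """Pick the extraction command for a file based on its extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        builder = EXTRACTORS[ext]
    except KeyError:
        raise ValueError(f"no text-layer extractor for {ext!r}") from None
    return builder(path)
```

Supporting a further file type would then only mean registering one more builder, while callers keep using text_layer_argv unchanged.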
