Last modified: 2014-10-13 14:03:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T73989, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 71989 - Installation of pdf2djvu


Summary:	Installation of pdf2djvu

Status:	NEW

Product:	Tool Labs tools
Classification:	Unclassified
Component:	Other (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Unprioritized normal
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2014-10-13 10:00 UTC by billinghurst
Modified:	2014-10-13 14:03 UTC (History)
CC List:	0 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description billinghurst 2014-10-13 10:00:00 UTC

Would someone be so kind as to install pdf2djvu on Labs? Thanks.

Comment 1 Andre Klapper 2014-10-13 12:17:03 UTC

Providing usecases is welcome. :)

Comment 2 billinghurst 2014-10-13 12:50:24 UTC

I occasionally need to convert public domain files that I find in PDF format. I need to trim components, and the PDFtk installation on Labs suits that purpose. While the files can be uploaded to Commons as PDFs for Wikisource, PDFs are generally inferior at retaining line-by-line text, so they are less useful for the Wikisources. Having pdf2djvu would enable me to grab, trim, and convert files from Labs, then push them to Commons.

An example is https://commons.wikimedia.org/wiki/File:Electoral_Disabilities_of_Women.pdf which I have uploaded, though because of the poor PDF rendering I need to OCR it separately (PITA).

At the moment, I am pulling in one or two files a week.
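
The trim-then-convert part of the workflow described in this comment could be sketched roughly as below. This is only an illustration, assuming that both `pdftk` and `pdf2djvu` are on the PATH; the helper names (`trim_command`, `convert_command`, `trim_and_convert`) and the `_trimmed.pdf` naming are hypothetical and do not come from this report. Uploading to Commons is not covered.

```python
import subprocess
from pathlib import Path


def trim_command(src, dst, page_range):
    """Build a PDFtk command that keeps only the given page range, e.g. "3-40"."""
    return ["pdftk", str(src), "cat", page_range, "output", str(dst)]


def convert_command(src, dst):
    """Build a pdf2djvu command that writes one bundled DjVu file to dst."""
    return ["pdf2djvu", "-o", str(dst), str(src)]


def trim_and_convert(pdf, page_range, run=subprocess.run):
    """Trim a PDF with PDFtk, then convert the trimmed copy to DjVu.

    `run` is injectable so the command sequence can be checked without
    the external tools being installed (an assumption made for testing).
    """
    pdf = Path(pdf)
    trimmed = pdf.with_name(pdf.stem + "_trimmed.pdf")  # hypothetical naming
    djvu = pdf.with_suffix(".djvu")
    run(trim_command(pdf, trimmed, page_range), check=True)
    run(convert_command(trimmed, djvu), check=True)
    return djvu
```

A call such as `trim_and_convert("Electoral_Disabilities_of_Women.pdf", "3-40")` would then leave a `.djvu` file next to the source PDF; the page range here is made up for the example.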

Comment 3 billinghurst 2014-10-13 13:01:00 UTC

and https://en.wikisource.org/wiki/Help:DjVu_files#Method_3_-_pdf2djvu

Comment 4 Philippe Elie 2014-10-13 14:03:39 UTC

billinghurst, I installed djvudigital in /data/project/phetools to convert PDF to DjVu. Conversion fails in some rare cases of half-broken PDFs, but it is stable enough to use. The script for using it is https://github.com/phil-el/phetools/blob/master/ocr/pdf_to_djvu.py and I can help you set it up on IRC.
