Last modified: 2014-10-16 12:08:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T49405, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 47405 - Get/unpack only a part of a content


Summary:	Get/unpack only a part of a content

Status:	NEW

Product:	openZIM
Classification:	Unclassified
Component:	zimlib (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Lowest enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-04-19 12:31 UTC by Kelson [Emmanuel Engelhart]
Modified:	2014-10-16 12:08 UTC (History)
CC List:	0 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Kelson [Emmanuel Engelhart] 2013-04-19 12:31:37 UTC

The zimlib needs to fully unpack a content item before delivering it to
third-party software.

This has several disadvantages, especially if the content is big, a video for
example:
* It needs a lot of memory
* It takes time
* There is no random access (necessary to seek in an HTML5 video)

It would be a really good usability improvement to have a method which delivers
only a part of any content.

------- Comment #1 From Tommi Mäkitalo 2010-09-26 22:01:26 -------

The data is always fully uncompressed. There is no way to prevent that. LZMA2
uses 1MB chunks internally and it will always uncompress the whole chunk. If I
try to read only some bytes, LZMA2 will still uncompress the 1MB of data and
just return part of it. So it saves neither space nor time if we read only a
few bytes.

The zimlib is designed to prevent unnecessary copies. So if you request an
article, the data is uncompressed and the article points directly to the
uncompressed data. This is one of the reasons why the data is not necessarily
zero-terminated. In most cases, the byte after the last byte of an article is
the first byte of the next article.
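
To illustrate the point about avoiding copies, here is a minimal Python sketch
(not zimlib code). The buffer contents, the offsets and the helper name
`article_view` are all hypothetical; the sketch only shows why an article that
is handed out as "offset plus size" into a shared buffer has no terminator of
its own.

```python
# Hypothetical uncompressed buffer: two articles stored back to back,
# with no terminator byte between them.
cluster = b"first articlesecond article"


def article_view(buf, offset, size):
    """Return a zero-copy view of one article inside the shared buffer."""
    return memoryview(buf)[offset:offset + size]


first = article_view(cluster, 0, 13)
second = article_view(cluster, 13, 14)

# The views share memory with the buffer; no bytes were copied.
# The byte right after the first article is the first byte of the second.
next_byte = cluster[len(first)]
```

A caller therefore has to carry the size along with the view instead of
scanning for a zero byte.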

The only situation where it may really save some time is for very large
articles. If an article's data is much larger than 2MB, it may span multiple
LZMA2 chunks, and if you really need only the first MB, you don't need to
uncompress all of the chunks.
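
The arithmetic behind this can be sketched as follows. This is only an
illustration in Python, assuming the 1MB chunk size quoted above; the function
name `chunks_needed` is hypothetical, and the sketch only computes which chunks
cover a byte range. Whether a decoder can skip earlier chunks to reach a later
one depends on the compressed stream and is not addressed here.

```python
CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size quoted in this comment


def chunks_needed(offset, length, chunk_size=CHUNK_SIZE):
    """Return the chunk indices covering bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


# Reading a few bytes still touches one whole chunk.
few_bytes = list(chunks_needed(0, 10))

# Reading the first MB of a much larger article touches only the first chunk.
first_mb = list(chunks_needed(0, CHUNK_SIZE))
```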

ZIM is optimized for many small articles, and I feel that this matches our
target. I don't think we really need to optimize for anything else.

------- Comment #2 From Emmanuel Engelhart 2010-09-27 08:38:45 -------

If we don't implement that feature, how do we quickly display, or allow
seeking in, a 10MB video or audio file? How do we allow a download window to
appear immediately? These are concrete problems I have, as a user, with my ZIM
files.

The good news is that not all the articles are compressed... In fact, the
articles which need this quick random access (audio/video/...) are mostly not
compressed in the ZIM itself. So at least 90% of what I think users need would
be covered by such a feature for non-compressed content.
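
For non-compressed content, such a partial read could be as simple as a seek
followed by a bounded read. The following Python sketch is not the zimlib API:
`read_part`, `blob_offset` and `blob_size` are hypothetical names, and it
assumes the position and size of the uncompressed content inside the file are
already known.

```python
import io


def read_part(stream, blob_offset, blob_size, start, length):
    """Read at most `length` bytes, `start` bytes into an uncompressed blob."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    if start >= blob_size:
        return b""
    # Never read past the end of the blob into the following data.
    length = min(length, blob_size - start)
    stream.seek(blob_offset + start)
    return stream.read(length)


# Hypothetical file layout: a 10-byte blob surrounded by other data.
demo = io.BytesIO(b"HEADER" + b"0123456789" + b"TRAILER")
middle = read_part(demo, blob_offset=6, blob_size=10, start=3, length=4)
tail = read_part(demo, blob_offset=6, blob_size=10, start=8, length=100)
```

Only the requested bytes are read, so memory use and latency no longer depend
on the total size of the content.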

For the compressed articles, the problem occurs only with big articles, which
take a long time to uncompress and to download... and you have already
described the solution for such cases.

/* This bug was migrated from the previous openZIM bug tracker */
