Last modified: 2014-11-17 09:21:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T42779, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 40779 - Seeking in video files (range requests) does not work
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: High major with 1 vote
Target Milestone: ---
Assigned To: Mark Bergsma
Depends on: 39016
Reported: 2012-10-04 21:52 UTC by Erik Moeller
Modified: 2014-11-17 09:21 UTC (History)

Description Erik Moeller 2012-10-04 21:52:23 UTC
As a follow-up to bug 36577, now that Varnish has replaced Squid for upload.wikimedia.org, we have another remaining major issue. Seeking in video files no longer works, i.e. the browser does not indicate the length of the video or make seeking features available (tested w/ Firefox). This appears to apply to all videos in production right now, and is apparently due to lack of support for range requests.

This is a major regression and we should (again) switch back if we can't fix it within reasonable time.

To compare Squid/Varnish behavior, add the IP of upload-lb.pmtpa.wikimedia.org to your /etc/hosts file as the IP for upload.wikimedia.org. Example with the current IPv4 address:

208.80.152.211  upload.wikimedia.org
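
As an alternative to editing /etc/hosts, the same comparison could be scripted by connecting to a specific IP and sending the Host header explicitly. The following is only a minimal sketch, assuming Python 3; `probe_range` and `classify_range_response` are hypothetical helper names, and the path passed to `probe_range` has to be a real file path on upload.wikimedia.org.

```python
import http.client


def classify_range_response(status, headers):
    """Classify the response to a `Range: bytes=0-0` request.

    `headers` is a dict whose keys are lower-cased header names.
    """
    if status == 206 and "content-range" in headers:
        # Partial Content: the server honoured the Range header.
        return "range-supported"
    if status == 200:
        # The server ignored the Range header and sent the whole object.
        return "range-ignored"
    return "unexpected"


def probe_range(ip, path, host="upload.wikimedia.org"):
    """Send a one-byte Range request to `ip` while presenting `host`."""
    conn = http.client.HTTPConnection(ip, 80, timeout=10)
    try:
        # Passing Host explicitly stops http.client from adding its own.
        conn.request("GET", path, headers={"Host": host, "Range": "bytes=0-0"})
        resp = conn.getresponse()
        headers = {name.lower(): value for name, value in resp.getheaders()}
        return classify_range_response(resp.status, headers)
    finally:
        conn.close()
```

Calling `probe_range` once with the Squid address and once with the Varnish address shows whether each cache answers with 206 or falls back to a full 200 response.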
Comment 1 Rob Lanphier 2012-10-11 23:49:28 UTC
Assigning to Aaron, with the specific task of adding MediaWiki support for having different caching servers based on whether or not streaming support is needed (in our case, probably Varnish for non-streaming, and Squid for streaming).
Comment 2 Erik Moeller 2012-10-11 23:53:54 UTC
Note there's also bug 39016, which suggests to distinguish by file _size_. Perhaps that one should be closed as invalid since Varnish isn't a good option even for small files without range support.
Comment 3 Rob Lanphier 2012-10-12 00:20:28 UTC
(In reply to comment #2)
> Note there's also bug 39016, which suggests to distinguish by file _size_.
> Perhaps that one should be closed as invalid since Varnish isn't a good option
> even for small files without range support.

Heh, oh yeah!  I think that one is close enough that we can use it as a more specific tracker.
Comment 4 Aaron Schulz 2012-10-12 05:02:34 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Note there's also bug 39016, which suggests to distinguish by file _size_.
> > Perhaps that one should be closed as invalid since Varnish isn't a good option
> > even for small files without range support.
> 
> Heh, oh yeah!  I think that one is close enough that we can use it as a more
> specific tracker.

I think he was saying we should close that and instead make the URL based on the type of file perhaps. It makes the most sense for timed formats.
Comment 5 Mark Bergsma 2012-10-17 15:52:17 UTC
Here is my attempt to add Range support to Varnish, when it's streaming files:

https://gerrit.wikimedia.org/r/#/c/28361/1

I think this approach will work, but I just heard there's been an independent attempt to add this as well, so I'll sync up with that first. Either way, we should be able to get Varnish to behave like Squid does now... that is not optimal, but at least it won't be worse either.

Despite all this, if we want to be serious about video support, we'll need to setup some entirely different infrastructure, probably using http://en.wikipedia.org/wiki/HTTP_Live_Streaming 

Using the current static file caching infrastructure is just not a good way to handle large videos, especially not when they become popular.
Comment 6 Rob Lanphier 2012-10-18 01:40:49 UTC
(In reply to comment #5)
> Here is my attempt to add Range support to Varnish, when it's streaming files:
> https://gerrit.wikimedia.org/r/#/c/28361/1

Cool!

> Despite all this, if we want to be serious about video support, we'll need to
> setup some entirely different infrastructure, probably using
> http://en.wikipedia.org/wiki/HTTP_Live_Streaming 

HTTP Live Streaming is not for streaming static files.  It's for live audio/video broadcasting, where there's no "start" and "end" to the file for caching purposes.  It *might* simplify caching, but it will make the client and server way more complicated than they probably need to be.

I think we're best off pushing for increasingly better byte range support in Varnish.  Alternatively, we could beef up our Swift (or whatever) infrastructure so that it can deal with the load without having a caching proxy in front.
Comment 7 Mark Bergsma 2012-10-19 15:04:25 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Here is my attempt to add Range support to Varnish, when it's streaming files:
> > https://gerrit.wikimedia.org/r/#/c/28361/1
> 
> Cool!
> 
> > Despite all this, if we want to be serious about video support, we'll need to
> > setup some entirely different infrastructure, probably using
> > http://en.wikipedia.org/wiki/HTTP_Live_Streaming 
> 
> HTTP Live Streaming is not for streaming static files.  It's for live
> audio/video broadcasting, where there's no "start" and "end" to the file for
> caching purposes.  It *might* simplify caching, but it will make the client and
> server way more complicated than they probably need to be.

Sure, but it can of course also be used when start/end are available.
 
> I think we're best off pushing for increasingly better byte range support in
> Varnish.  Alternatively, we could beef up our Swift (or whatever)
> infrastructure so that it can deal with the load without having a caching proxy
> in front.

Well, neither Squid nor Varnish supports partial object caching, which is what's really needed to make this work and perform well. Since it's complex, it won't happen anytime soon in Varnish, and the fact that it appears to be patented doesn't help either.

In any case, I've got my implementation of Range requests with streaming working in Varnish now:

https://gerrit.wikimedia.org/r/#/c/28379/2

So that will allow us to make Varnish work like Squid now. We'll see how this goes.
Comment 8 Mark Bergsma 2012-10-22 15:57:21 UTC
I've deployed the range-requests-with-streaming work on all Varnish upload caches, and it appears to work well so far. On a miss, delivery of the object starts as soon as the first byte of the requested range has been retrieved.
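
The delivery behaviour described here can be illustrated outside Varnish. This is a rough sketch only, not the actual patch: `stream_range` is a hypothetical function that takes chunks as they arrive from a backend and yields the requested byte range without waiting for the whole object.

```python
def stream_range(chunks, start, end):
    """Yield bytes start..end (inclusive) from an iterable of byte chunks.

    Output begins as soon as the chunk containing `start` arrives, and
    no further chunks are consumed once `end` has been delivered.
    """
    offset = 0  # absolute offset of the first byte of the current chunk
    for chunk in chunks:
        chunk_end = offset + len(chunk)  # exclusive end of this chunk
        if chunk_end > start:
            lo = max(start - offset, 0)
            hi = min(end + 1 - offset, len(chunk))
            if lo < hi:
                yield chunk[lo:hi]
        offset = chunk_end
        if offset > end:
            break
```

For example, `b"".join(stream_range([b"abcd", b"efgh", b"ijkl"], 2, 9))` gives `b"cdefghij"`.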
Comment 9 Mark Bergsma 2012-10-23 13:54:06 UTC
It was reported that seeking still wasn't working, because Firefox didn't get (and display) the video length.

I found an additional problem: due to our 2-tier Varnish setup (frontend/backend instances), Varnish was sending some headers in duplicate because it's not filtering them properly. These included Content-Length, Accept-Ranges, Age and Via.

I first suspected that the duplicate Content-Length header was the problem, and fixed that in Varnish (it only happened while streaming):

https://gerrit.wikimedia.org/r/#/c/29565/

But this didn't fix the issue with Firefox. I've now added some hacky VCL code to remove the duplicate Accept-Ranges header, and this turned out to be the culprit.

It's fixed by this workaround for now, but the proper fix would be to make Varnish filter headers appropriately. The remaining duplicates don't seem to cause issues right now, but it's hard to know for sure.
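
Duplicate headers of this kind are easy to miss by eye. As an illustration only (this is not the VCL workaround itself), a short Python sketch can list repeated header names in a response and drop the repeats; `duplicate_headers` and `keep_first` are hypothetical names, and the input is a list of (name, value) pairs such as the one returned by `http.client.HTTPResponse.getheaders()`.

```python
from collections import Counter


def duplicate_headers(header_pairs):
    """Return the sorted, lower-cased names of headers that occur more than once."""
    counts = Counter(name.lower() for name, _ in header_pairs)
    return sorted(name for name, n in counts.items() if n > 1)


def keep_first(header_pairs, names):
    """Drop repeated occurrences of the given header names, keeping the first."""
    wanted = {n.lower() for n in names}
    seen = set()
    result = []
    for name, value in header_pairs:
        key = name.lower()
        if key in wanted and key in seen:
            continue  # a repeat of a header we were asked to de-duplicate
        seen.add(key)
        result.append((name, value))
    return result
```

Header names are compared case-insensitively, since HTTP header field names are case-insensitive.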
Comment 10 Mark Bergsma 2012-10-23 16:30:42 UTC
I've now fixed this properly inside Varnish itself, by filtering out headers at the deliver stage, before it adds them unconditionally. The VCL hack is no longer needed.

Note that https://bugzilla.wikimedia.org/show_bug.cgi?id=41304 will help eliminate the need for a lot of Range requests as well, and improve performance a lot.
Comment 11 Mark Bergsma 2012-10-24 11:45:58 UTC
I just fixed another issue. A typical request pattern for Firefox is:

#1 Request the entire video object (either as a normal request or as a Range: bytes=0- request).
#2 Using the Content-Length information from #1, retrieve the end of the video to determine the video duration, as a Range request.

If #1 is a miss on the backend Varnish instance, it will start fetching the entire file. On a big file this can take a while, say 30s or more. If #2 comes in before the requested range is available, it would queue up and wait for #1 and thus also wait 30s.

I've now set req.hash_ignore_busy for high range requests, so cache hits continue to work, but cache hits on *busy* objects (i.e. objects currently being retrieved) do not. Thus, high range requests will go straight to Swift.
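
The decision described above can be sketched in Python for illustration. This is not the deployed VCL: the comment does not say what counts as a "high" range, so `HIGH_RANGE_THRESHOLD` is an assumed value, and `range_start` and `should_ignore_busy` are hypothetical helper names.

```python
import re

# Assumption: the real cut-off used on the upload caches is not stated here.
HIGH_RANGE_THRESHOLD = 32 * 1024 * 1024

# Matches simple single ranges such as "bytes=0-" or "bytes=100-199".
# Suffix ranges ("bytes=-500") and multi-range requests are not handled.
_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d*)$")


def range_start(range_header):
    """Return the start offset of a simple Range header, or None."""
    if not range_header:
        return None
    match = _RANGE_RE.match(range_header.strip())
    return int(match.group(1)) if match else None


def should_ignore_busy(range_header, threshold=HIGH_RANGE_THRESHOLD):
    """True if the request should skip busy (in-flight) cache objects.

    Mirrors the idea of setting req.hash_ignore_busy only for requests
    whose range starts far into the file.
    """
    start = range_start(range_header)
    return start is not None and start >= threshold
```

With this split, request #1 (`bytes=0-` or no Range header) still waits on or joins the in-flight fetch, while request #2 for the tail of a large file is sent on to the backend store instead of queuing behind #1.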
Comment 12 Rob Lanphier 2012-11-02 22:21:14 UTC
This one seems to be fixed.  Bug 41304 still has a couple of patches pending review (X-Content-Duration), and I think there's likely to be some iteration on it, but the core issue of range support appears to be solved.  Thanks Mark!
