Last modified: 2014-07-08 19:42:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T69625, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 67625 - DNS resolution on the virtual host is painfully slow
DNS resolution on the virtual host is painfully slow
Status: RESOLVED FIXED
Product: MediaWiki-Vagrant
Classification: Unclassified
General (Other open bugs)
unspecified
All All
: High normal
: ---
Assigned To: Nik Everett
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-07-07 20:25 UTC by Nik Everett
Modified: 2014-07-08 19:42 UTC (History)
8 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Nik Everett 2014-07-07 20:25:43 UTC
The CirrusSearch tests use http://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCirrusSearch.git/d71770acb38d10cc2ef482df1452d39e8be3dfa2/tests%2Fbrowser%2Farticles%2Frashidun_caliphate.txt as an example of a reasonably complex page.  They import the wikitext for the page without importing any of the templates or requiring scribunto, both of which would be required to get it to look right.  Anyway, the page takes many many seconds for hhvm to render in vagrant.  So many that the request times out.  Eventually enough of the page is cached and requests for it are fast.
Comment 1 Nik Everett 2014-07-08 15:37:54 UTC
I tried this again this morning and it seems like hhvm gets starved for threads somehow.  Other requests start failing when this happens.  At least, that is the case on vagrant.

Also, with the job queue bouncing against hhvm I think this can also cause some starvation.
Comment 2 Nik Everett 2014-07-08 17:41:34 UTC
I'm also not sure now if the large page is the trouble or the starvation caused by having the large page hit the job queue as well.
Comment 3 Nik Everett 2014-07-08 18:02:07 UTC
I added timeouts to the end of the access logs.  This look unhealthy:
 - - [08/Jul/2014:17:57:25 +0000] \”POST /w/index.php?title=Special%3ARunJobs&tasks=jobs&maxjobs=100&sigexpiry=1404841997&signature=83d9513fa827c7c762ba47fd28b9b2866266e8aa HTTP/1.1\” 202 - 252802314

That's 253ish seconds to run jobs.  I'm dropping my jobrunrate from the stupid high 100 to 5 to see if that helps.  I don't imagine it will, though, because I don't make that many jobs at once.

Also, this:
 - - [08/Jul/2014:17:59:34 +0000] \”POST /wiki/../w/api.php HTTP/1.1\” 200 116 125473169

That is posting my large page, I believe.  About 125 seconds.
Comment 4 Nik Everett 2014-07-08 18:08:02 UTC
Clicking the "edit" button on my large page took 28 seconds in browser but was logged as taking 8.6 in the access log....

Posting the page timed out at the apache level after two minutes and took 259 seconds for hhvm to finish.  Off to the profiler.
Comment 5 Nik Everett 2014-07-08 18:16:00 UTC
Actually, its caused by instant commons, I believe.  It looks like it tries to fetch everything as soon as the page is created.  Tries over and over again.
Comment 7 Nik Everett 2014-07-08 18:47:25 UTC
Its a vagrant problem.  Inside vagrant:

$ time curl google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

real	0m5.704s
user	0m0.001s
sys	0m0.030s
vagrant@mediawiki-vagrant:/srv/images$ 



Outside vagrant:
$ time curl google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

real	0m0.183s
user	0m0.014s
sys	0m0.009s
Comment 8 Chad H. 2014-07-08 18:50:05 UTC
Connection problems? Sounds like what I was hitting yesterday / this morning. Still don't have a solution short of reloading the VM and hoping.
Comment 9 Nik Everett 2014-07-08 18:58:04 UTC
Not connection problems - its dns resolution!  I have a fix but it is somewhat mysterious.
Comment 10 Gerrit Notification Bot 2014-07-08 19:02:06 UTC
Change 144738 had a related patch set uploaded by Manybubbles:
Speed up dns resolution in some cases

https://gerrit.wikimedia.org/r/144738
Comment 11 Bryan Davis 2014-07-08 19:30:54 UTC
Possibly related to https://www.virtualbox.org/ticket/11663 and a lot of other VirtualBox host-only netowrking problems listed as see-alsos on that bug.
Comment 12 Nik Everett 2014-07-08 19:32:41 UTC
(In reply to Bryan Davis from comment #11)
> Possibly related to https://www.virtualbox.org/ticket/11663 and a lot of
> other VirtualBox host-only netowrking problems listed as see-alsos on that
> bug.

Sounds likely.
Comment 13 Gerrit Notification Bot 2014-07-08 19:33:56 UTC
Change 144738 merged by jenkins-bot:
Speed up dns resolution in some cases

https://gerrit.wikimedia.org/r/144738
Comment 14 Bryan Davis 2014-07-08 19:42:36 UTC
Don't forget to restart the VirtualBox guest after applying the patch: `vagrant reload` works well.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links