Last modified: 2014-03-14 12:04:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T46097, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 44097 - make bits.wikimedia.org work with second level domains / A records.
Status: NEW
Product: Wikimedia
Classification: Unclassified
Component: Apache configuration
Version: wmf-deployment
Hardware: All All
Importance: Normal minor
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 41847
Reported: 2013-01-18 10:40 UTC by Daniel Kinzler
Modified: 2014-03-14 12:04 UTC

Description Daniel Kinzler 2013-01-18 10:40:03 UTC
When trying to make wikidata.org work without the leading www (bug 41847), it was discovered that bits.wikimedia.org only works with CNAME domains. The relevant comment (<https://bugzilla.wikimedia.org/show_bug.cgi?id=41847#c12>): 

> > We can't _not_ have www because of bits.
> 
> Can you explain further? Plenty of wikis don't use "www"
> (commons.wikimedia.org, wikimediafoundation.org, etc.). I don't understand
> the issue.

<mutante> we cant NOT have www due to the way bits works
<Reedy_> why? :/
<mutante> due to the way bits works and geolocation
<mutante> it needs a CNAME .. and NOT an A record
<mutante> but wikidata.org is an A record
<mutante> wikimediafoundation.org only works because it is not balanced between data centers at all
Comment 1 Roan Kattouw 2013-01-18 10:54:55 UTC
There are two separate issues here:

* "due to the way bits works"
** I don't know what mutante is referring to there
* "geolocation"
** we use CNAMEs to do geolocation, and wikidata.org can't be a CNAME, so second-level domains cannot be geographically load-balanced

As an illustration, en.wikipedia.org resolves in a chain, as follows:
$ host en.wikipedia.org
en.wikipedia.org is an alias for wikipedia-lb.wikimedia.org.
wikipedia-lb.wikimedia.org is an alias for wikipedia-lb.eqiad.wikimedia.org.
wikipedia-lb.eqiad.wikimedia.org has address 208.80.154.225
wikipedia-lb.eqiad.wikimedia.org has IPv6 address 2620:0:861:ed1a::1

Note that wikipedia-lb.wikimedia.org is a CNAME pointing to wikipedia-lb.eqiad.wikimedia.org (eqiad is the Virginia datacenter). The target of this CNAME will vary depending on where you are located. Running the same command from Europe, the same CNAME points to esams (Amsterdam) instead:
$ host en.wikipedia.org
en.wikipedia.org is an alias for wikipedia-lb.wikimedia.org.
wikipedia-lb.wikimedia.org is an alias for wikipedia-lb.esams.wikimedia.org.
wikipedia-lb.esams.wikimedia.org has address 91.198.174.225
wikipedia-lb.esams.wikimedia.org has IPv6 address 2620:0:862:ed1a::1

This is how we do geographic load balancing: the CNAMEs wiki$project-lb.wikimedia.org point to wiki$project-lb.$location.wikimedia.org, where $location depends on who's asking and where we believe they are.
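
As a rough illustration of that scheme, here is a toy Python model of a location-dependent CNAME chain. The record data is copied from the `host` output in this comment; the `resolve` function, the table names and the region labels "us" and "eu" are assumptions made up for this sketch, not the actual DNS server configuration.

```python
# Toy model of CNAME-based geographic load balancing (illustrative only).
# Record values come from the `host` output in this comment; the region
# labels "us"/"eu" and all names below are hypothetical.

# CNAMEs whose target depends on where the client appears to be.
GEO_CNAMES = {
    "wikipedia-lb.wikimedia.org": {
        "us": "wikipedia-lb.eqiad.wikimedia.org",
        "eu": "wikipedia-lb.esams.wikimedia.org",
    },
}

# Ordinary CNAMEs: the same target for every client.
STATIC_CNAMES = {
    "en.wikipedia.org": "wikipedia-lb.wikimedia.org",
}

# Plain address records: the same address for every client.
A_RECORDS = {
    "wikipedia-lb.eqiad.wikimedia.org": "208.80.154.225",
    "wikipedia-lb.esams.wikimedia.org": "91.198.174.225",
    "wikidata.org": "208.80.152.218",
}


def resolve(name, region, max_hops=8):
    """Follow CNAMEs until an A record is found; return (chain, address)."""
    chain = [name]
    for _ in range(max_hops):
        if name in A_RECORDS:
            return chain, A_RECORDS[name]
        if name in GEO_CNAMES:
            name = GEO_CNAMES[name][region]
        elif name in STATIC_CNAMES:
            name = STATIC_CNAMES[name]
        else:
            raise LookupError("no record for " + name)
        chain.append(name)
    raise LookupError("CNAME chain too long")


if __name__ == "__main__":
    for region in ("us", "eu"):
        print(region, resolve("en.wikipedia.org", region))
        print(region, resolve("wikidata.org", region))
```

In this model en.wikipedia.org ends up at a different address per region because its chain passes through a location-dependent CNAME, while the bare wikidata.org has only an A record and so gives every region the same address.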

But to make this work, the domain names used by humans (like en.wikipedia.org) have to point to one of these -lb.wikimedia.org CNAMEs, which means they themselves have to be CNAMEs. And second-level domains like wikidata.org cannot be CNAMEs, so this strategy falls apart:

$ host www.wikidata.org
www.wikidata.org is an alias for wikidata-lb.wikimedia.org.
wikidata-lb.wikimedia.org is an alias for wikidata-lb.eqiad.wikimedia.org.
wikidata-lb.eqiad.wikimedia.org has address 208.80.154.242
wikidata-lb.eqiad.wikimedia.org has IPv6 address 2620:0:861:ed1a::12

$ host wikidata.org
wikidata.org has address 208.80.152.218

I'm not sure whether it's possible to do geographic load balancing for second-level domains with some other approach that doesn't use CNAMEs, but our current approach won't work in this case.
Comment 2 MZMcBride 2013-01-18 14:54:01 UTC
Something is simply amiss here. There's no reason I can see that www.mediawiki.org/mediawiki.org would work fine, but that www.wikidata.org/wikidata.org would not work fine. The canonical URL form should be irrelevant.

I've asked Daniel K. to investigate the response codes at bug 41847 comment 16, as the response codes are currently fucking up my testing. Once the response codes are sane (across Wikimedia wikis), we can debug this bug and bug 41847 further.
Comment 3 Daniel Kinzler 2013-01-18 16:32:40 UTC
(In reply to comment #1)
> But to make this work, the domain names used by humans (like en.wikipedia.org)
> have to point to one of these -lb.wikimedia.org CNAMEs, which means they
> themselves have to be CNAMEs. And second-level domains like wikidata.org cannot
> be CNAMEs, so this strategy falls apart:

Why can't second-level domains be CNAMEs? I couldn't find anything in the RFC that supports this. If there's a CNAME for a domain, there should be no other DNS records for it - true. Is that the problem? That for second-level domains, there are always other DNS records, because of how the registrar handles things?
Comment 4 Kevin Israel (PleaseStand) 2013-02-03 04:41:38 UTC
(In reply to comment #3)
> If there's a CNAME for a domain, there should be no
> other DNS records for it - true. Is that the problem? That for second level
> domains, there are always other DNS records, because of how the registrar
> handles things?

Well, you do want www.wikidata.org to work (not just wikidata.org), right? So you would need an NS record for wikidata.org, which can't be there along with a CNAME.
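
The constraint can be sketched as a small zone check in Python. It assumes the usual DNS rule that a name holding a CNAME should hold no other data (RFC 1034, section 3.6.2), and that the top of a zone carries SOA and NS records. The zone contents and the `cname_conflicts` helper below are hypothetical, for illustration only.

```python
# Sketch: find names where a CNAME coexists with other record types.
# The zones below are made-up examples, not the real wikidata.org zone data.

def cname_conflicts(zone):
    """Return the sorted names that hold a CNAME alongside any other type."""
    conflicts = []
    for name, record_types in zone.items():
        types = set(record_types)
        if "CNAME" in types and len(types) > 1:
            conflicts.append(name)
    return sorted(conflicts)


# The bare domain is the top of its zone, so it carries SOA and NS records;
# an A record can sit next to them, and www is free to be a CNAME.
zone_ok = {
    "wikidata.org": ["SOA", "NS", "A"],
    "www.wikidata.org": ["CNAME"],
}

# Swapping the A record for a CNAME puts it next to SOA and NS: a conflict.
zone_bad = {
    "wikidata.org": ["SOA", "NS", "CNAME"],
    "www.wikidata.org": ["CNAME"],
}

if __name__ == "__main__":
    print(cname_conflicts(zone_ok))
    print(cname_conflicts(zone_bad))
```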
Comment 5 Andre Klapper 2013-10-31 12:17:27 UTC
[replacing wikidata keyword by adding CC - see bug 56417]
