Last modified: 2012-02-27 19:51:36 UTC
I haven't found a reliable way to reproduce this, but every now and then I see the following on a mobile phone: "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice."
Is this specific to any particular device, or widespread?

Sounds like the error we give when no User-Agent is given on a request:

$ telnet en.m.wikipedia.org 80
Trying 208.80.154.236...
Connected to m.wikimedia.org.
Escape character is '^]'.
GET /wiki/Foobar HTTP/1.0
Host: en.m.wikipedia.org

HTTP/1.1 403 Forbidden
Server: Apache
X-Powered-By: PHP/5.2.4-2ubuntu5.12wm1
Vary: Accept-Encoding
X-Vary-Options: Accept-Encoding;list-contains=gzip
Content-Type: text/html
X-Varnish: 1435406991
Via: 1.1 varnish
X-Cache-Be: miss
X-Device: html
Date: Wed, 21 Sep 2011 18:45:39 GMT
X-Varnish: 1772645340
Age: 0
Via: 1.1 varnish
Connection: close
X-Cache-Fe: miss
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate

Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.
Connection closed by foreign host.

I think this is checked and enforced from our PHP-based config files for MediaWiki... if it's getting triggered, then perhaps either some requests have an empty or missing User-Agent header, or there's some bizarre cache collision with something that does...?
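For anyone who wants to repeat the telnet check from a script, here is a minimal Python sketch. The function names and the example User-Agent value are made up for illustration, and the servers may well respond differently today than in the transcript above, so treat it as a way to compare a request with and without the header rather than as a guaranteed reproduction.

```python
import socket
from typing import Optional


def build_request(host: str, path: str, user_agent: Optional[str] = None) -> bytes:
    """Build a bare HTTP/1.0 GET request, optionally with a User-Agent header."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}"]
    if user_agent is not None:
        lines.append(f"User-Agent: {user_agent}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_request(host: str, request: bytes, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send raw request bytes and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Usage would be along the lines of `send_request("en.m.wikipedia.org", build_request("en.m.wikipedia.org", "/wiki/Foobar"))`, then the same call with `user_agent="ExampleBot/0.1 (contact@example.org)"` (a hypothetical value) to compare the status lines.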
I've seen this both on the iOS simulator shipped with Xcode 4.1 and on a Nexus One (Android 2.3.4) running the native browser.
That you get it in the iOS simulator is good news -- you should be able to run Wireshark or something and just capture traffic until it happens, then you can at least check what the actual HTTP request & response are (at least at the client's end!)
If you try to load http://DE.M.WIKIPEDIA.ORG (note the all-uppercase hostname) on an Android-based device, the request gives the unknown wiki response. But if you then navigate to http://de.m.wikipedia.org, the cached response displays the error message "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice."
Are we not setting a legitimate user agent?
(In reply to comment #5)
> Are we not setting a legitimate user agent?

The user-agent should be sent from the mobile varnish server to the backend mediawiki server the same as it was set in the original request with no modifications.
The base of the 403 behavior is the same via varnish to en.m.wikipedia.org as it is with the squids via en.wikipedia.org. If something is in cache, it doesn't need a user-agent. If something isn't in cache and goes back to the apaches, you get a 403. Mobile or non-mobile, squid or varnish.

The problem behavior, though, was due to a combination of varnish caching the 403 response and the way we vary on device type using our internal X-Device header. No user-agent results in X-Device mapping to the default, which is 'html'. We vary on that, and any browser that also maps to the default device type is liable to get a cached 403 for that URL.

We could set X-Device to "empty" if there isn't a UA, which would result in cached 403s for non-UA requests but ensure that requests with a UA don't cache hit on that. Squid just doesn't cache that error, so I've made varnish replicate that, resolving the issue.

Verification:

meh:~ asher$ telnet en.m.wikipedia.org 80
Trying 208.80.154.236...
Connected to m.wikimedia.org.
Escape character is '^]'.
GET /wiki/CatsfdsfDfa HTTP/1.1
Host: en.m.wikipedia.org

HTTP/1.1 403 Forbidden
Server: Apache
X-Powered-By: PHP/5.3.2-2wm1
X-Content-Type-Options: nosniff
Vary: Accept-Encoding
X-Vary-Options: Accept-Encoding;list-contains=gzip
Content-Type: text/html
Accept-Ranges: bytes
X-Varnish: 358175687
Age: 0
Via: 1.1 varnish
X-Cache: miss (0)
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
X-Device: html
Content-Length: 120
Accept-Ranges: bytes
Date: Wed, 26 Oct 2011 00:55:48 GMT
X-Varnish: 1320645677
Age: 0
Via: 1.1 varnish
Connection: keep-alive
X-Cache-frontend: miss (0)

Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.

Same request, with a garbage UA that also maps to X-Device: html

meh:~ asher$ telnet en.m.wikipedia.org 80
Trying 208.80.154.236...
Connected to m.wikimedia.org.
Escape character is '^]'.
GET /wiki/CatsfdsfDfa HTTP/1.1
Host: en.m.wikipedia.org
User-Agent: bogus

HTTP/1.1 404 Not Found
Server: Apache
X-Powered-By: PHP/5.3.2-2wm1
X-Content-Type-Options: nosniff
Content-language: en
Vary: Accept-Encoding,Cookie,X-Device
X-Vary-Options: Accept-Encoding;list-contains=gzip,Cookie;string-contains=enwikiToken;string-contains=enwikiLoggedOut;string-contains=enwiki_session;string-contains=centralauth_Token;string-contains=centralauth_Session;string-contains=centralauth_LoggedOut,X-Device;
Content-Type: text/html; charset=UTF-8
Accept-Ranges: bytes
...
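To make the cache collision described above easier to see, here is a toy Python model of it. This is only an illustrative sketch, not the actual varnish configuration: the `Cache` class, the `cache_errors` flag, the `device_for` mapping and the "iphone" device class are all hypothetical, and the backend simply returns 200 for any request that carries a User-Agent.

```python
from typing import Dict, Optional, Tuple

DEFAULT_DEVICE = "html"
UA_ERROR = ("Scripts should use an informative User-Agent string with "
            "contact information, or they may be IP-blocked without notice.")

Response = Tuple[int, str]


def device_for(user_agent: Optional[str]) -> str:
    """Hypothetical, heavily simplified X-Device mapping.

    A missing or unrecognised User-Agent falls back to the default class.
    """
    if user_agent and "iPhone" in user_agent:
        return "iphone"  # made-up device class, for illustration only
    return DEFAULT_DEVICE


def backend(url: str, user_agent: Optional[str]) -> Response:
    """Stand-in for the apaches: reject requests that have no User-Agent."""
    if not user_agent:
        return 403, UA_ERROR
    return 200, f"content of {url}"


class Cache:
    """Toy cache keyed on (url, device class), mimicking a vary on X-Device."""

    def __init__(self, cache_errors: bool) -> None:
        self.cache_errors = cache_errors
        self.store: Dict[Tuple[str, str], Response] = {}

    def fetch(self, url: str, user_agent: Optional[str] = None) -> Response:
        key = (url, device_for(user_agent))
        if key in self.store:
            return self.store[key]
        response = backend(url, user_agent)
        # The fix modelled here: never store the 403, as squid does.
        if response[0] != 403 or self.cache_errors:
            self.store[key] = response
        return response
```

With `Cache(cache_errors=True)`, a request without a User-Agent followed by one with `User-Agent: bogus` for the same URL returns the stored 403 both times, because both map to 'html'. With `Cache(cache_errors=False)`, the second request reaches the backend and gets a normal response.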