Last modified: 2014-03-06 13:43:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T64288, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 62288 - Random API issues - authenticated user seems unauthenticated randomly
Random API issues - authenticated user seems unauthenticated randomly
Status: RESOLVED DUPLICATE of bug 62244
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
wmf-deployment
All All
: Unprioritized major (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-03-05 23:32 UTC by Damian Z
Modified: 2014-03-06 13:43 UTC (History)
10 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Damian Z 2014-03-05 23:32:53 UTC
I have 2 bots - ClueBot NG and ClueBot III - both using the same php class. These have previously run fine in pmtpa tools and where running fine in eqiad tools.

Since earlier today, api requests using a session id seem to fail a random amount of time - CBNG is missing maybe 20% of edits, III about 80%.

Tracing out the CURL requests - the session is kept, a successful login is done, but tokens are missing/the current user looks logged out.

A sample request:
* About to connect() to en.wikipedia.org port 80 (#0)
*   Trying 208.80.154.224... * connected
> POST /w/api.php?action=login&format=php HTTP/1.1
User-Agent: ClueBot/2.0
Host: en.wikipedia.org
Accept: */*
Accept-Encoding: deflate, gzip
Content-Length: 268
Content-Type: multipart/form-data; boundary=----------------------------0882348d6018

< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2852251556, 1052575718
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 162
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:10 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1066 miss (0), cp1068 frontend miss (0)
* Added cookie enwikiSession="< REMOVED >" for domain en.wikipedia.org, path /, expire 0
< Set-Cookie: enwikiSession=< REMOVED >; path=/; HttpOnly; GeoIP=::::v4; path=/
< 
* Closing connection #0
POST: http://en.wikipedia.org/w/api.php?action=login&format=php (0.065814018249512 s) (194 b)
Array
(
    [login] => Array
        (
            [result] => NeedToken
            [token] => < REMOVED >
            [cookieprefix] => enwiki
            [sessionid] => < REMOVED >
        )

)
* About to connect() to en.wikipedia.org port 80 (#0)
*   Trying 208.80.154.224... * connected
> POST /w/api.php?action=login&format=php HTTP/1.1
User-Agent: ClueBot/2.0
Host: en.wikipedia.org
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: enwikiSession=< REMOVED >
Content-Length: 396
Content-Type: multipart/form-data; boundary=----------------------------ce92f93b040d

< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< P3P: CP="This is not a P3P policy! See http://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2852251622, 1518781136
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 205
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:10 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1066 miss (0), cp1067 frontend miss (0)
* Added cookie centralauth_User="ClueBot+III" for domain wikipedia.org, path /, expire 1396654030
< Set-Cookie: centralauth_User=ClueBot+III; expires=Fri, 04-Apr-2014 23:27:10 GMT; path=/; domain=.wikipedia.org; httponly; GeoIP=::::v4; path=/
< 
* Closing connection #0
POST: http://en.wikipedia.org/w/api.php?action=login&format=php (0.11309099197388 s) (256 b)
Array
(
    [login] => Array
        (
            [result] => Success
            [lguserid] => < REMOVED >
            [lgusername] => ClueBot III
            [lgtoken] => < REMOVED >
            [cookieprefix] => enwiki
            [sessionid] => < REMOVED >
        )

)
* About to connect() to en.wikipedia.org port 80 (#0)
*   Trying 208.80.154.224... * connected
> GET /w/api.php?action=query&prop=revisions&titles=User%3ADamianZaremba&rvlimit=1&rvprop=timestamp|ids|user|comment&format=php&meta=userinfo&rvdir=older HTTP/1.1
User-Agent: ClueBot/2.0
Host: en.wikipedia.org
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: centralauth_User=ClueBot+III; enwikiSession=< REMOVED >

< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2479625599, 1518781362
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 277
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:11 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1053 miss (0), cp1067 frontend miss (0)
* Added cookie GeoIP="::::v4" for domain en.wikipedia.org, path /, expire 0
< Set-Cookie: GeoIP=::::v4; path=/
< 
* Closing connection #0
GET: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=User%3ADamianZaremba&rvlimit=1&rvprop=timestamp|ids|user|comment&format=php&meta=userinfo&rvdir=older (0.083457946777344 s) (475 b) (200 code)
Array
(
    [0] => Array
        (
            [revid] => 510098058
            [parentid] => 509798527
            [user] => DamianZaremba
            [timestamp] => 2012-08-31T11:46:49Z
            [comment] => 
        )

    [ns] => 2
    [title] => User:DamianZaremba
    [currentuser] => 10.68.16.7
    [continue] => 
    [pageid] => 29562889
)


All the values and return looks good - but the current user is anon and any actions using tokens or doing privileged actions fail.

This is the same across multiple bot users, all hitting the enwiki api from tools in eqiad.

Also tested from pmtpa to rule out migration issues and this issue is still present - it was not last week.

Possibly related: The internal labs IPs are now the anon ips, rather than the external - I believe it use to be the external, but apparently it shouldn't matter.

The varnish header suggests this is getting past the cache ok and fails 'randomly', so I'm having a hard time tracking down the issue.

Others have mentioned it as per http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Possible_bot_malfunctioning.3F and other 'lost' sessions.

Maybe some migration related issue outstanding?

Hopefully others can add more clarity.
Comment 1 Brad Jorsch 2014-03-06 03:43:37 UTC
> < HTTP/1.1 200 OK
> < Server: Apache
> < X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
> < X-Content-Type-Options: nosniff
> < Cache-control: private
> < P3P: CP="This is not a P3P policy! See
> http://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."
> < X-Frame-Options: SAMEORIGIN
> < Content-Encoding: gzip
> < Vary: Accept-Encoding
> < X-Vary-Options: Accept-Encoding;list-contains=gzip
> < Content-Type: application/vnd.php.serialized; charset=utf-8
> < X-Varnish: 2852251622, 1518781136
> < Via: 1.1 varnish, 1.1 varnish
> < Content-Length: 205
> < Accept-Ranges: bytes
> < Date: Wed, 05 Mar 2014 23:27:10 GMT
> < Age: 0
> < Connection: keep-alive
> < X-Cache: cp1066 miss (0), cp1067 frontend miss (0)
> * Added cookie centralauth_User="ClueBot+III" for domain wikipedia.org, path
> /, expire 1396654030
> < Set-Cookie: centralauth_User=ClueBot+III; expires=Fri, 04-Apr-2014
> 23:27:10 GMT; path=/; domain=.wikipedia.org; httponly; GeoIP=::::v4; path=/
> < 
> * Closing connection #0

There should be additional cookies being set here: centralauth_Token, centralauth_Session, probably forceHTTPS, and also enwikiUserID and enwikiUserName.

The one Set-Cookie header here looks weird too. Why the additional "; GeoIP=::::v4; path=/" after the cookie?

I bet this has something to do with the new GeoIP stuff they did recently, it smells like their code is screwing up any existing Set-Cookie headers in the response.
Comment 2 Ori Livneh 2014-03-06 08:42:55 UTC
What is wrong with the way the GeoIP cookie is added to the set-cookie header in the example above?
Comment 3 Antoine "hashar" Musso (WMF) 2014-03-06 09:52:11 UTC
We have the exact same issue on the beta cluster, logged as bug 62244. Copy pasting from there:

-----------------
On deployment-cache-text1.pmtpa.wmflabs I edited /etc/varnish/text-frontend.inc.vcl and commented out the GeoIP snippet added by change 113935:

https://gerrit.wikimedia.org/r/#/c/113935/6/templates/varnish/text-frontend.inc.vcl.erb,unified ) 

Restarted varnish with:

 /etc/init.d/varnish restart ; /etc/init.d/varnish-frontend restart

I then ran Chris script and I received three cookies:


< Set-Cookie: centralauth_Session=3428f185da995eaaf9550ae54b4e634f; path=/; httponly
< Set-Cookie: centralauth_User=TestBug16864; expires=Sat, 05-Apr-2014 09:34:59 GMT; path=/; httponly
< Set-Cookie: centralauth_Token=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; httponly
< Location: http://en.wikipedia.beta.wmflabs.org/wiki/Special:CentralLogin/complete?token=ef3afb1dbc82549c1655b6841d8614ae
< Age: 0
< X-Cache: deployment-cache-text1 miss (0), deployment-cache-text1 frontend miss (0)
< Cache-Control: private, s-maxage=0, max-age=0, must-revalidate

-----------------


Your bug report seems to indicate that happens in production as well.
Comment 4 Antoine "hashar" Musso (WMF) 2014-03-06 10:00:52 UTC
This might have been fixed by Faidon with: https://gerrit.wikimedia.org/r/#/c/117004/ which "Disable cookie-based geolocation on text varnishes".


Can you please try again?
Comment 5 Brad Jorsch 2014-03-06 12:04:04 UTC
(In reply to Ori Livneh from comment #2)
> What is wrong with the way the GeoIP cookie is added to the set-cookie
> header in the example above?

You mean for example in this header:

 Set-Cookie: centralauth_User=ClueBot+III; expires=Fri, 04-Apr-2014 23:27:10 GMT; path=/; domain=.wikipedia.org; httponly; GeoIP=::::v4; path=/

It's not adding a GeoIP cookie at all, according to RFC 6265. It's adding to extra parameters to the end of the centralauth_User cookie.

The more important issue is that the other six Set-Cookie headers that should have been in the response are entirely missing.
Comment 6 Brad Jorsch 2014-03-06 12:07:03 UTC
(In reply to Antoine "hashar" Musso from comment #4)
> This might have been fixed by Faidon with:
> https://gerrit.wikimedia.org/r/#/c/117004/ which "Disable cookie-based
> geolocation on text varnishes".
> 
> 
> Can you please try again?

The issue is currently not occurring with the same bot script that was showing the issue last night. I observe that the Set-Cookie headers do not have "; GeoIP=::::v4; path=/" appended.
Comment 7 Damian Z 2014-03-06 13:40:51 UTC
I can confirm that change being pulled out has fixed the bots. Been happily editing ever since and upon restart get all the cookies etc set properly as described above.
Comment 8 Antoine "hashar" Musso (WMF) 2014-03-06 13:43:20 UTC
This was fixed around 9:30am UTC by Faidon who disabled a Varnish feature deployed yesterday. Details at https://bugzilla.wikimedia.org/show_bug.cgi?id=62244#c19

I am marking this bug as a dupe of bug 62244 since that is where we have the technical details and script to reproduce the issue.

*** This bug has been marked as a duplicate of bug 62244 ***

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links