Last modified: 2014-11-09 20:32:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T73121, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71121 - RepeatingGenerator intermittent failure on test.wikidata
Status: RESOLVED FIXED
Product: Pywikibot
Classification: Unclassified
Component: pagegenerators
Version: core-(2.0)
Hardware: All
OS: All
Priority: High
Severity: normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Depends on:
Blocks:

Reported: 2014-09-22 14:42 UTC by John Mark Vandenberg
Modified: 2014-11-09 20:32 UTC
CC: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description John Mark Vandenberg 2014-09-22 14:42:24 UTC
We have seen a few intermittent failures of RepeatingGenerator.  The most recent is on test.wikidata:

https://travis-ci.org/wikimedia/pywikibot-core/jobs/35913559

IIRC, the previous ones have also been on test.wikidata.

My guess is that there is insufficient recent change data on this wiki, and a problem in the looping logic causes the generator to loop indefinitely.
Comment 1 Sorawee Porncharoenwase 2014-09-22 16:20:40 UTC
The easiest workaround is to remove the test, and when we come up with a better test that guarantees that it won't cause a failure in any case, we can add it later.

Another workaround is to simulate a stream of recentchanges / newpages somehow to prevent insufficient recent change data.
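
For illustration, one hedged sketch of that "simulated stream" idea: a fake recentchanges source that never runs dry, which a test could consume instead of the live site. The function and field names below mirror typical recentchanges entries but are assumptions, not an actual fixture in the test suite.

    import itertools

    def fake_recentchanges(namespaces=None, total=None, **kwargs):
        """Yield an endless stream of synthetic mainspace changes."""
        for revid in itertools.count(1):
            # Each synthetic change looks roughly like an API recentchanges
            # entry; only the fields a generator is likely to need are here.
            yield {'revid': revid, 'ns': 0, 'title': 'Page %d' % revid}
            if total is not None and revid >= total:
                return
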
Comment 2 Sorawee Porncharoenwase 2014-09-22 16:22:23 UTC
If it's very urgent, you can remove the test right now. I can't code in the next few days.
Comment 3 John Mark Vandenberg 2014-10-05 04:33:40 UTC
It isn't urgent. It only happens occasionally, and only on one site. Another one today:
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37035200

(In reply to Sorawee Porncharoenwase from comment #1)
> The easiest workaround is to remove the test, and when we come up with a
> better test that guarantees that it won't cause a failure in any case, we
> can add it later.
> 
> Another workaround is to simulate a stream of recentchanges / newpages
> somehow to prevent insufficient recent change data.

The best 'quick' way to do this is to move the test into a new class, and then in setUpClass skip the tests if there are not sufficient suitable recentchanges / newpages for the test to run against the live wiki.
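
A minimal sketch of that setUpClass approach, assuming a configured pywikibot installation and the standard Site/recentchanges API; the class name, threshold of four edits, and test method are illustrative, not the actual test code:

    import unittest

    import pywikibot

    class TestRepeatingGeneratorLive(unittest.TestCase):

        """RepeatingGenerator tests that need live recent changes."""

        @classmethod
        def setUpClass(cls):
            cls.site = pywikibot.Site('en', 'wikipedia')
            # Skip the whole class if the wiki does not have enough recent
            # mainspace edits for the generator to consume.
            recent = list(cls.site.recentchanges(namespaces=[0], total=4))
            if len(recent) < 4:
                raise unittest.SkipTest(
                    'Not enough recent changes on %s' % cls.site)

        def test_repeating_generator(self):
            pass  # the existing test body would move here

As comment 4 points out below, a check like this only confirms past activity; it cannot guarantee that new edits will arrive while the test runs.
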
Comment 4 Sorawee Porncharoenwase 2014-10-25 18:26:51 UTC
@John Mark Vandenberg: class TestPageGenerators runs with family = 'wikipedia', code = 'en', doesn't it? Why should it run on test.wikidata?

Anyway, suppose that this bug really needs to be fixed:

(In reply to John Mark Vandenberg from comment #3)
>
> The best 'quick' way to do this is to move the test into a new class, and
> then in setUpClass skip the tests if there are not sufficient suitable
> recentchanges / newpages for the test to run against the live wiki.

This is impossible because we don't know the future!

One way I can think of is that we might use `multiprocessing` to run the test while the main process waits for, say, fifteen seconds. If the process has not finished by that time, we terminate it and assume that it works correctly.
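
A rough sketch of that multiprocessing workaround, purely for illustration; the helper names are hypothetical and the fifteen-second limit is taken from the comment above, not from the test suite:

    import multiprocessing

    def _consume_generator(result_queue):
        # Placeholder for the real test body: consume a few items from the
        # generator under test, then report success.
        result_queue.put(True)

    def run_with_timeout(timeout=15):
        queue = multiprocessing.Queue()
        proc = multiprocessing.Process(target=_consume_generator, args=(queue,))
        proc.start()
        proc.join(timeout)
        if proc.is_alive():
            # Still running after the timeout: terminate it and, as proposed
            # above, assume the generator itself is working correctly.
            proc.terminate()
            proc.join()
            return True
        return not queue.empty() and queue.get()
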
Comment 5 John Mark Vandenberg 2014-10-26 04:42:50 UTC
(In reply to Sorawee Porncharoenwase from comment #4)
> @John Mark Vandenberg: class TestPageGenerators runs with family =
> 'wikipedia', code = 'en', doesn't it? Why should it run on test.wikidata?

You're right; the test is supposed to be running against the site en.wikipedia.org, but a bug somewhere in pywikibot could mean that doesn't happen.

I find it hard to believe that en.wp doesn't have four namespace 0 edits on recentchanges for 10 minutes.  In fact, 10 mins shouldn't even be required.  If I understand correctly, this test is essentially asking the RC feed for four namespace 0 edits, any time in the past.  This should always be an instant result.
Comment 6 Sorawee Porncharoenwase 2014-10-26 05:23:52 UTC
(In reply to John Mark Vandenberg from comment #5)
> recentchanges for 10 minutes.  In fact, 10 mins shouldn't even be required.
> If I understand correctly, this test is essentially asking the RC feed for
> four namespace 0 edits, any time in the past.  This should always be an
> instant result.

This is wrong. RepeatingGenerator will ask the RC feed for the latest edit in the past (to be an indicator of "present") and three more edits in the future. Thus, it might not return an instant result for some sites. For English Wikipedia, however, it should return an instant result.
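
For reference, roughly how such a call looks, assuming a configured pywikibot installation and the RepeatingGenerator signature in pagegenerators (a generator function, a key_func to identify items, a sleep_duration between polls, and a total); the exact arguments in the failing test may differ:

    import pywikibot
    from pywikibot import pagegenerators

    site = pywikibot.Site('en', 'wikipedia')

    # Fetches the latest existing mainspace change as a starting point, then
    # keeps polling recentchanges until three newer ones have appeared.
    gen = pagegenerators.RepeatingGenerator(
        site.recentchanges,
        key_func=lambda entry: entry['revid'],
        sleep_duration=60,
        total=4,
        namespaces=[0])

    for entry in gen:
        print(entry['title'])
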
Comment 7 John Mark Vandenberg 2014-10-26 05:40:44 UTC
OK, thanks for clarifying; it makes more sense now, but it still doesn't explain how it might take 10 mins to fetch 3 namespace 0 edits on enwp.

One way to avoid the problem is to add a timeout to RepeatingGenerator, so the caller can prevent it from locking up forever if new data doesn't arrive.
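
One hedged way such a timeout could look from the caller's side, as an illustrative wrapper rather than an existing pywikibot feature:

    import time

    def with_timeout(generator, timeout):
        """Yield from generator, but stop once timeout seconds have passed."""
        deadline = time.time() + timeout
        for item in generator:
            yield item
            if time.time() > deadline:
                break

Note that a wrapper like this only checks the clock between yields; if the generator blocks internally while waiting for new changes, the timeout would have to live inside RepeatingGenerator itself, or the subprocess approach from comment 4 would be needed.
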
Comment 8 Sorawee Porncharoenwase 2014-10-29 14:19:24 UTC
The most recent one now is on ar.wikipedia: https://travis-ci.org/wikimedia/pywikibot-core/builds/39342240
Comment 9 John Mark Vandenberg 2014-10-31 05:24:55 UTC
Another test.wd hang: https://travis-ci.org/wikimedia/pywikibot-core/jobs/39560103
Comment 11 John Mark Vandenberg 2014-11-07 13:57:04 UTC
ar.wp
https://travis-ci.org/wikimedia/pywikibot-core/jobs/40288852
Comment 12 Sorawee Porncharoenwase 2014-11-07 14:02:42 UTC
This weekend I will add a "timeout" parameter. It's not an elegant solution, though, because it would just be a workaround -- hiding the real problem without fixing it. Some people might even disagree with this workaround.
Comment 13 Fabian 2014-11-07 14:06:45 UTC
Couldn't we print additional information instead? I'd prefer to do that first, to determine whose fault it is (if there really are only so few edits). For example, the start time would be interesting, and whether it had fetched any pages.
Comment 14 Gerrit Notification Bot 2014-11-07 14:37:43 UTC
Change 171830 had a related patch set uploaded by John Vandenberg:
Disable cache for RepeatingGenerator tests

https://gerrit.wikimedia.org/r/171830
Comment 15 Gerrit Notification Bot 2014-11-07 15:02:03 UTC
Change 171830 merged by jenkins-bot:
Disable cache for RepeatingGenerator tests

https://gerrit.wikimedia.org/r/171830
Comment 16 John Mark Vandenberg 2014-11-09 20:20:39 UTC
I'm pretty sure this is fixed now. Sorry I didn't notice this earlier.
Comment 17 Sorawee Porncharoenwase 2014-11-09 20:22:22 UTC
So what's the problem? Cache?
Comment 18 John Mark Vandenberg 2014-11-09 20:32:45 UTC
Yes.  TestRequest was forcing all subsequent queries to return the same result, consisting of the same pages, so it would never find new pages to yield.
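
A self-contained illustration of that failure mode (the names below are invented; this is not pywikibot code): when the request layer replays a cached response, a polling loop never sees anything new, so asking for more items than the cached result contains can only hang or time out.

    def make_cached_fetcher():
        cache = {}

        def fetch(query):
            if query not in cache:        # only the first call hits the "API"
                cache[query] = ['Page 1', 'Page 2']
            return cache[query]           # later calls replay the same result

        return fetch

    def poll_new_pages(fetch, wanted, max_polls=5):
        """Collect `wanted` previously unseen pages, polling up to max_polls times."""
        seen, found = set(), []
        for _ in range(max_polls):
            for page in fetch('recentchanges'):
                if page not in seen:
                    seen.add(page)
                    found.append(page)
            if len(found) >= wanted:
                return found
        raise TimeoutError('fetcher kept returning the same cached pages')

    # poll_new_pages(make_cached_fetcher(), wanted=4) raises, mirroring how the
    # cached TestRequest starved RepeatingGenerator of new pages to yield.
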
