Last modified: 2014-06-11 15:47:36 UTC
We need to split up our tests into two folders basically: unit_tests and integration_tests. For the integration tests: the scheduler and queue should always be running, because that's how it would behave in a real scenario. For the unit tests, obviously we shouldn't start the queue or scheduler at all. At this time wikimetrics is treating every test as an integration test, starting the queue even for simple unit tests.
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1533
Currently, tests for wikimetrics are failing at random :-( People tell me that fixing this bug, will make tests deterministic again.
Yes, fixing this bug in my opinion means: * separate unit tests and integration tests * run only unit tests during development * Jenkins runs unit tests automatically along with the linting job it already runs. It verifies +2 iff all unit tests pass and linting passes * have a separate way to run integration tests * make sure all integration tests complete the same way every time (are deterministic.) This means, for example, we should have no "time.sleep" code that may or may not be sleeping the right amount of time. The tests known currently to fail randomly are those using time.sleep and all those in the RunReportClassMethodsTest class.
Adding Kevin My feeling is that we need to fix the tests to ship the features they test but I'm happy to listen to alternative viewpoints.
Fixing the tests in this case is not possible without a large effort. And my point was that adding that large effort into an already immense Gerrit Change was definitely a worse idea than just adding non-deterministic tests. Since we agreed as a team to not break down the very large commit, I think I was stuck between a rock and a hard place. Nevermind that, looking forward I think the thing to do is to treat this bug like a production issue / blocker to delivering the feature.
>Fixing the tests in this case is not possible without a large effort. I think fixing the non deterministic tests should not be such a large effort. Not to be confused with fixing all problems with test setup. The problems come from two places: 1) Sleeps. Adding sleeps to test is always going to be a source of undeterminism. Tests with sleeps can be fixed by making celery run in synchronous, not asynchronous mode when doing integration testing. This, in itself, seems quite a simple change: http://docs.celeryproject.org/en/latest/configuration.html#task-execution-settings We already have the ability to override the test configuration and add this setting to celery only for testing. We will of course need to test that this setting in celery works as intended. 2) Tests that "undo" some of the test setup. Here is where refactor is needed so not all tests inherit the same setup functions that setup celery and scheduler when those two are not needed. This change is more work than the change above but I think it involves a simple refactor of some of the test setup.
Change 124872 had a related patch set uploaded by Milimetric: Address parts of Bug 63680 https://gerrit.wikimedia.org/r/124872
Change 125102 had a related patch set uploaded by QChris: Address parts of Bug 63680 https://gerrit.wikimedia.org/r/125102
Change 124872 had a related patch set uploaded by QChris: Do not run non-deterministic RunReport tests per default https://gerrit.wikimedia.org/r/124872
Change 124872 merged by Nuria: Do not run non-deterministic RunReport tests per default https://gerrit.wikimedia.org/r/124872
(In reply to christian from comment #2) > Currently, tests for wikimetrics are failing at random :-( > > People tell me that fixing this bug, will make tests deterministic again. Although this bug is not yet fixed, merging change 124872 gave us a working test suite again.
Per team -- this bug does not refer to making the tests deterministic. We're creating a new bug for that. The patch actually addresses the deterministic issue.
64059 is the new bug.
Closing this issue as will not fix. We'll take care of the testing problems in bug 64059.
Change 125102 abandoned by Milimetric: Address parts of Bug 63680 Reason: abandoning this line of thought, continuing on https://bugzilla.wikimedia.org/show_bug.cgi?id=64059 https://gerrit.wikimedia.org/r/125102