Last modified: 2014-05-16 19:26:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T52316, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 50316 - Generate selser change assignments dynamically
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Gabriel Wicke
Duplicates: 49222
Depends on:
Blocks:

Reported: 2013-06-27 19:24 UTC by Gabriel Wicke
Modified: 2014-05-16 19:26 UTC

Description Gabriel Wicke 2013-06-27 19:24:39 UTC
* Create selser change assignments dynamically. Currently we rely on an external file that needs to be updated manually. The generation is actually already deterministic with a seeded PRNG (seed is the test title), and the overhead of dynamic generation was around 1 second for a full 60-second test run IIRC. So drop the external file and always generate assignments on the fly.

* Speed up selser change assignments. Currently we generate & test for duplicates. We are really generating permutations, which can be done much quicker.

* Remember the output of failing (blacklisted) tests and fail if that output changes. We have many tests where our output is actually correct, but due to limitations in the test setup the test is still failing. This can be a difference to the PHP parser output or something like comparing to wt2wt output in selser testing which expects normalization of attribute quoting etc. By failing on changing blacklisted test output we can still catch regressions in our behavior for these tests. We'll also see improvements that are not quite enough to make the tests pass yet. Rewriting the blacklist is easy enough and documents the changes in failing test output along with the commit.
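A minimal sketch of the first bullet, assuming a small seedable PRNG is acceptable: the test title is hashed to a 32-bit seed, so the same title always yields the same assignment and no external file is needed. The names `hashTitle`, `mulberry32`, `generateAssignment` and `CHANGE_CODES` are hypothetical, the hash (FNV-1a) and generator (mulberry32) are stand-ins rather than what the Parsoid test runner actually uses, and the codes 2, 3 and 4 are placeholders.

```javascript
// Hypothetical change codes; placeholders, not Parsoid's real set.
const CHANGE_CODES = [2, 3, 4];

// FNV-1a: turn a test title into a 32-bit unsigned seed.
function hashTitle(title) {
  let h = 0x811c9dc5;
  for (let i = 0; i < title.length; i++) {
    h ^= title.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a tiny seedable PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed | 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One change code per node, fully determined by the test title.
function generateAssignment(title, nodeCount) {
  const rng = mulberry32(hashTitle(title));
  const assignment = [];
  for (let i = 0; i < nodeCount; i++) {
    assignment.push(CHANGE_CODES[Math.floor(rng() * CHANGE_CODES.length)]);
  }
  return assignment;
}
```

Calling `generateAssignment(title, n)` twice with the same arguments returns equal arrays, which is the property that makes the checked-in assignment file redundant.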
Comment 1 C. Scott Ananian 2013-07-29 17:10:36 UTC
It would probably be worth fixing bug 50982 first, while you can easily see the empty selser changes in the output file.  Subbu thinks these might be the tests "without wt2wt parsoid option ... would be good to atleast verify/confirm that hypothesis."
Comment 2 Arlo Breault 2013-07-29 18:52:09 UTC
Re: comment 1. Unfortunately, that doesn't appear to be the case. A counterexample is "Parsoid only: Quote balancing context should be ..." which has the options "parsoid=wt2html,wt2wt".
Comment 3 Gerrit Notification Bot 2013-07-31 01:21:45 UTC
Change 76870 had a related patch set uploaded by Arlolra:
Generate selser change assignments dynamically.

https://gerrit.wikimedia.org/r/76870
Comment 4 Gerrit Notification Bot 2013-08-01 16:15:51 UTC
Change 76870 merged by jenkins-bot:
Generate selser change assignments dynamically.

https://gerrit.wikimedia.org/r/76870
Comment 5 ssastry 2013-08-02 21:54:42 UTC
3rd bullet point in bug description is actually bug 51718 -- need to figure out best approach for this (work through what is best -- technique as outlined in #3 here or something else).
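One way the technique from the third bullet could look, as a rough sketch only: the blacklist maps each known-failing test title to the output it produced when it was blacklisted, and a run is classified by comparing the new output against both the expected output and the remembered one. The function name `classifyResult`, the result labels and the `Map`-based blacklist are assumptions for illustration, not the format Parsoid uses.

```javascript
// blacklist: Map from test title to the remembered (failing) output.
function classifyResult(blacklist, title, expected, actual) {
  if (actual === expected) {
    // A blacklisted test that now passes should be taken off the blacklist.
    return blacklist.has(title) ? 'unexpected-pass' : 'pass';
  }
  if (!blacklist.has(title)) {
    return 'fail'; // ordinary regression
  }
  // Known failure: only acceptable while the failing output is unchanged.
  return blacklist.get(title) === actual ? 'known-failure' : 'changed-failure';
}

// Usage with made-up titles and outputs.
const blacklist = new Map([['quote test', '<i>foo</i>']]);
console.log(classifyResult(blacklist, 'quote test', '<b>foo</b>', '<i>foo</i>'));
console.log(classifyResult(blacklist, 'quote test', '<b>foo</b>', '<u>foo</u>'));
```

Under this scheme a `changed-failure` fails the run, so a behavior change in a blacklisted test is noticed even though the test was already failing.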
Comment 6 Gabriel Wicke 2013-08-16 18:11:47 UTC
With generate & test our assignments are not guaranteed to be exhaustive, which might be relevant for bug 52139. It might be worth moving to direct permutation generation instead, as that should also make change generation faster.
Comment 7 Arlo Breault 2013-09-21 21:04:22 UTC
gwicke: In what way are these permutations? From the blacklist,

add("selser", "Non-word characters don't terminate tag names (bug 17663, 40670, 52022) [[3],3,[3],3,4,3,4,4,4,2,3]");

this just looks like combinations with replacements. Given that there are 11 numbers between 2 and 4 inclusive, you'd have to generate 3^11 changes, rather than 20 random ones. That doesn't seem faster.
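The arithmetic behind that estimate can be checked directly. The sketch below flattens the assignment from the blacklist entry, counts its slots and distinct values, and computes the size of the full space under the comment's premise that each slot can independently take any of the three values (an assumption, since the nesting may restrict what is valid).

```javascript
// Assignment copied from the blacklist entry quoted above.
const assignment = [[3], 3, [3], 3, 4, 3, 4, 4, 4, 2, 3];

const flat = assignment.flat(Infinity);   // drop the nesting
const slots = flat.length;                // 11
const choices = new Set(flat).size;       // 3 distinct values: 2, 3, 4
const total = choices ** slots;           // 3^11 = 177147

console.log(slots, choices, total);
```

So exhaustive enumeration for this one test would produce 177147 assignments, against the 20 random ones mentioned in the comment.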
Comment 8 Gabriel Wicke 2013-09-21 21:40:12 UTC
Deterministic generation will be faster than random generate & test, as the latter will often result in duplicates which are then filtered out. Keep in mind that we try to generate a random assignment up to 1000 times, even if there are only a handful of possible permutations in a small test. The extra attempts to generate permutations will just generate duplicates once the few possible permutations have been found.
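To illustrate the difference, here is a simplified model of both strategies, not the actual test-runner code. `generateAndTest` draws random assignments and filters duplicates until it has enough or hits the attempt cap; `enumerateAssignments` walks the space directly like an odometer. All names, the codes 2, 3, 4 and the target of 20 assignments are assumptions for the example.

```javascript
// Random generate & test: stops at `wanted` distinct results or `maxAttempts`.
function generateAndTest(slots, codes, wanted, maxAttempts = 1000, rng = Math.random) {
  const found = new Set();
  let attempts = 0;
  while (found.size < wanted && attempts < maxAttempts) {
    attempts++;
    const a = Array.from({ length: slots },
      () => codes[Math.floor(rng() * codes.length)]);
    found.add(JSON.stringify(a)); // duplicates are silently dropped
  }
  return { found, attempts };
}

// Direct enumeration: yields every assignment exactly once.
function* enumerateAssignments(slots, codes) {
  const idx = new Array(slots).fill(0);
  while (true) {
    yield idx.map((i) => codes[i]);
    let pos = slots - 1;
    // Increment the rightmost index, carrying to the left on overflow.
    while (pos >= 0 && ++idx[pos] === codes.length) {
      idx[pos] = 0;
      pos--;
    }
    if (pos < 0) return; // every index wrapped around: done
  }
}

// A small test with 2 slots has only 3^2 = 9 possible assignments.
const random = generateAndTest(2, [2, 3, 4], 20);
const all = Array.from(enumerateAssignments(2, [2, 3, 4]));
console.log(random.attempts, random.found.size, all.length);
```

Because only 9 distinct assignments exist, the random version can never reach 20 and always burns all 1000 attempts, while the enumeration finishes after 9 steps and is exhaustive by construction.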

I agree that we'll need to limit the number of permutations we generate for large test cases. That means that generating all permutations with the current assignments won't be possible. On the bright side, there is a chance that we can get away with fewer permutations without really losing test coverage. As an example, case 2 (node insertion before current node) and case 4 (child node insertion) can result in the same actual change, so should probably be collapsed when that happens. Similarly, new node insertion is very similar to attribute changes for selser processing: the full 'outerwikitext' needs to be serialized in both cases. Let's discuss the possible cases and think about which cases need to be handled.
Comment 9 Gerrit Notification Bot 2013-09-25 04:00:18 UTC
Change 85952 had a related patch set uploaded by Arlolra:
WIP: Remember the output of failing (blacklisted) tests

https://gerrit.wikimedia.org/r/85952
Comment 10 Arlo Breault 2013-09-27 20:01:56 UTC
*** Bug 49222 has been marked as a duplicate of this bug. ***
Comment 11 Gerrit Notification Bot 2013-10-01 22:34:01 UTC
Change 85952 merged by jenkins-bot:
Remember the output of failing (blacklisted) tests

https://gerrit.wikimedia.org/r/85952
Comment 12 Andre Klapper 2014-02-12 15:58:57 UTC
Gabriel: All patches merged months ago - is there more work left here, or can you close this ticket as RESOLVED FIXED?
Comment 13 ssastry 2014-02-12 18:41:41 UTC
We have a good workable solution for now, but I think Gabriel had the enhancement idea of generating selser tests by going through permutations. Gabriel: do you want to create a different enhancement ticket for it and close this one?
Comment 14 Andre Klapper 2014-03-13 11:39:42 UTC
(In reply to ssastry from comment #13)
> We have a good workable solution for now, but I think Gabriel had the
> enhancement idea of generating selser tests by going through permutations.
> Gabriel: do you want to create a different enhancement ticket for it and
> close this one?

Gabriel: ping?
Comment 15 Gabriel Wicke 2014-05-16 19:26:48 UTC
Let's keep using this bug, but reclassify it as an enhancement.

Our selser test coverage can be improved further. Generating permutations systematically still seems to be a promising candidate solution for doing so.
