Last modified: 2014-09-01 15:38:05 UTC
Automated mass round-trip testing on actual page content would be useful to ensure proper HTML round-tripping in VE. This is very similar to your existing DOM sanity check. Basically: load the HTML into the DM, export it again, and check that the result is identical to the input. You can probably reuse parts of our distributed test infrastructure for this (currently round-trip testing 160k pages from various wikis through Parsoid), and can directly use the cached HTML from production as the input.
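To make the check concrete, here is a minimal sketch in Python of the load/export/compare loop. The converter functions are toy stand-ins passed in as parameters (they are assumptions for illustration, not VE's actual API); a real run would plug in the DM import and export steps instead.

```python
from typing import Callable, List, Tuple


def round_trip(
    html: str,
    to_model: Callable[[str], object],
    from_model: Callable[[object], str],
) -> Tuple[bool, str]:
    """Load html into a model, export it again, and compare with the input.

    Returns (identical, exported_html).
    """
    exported = from_model(to_model(html))
    return exported == html, exported


# Hypothetical stand-ins for the real converters, for illustration only.
def toy_to_model(html: str) -> List[str]:
    # "Model" is just the list of characters.
    return list(html)


def toy_from_model(model: List[str]) -> str:
    # Lossless export: joins the characters back together.
    return "".join(model)


def lossy_from_model(model: List[str]) -> str:
    # Lossy export: collapses double spaces, so it dirties some inputs.
    return "".join(model).replace("  ", " ")
```

With these stand-ins, `round_trip(html, toy_to_model, toy_from_model)` reports a clean round trip, while swapping in `lossy_from_model` flags any input containing a double space.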
Timo,

Let's use this bug for what we discussed. As I suggested, we should probably run on:

* enwiki featured articles (~ 4k), fixed revision (so if we regress we notice)
* enwiki ~ 5k most recently-changed articles (Special:RecentChanges)
* {en,fr,de,it,es,nl,he,ru,ar,ja,ko,vi}wiki ~ 5k random articles (Special:Random)

Thoughts?
(In reply to comment #1)
...
> * {en,fr,de,it,es,nl,he,ru,ar,ja,ko,vi}wiki ~ 5k random articles

Could you add also 'pt' to this list?
(In reply to comment #2)
> (In reply to comment #1)
> ...
> > * {en,fr,de,it,es,nl,he,ru,ar,ja,ko,vi}wiki ~ 5k random articles
> Could you add also 'pt' to this list?

Sure. I was just writing a quick list rather than setting it in stone. When we expand to cover language variants we'll want to expand the list further - for example, zh. :-)
So based on discussions with Gabriel:

* Parsoid has a better organised infrastructure for this than we do, so let's use that as a base. Right now they periodically run their sets of roundtrip tests on a certain set of articles.

1) Change that set of articles to include and/or match James' specification.
2) Improve ve-dirtydiffbot to not just do the parsoid-ve-ve-parsoid roundtrip but also the parsoid-ve-ve roundtrip (i.e. parsoid dom > ve linmod > ve dom; "sanity check").
3) Extend the test runner to include 2 pieces of information for each article in addition to the data parsoid gathers:
   - result of parsoid-dom > ve linmod > ve dom ("sanity check")
   - diff of parsoid-dom > ve linmod > ve dom > parsoid dom ("full wikitext roundtrip")[1]

[1] this is the one that ve-dirtydiffbot is currently doing.
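As a rough illustration of the two per-article pieces of information described above, here is a Python sketch. The three conversion callables and the `ArticleResult` record are hypothetical names invented for this example, and DOMs are modelled as plain strings; the real test runner and ve-dirtydiffbot may structure this quite differently.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ArticleResult:
    title: str
    sanity_ok: bool       # parsoid dom > ve linmod > ve dom is identical
    full_diff: List[str]  # unified diff after the full roundtrip


def check_article(
    title: str,
    parsoid_dom: str,
    dom_to_linmod: Callable[[str], object],    # hypothetical converter
    linmod_to_dom: Callable[[object], str],    # hypothetical converter
    ve_dom_to_parsoid: Callable[[str], str],   # hypothetical converter
) -> ArticleResult:
    # "Sanity check": convert to the linear model and straight back.
    ve_dom = linmod_to_dom(dom_to_linmod(parsoid_dom))
    sanity_ok = ve_dom == parsoid_dom

    # "Full roundtrip": send the VE output back through Parsoid, then diff.
    final_dom = ve_dom_to_parsoid(ve_dom)
    full_diff = list(
        difflib.unified_diff(
            parsoid_dom.splitlines(),
            final_dom.splitlines(),
            fromfile="parsoid-dom",
            tofile="round-tripped",
            lineterm="",
        )
    )
    return ArticleResult(title, sanity_ok, full_diff)
```

An empty `full_diff` together with `sanity_ok` being true means the article survived both checks; storing the two separately shows whether a dirty diff was introduced inside VE or in the final Parsoid step.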
(In reply to comment #4)
> So based on discussions with Gabriel:
>
> * Parsoid has a better organised infrastructure for this than we do, so let's
> use that as a base. Right now they periodically run their sets of roundtrip
> tests on a certain set of articles.
>
> 1) Change that set of articles to include and/or match James' specification.

Include, not switch, please; the stuff that Parsoid is doing for RT tests should also be expanded, IMO.
*** Bug 56330 has been marked as a duplicate of this bug. ***