Last modified: 2013-12-23 19:48:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T58590, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 56590 - Use the HTTP API in round-trip testing
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: tests
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Marc Ordinas i Llopis
Reported: 2013-11-04 22:37 UTC by ssastry
Modified: 2013-12-23 19:48 UTC (History)

Description ssastry 2013-11-04 22:37:12 UTC
Parsoid's automated tests currently cover parsing and serialization of content. Testing of the HTTP API wrappers, including any issues with libraries and bugs in the endpoints, still relies on manual testing.

Gaps in manual testing can lead to incidents like this: https://wikitech.wikimedia.org/wiki/Incident_documentation/20131104-Parsoid

So, we need an automated test setup for our HTTP API endpoints.
Comment 1 Gabriel Wicke 2013-11-04 22:38:38 UTC
Additionally, we should consider converting our round-trip tests to use the web API as well.
Comment 2 Chris McMahon 2013-11-04 22:51:38 UTC
I would very much like to create these tests, as long as beta labs and test2wiki would be targets.
Comment 3 Antoine "hashar" Musso (WMF) 2013-11-04 23:08:37 UTC
If there is anything I can do in Jenkins, let me know.
Comment 4 Gabriel Wicke 2013-11-05 04:56:02 UTC
(In reply to comment #2)
> I would very much like to create these tests, as long as beta labs and
> test2wiki would be targets.

As a first step, we are considering tests that exercise the HTTP API the way VE does (including selser mode):

* on each commit, on a selection of pages
* in mass round-trip testing on a variety of content (see also bug 56601)

In addition it would be great to test the full Parsoid + Varnish setup as it is in production to catch caching issues. This sounds like a good fit for betalabs.

For this, we could use

* the HTTP client we'll get out of moving our rt testing to use the API, and 
* browser testing using the VE to test the full stack from browser through VE extension to the Parsoid cluster.
Comment 5 Chris McMahon 2013-11-05 14:11:15 UTC
What is the HTTP endpoint for this API?  The explicit URI(s)?  

Also, I submitted a browser test for a utf8 string, I'll refine this a bit more today.  https://gerrit.wikimedia.org/r/#/c/93597/
Comment 6 Gabriel Wicke 2013-11-05 17:35:28 UTC
(In reply to comment #5)
> What is the HTTP endpoint for this API?  The explicit URI(s)?  

https://www.mediawiki.org/wiki/Parsoid#The_Parsoid_web_API
Comment 7 Marc Ordinas i Llopis 2013-11-07 18:50:10 UTC
Divided into two tasks:
* This bug will track changing the round-trip test client to use the HTTP API, so that the tests exercise code closer to what actual clients use.
* Bug #56730 tracks the (future) development of unit tests of Parsoid's HTTP API.
Comment 8 Gerrit Notification Bot 2013-11-26 16:50:34 UTC
Change 97733 had a related patch set uploaded by Marcoil:
Bug 56590: Use the HTTP API in round-trip testing

https://gerrit.wikimedia.org/r/97733
Comment 9 Marc Ordinas i Llopis 2013-11-26 17:48:49 UTC
Problems when testing this patch:

- When testing with a local Parsoid instance, it compares the two wikitext versions and gives a correct number of syntactic and semantic diffs, although the number sometimes differs from the one given by direct (no HTTP API) testing. Most of the time the differences are due to a final '\n' difference when testing through HTTP. Could this be due to different selser and editMode settings?

- When testing from my local machine, but using http://parsoid.wmflabs.org/ as the HTTP API URL, html2wt always returns the same wikitext that was fetched with TemplateRequest::one, which results in 0 diffs. Caching?

- node roundtrip-test.js --parsoidURL http://parsoid.wmflabs.org/ --prefix enwiki Barack_Obama
gives this error:
ERROR: Error: request entity too large
    at IncomingMessage.onData (/data/project/parsoid/js/node_modules/express/node_modules/connect/node_modules/raw-body/index.js:40:17)
    at IncomingMessage.EventEmitter.emit (events.js:88:17)
    at IncomingMessage._emitData (http.js:359:10)
    at HTTPParser.parserOnBody [as onBody] (http.js:123:21)
    at Socket.socket.ondata (http.js:1682:22)
    at TCP.onread (net.js:404:27)

- It doesn't pass Jenkins, see https://integration.wikimedia.org/ci/job/parsoid-roundtrip-test-check/1789/console
It seems that the roundtrip-test.js script can't connect to the parsoid HTTP API from the jenkins machine. We'll need to get a parsoid instance running on the jenkins test machine, using a random port.
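
The trailing-newline observation in the first item could be checked with a small helper along these lines. This is only a sketch: `classifyWikitextDiff` is a hypothetical name, not something that exists in roundtrip-test.js, and it only separates newline-only differences from everything else; it does not reproduce the syntactic/semantic diff classification.

```javascript
// Hypothetical helper (not part of roundtrip-test.js): tells apart
// differences that are only trailing newlines from real content changes.
function classifyWikitextDiff(original, roundTripped) {
  if (original === roundTripped) {
    return 'identical';
  }
  // Strip any run of '\r' or '\n' at the very end of the string.
  const stripTrailingNewlines = (s) => s.replace(/[\r\n]+$/, '');
  if (stripTrailingNewlines(original) === stripTrailingNewlines(roundTripped)) {
    return 'trailing-newline-only';
  }
  return 'different';
}

// The HTTP round trip added a final '\n' that the direct run did not.
console.log(classifyWikitextDiff('== A ==\ntext', '== A ==\ntext\n'));
// prints "trailing-newline-only"
```

Counting the 'trailing-newline-only' results separately would show how much of the gap between HTTP and direct testing comes from that one difference.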
Comment 10 Gerrit Notification Bot 2013-11-26 19:54:00 UTC
Change 97733 abandoned by Marcoil:
Bug 56590: Use the HTTP API in round-trip testing

Reason:
This needs more work before it runs anywhere besides my local setup; a new patch will come soon.

https://gerrit.wikimedia.org/r/97733
Comment 11 Antoine "hashar" Musso (WMF) 2013-11-26 22:33:38 UTC
(In reply to comment #9)
<snip>
> - It doesn't pass Jenkins, see
> https://integration.wikimedia.org/ci/job/parsoid-roundtrip-test-check/1789/
> console
> It seems that the roundtrip-test.js script can't connect to the parsoid HTTP
> API from the jenkins machine. We'll need to get a parsoid instance running on
> the jenkins test machine, using a random port.

That build ran on lanthanum.eqiad.wmnet, which does not have direct access to the internet. We can tie the job to a Jenkins slave that has internet access, though. It is 11pm right now so I will forget about it, but a bug against Wikimedia > Continuous integration would make sure it gets solved.
Comment 12 Marc Ordinas i Llopis 2013-11-27 10:03:49 UTC
(In reply to comment #11)
> (snip)
> That build has been run on lanthanum.eqiad.wmnet which does not have direct
> access to internet.  We can tie the job to a Jenkins slave that has internet
> access though. 11pm right now so I will forget about it, but a bug against
> Wikimedia > Continuous integration would make sure it gets solved.

Thanks for the info, Antoine. Even though we'll be running an independent Parsoid API server when running roundtrip-test.js, that will need internet access to fetch wikitext from the MediaWiki API. I'll open that bug when the code is ready.
Comment 13 Gerrit Notification Bot 2013-12-05 10:29:55 UTC
Change 99348 had a related patch set uploaded by Marcoil:
Bug 56590: Use the Parsoid HTTP API in round-trip testing

https://gerrit.wikimedia.org/r/99348
Comment 14 Antoine "hashar" Musso (WMF) 2013-12-05 14:27:07 UTC
(In reply to comment #12)
<snip>
> Thanks for the info, Antoine. Even though we'll be running an independent
> Parsoid API server when running roundtrip-test.js, that will need internet
> access to fetch wikitext from the MediaWiki API. I'll open that bug when the
> code is ready.

Shouldn't you mock the API access? I mean Parsoid could be fed fixtures with known article content and use those flat files instead of relying on the remote wiki. That would save the HTTP GET over the internet and make sure you are testing known values.

Of course, I don't think Parsoid supports that kind of injection (yet?).
Comment 15 ssastry 2013-12-05 17:28:38 UTC
So far, we've been doing some basic integration testing on a single page (en:Barack_Obama) in a Jenkins job after each commit. While we could fully mock up all API accesses, for large pages like en:Barack_Obama, this would require us to effectively do a capture and replay of all API accesses for those pages (since it is not just wikitext, but also templates, extensions, images, etc. -- some 100s to 1000s of API calls).

Scott does have some basic code for capturing these accesses and dumping them to a file and using that to do a replay (without relying on internet). But, that seems like extra complexity instead of relying on internet access if we really want to do full-page integration testing on a set of N (for small values of N, say 5) pages after each commit.
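
For illustration only, the capture-and-replay idea might look roughly like the sketch below. It is not Scott's code: `requestKey` and `createReplayStore` are hypothetical names, and the request parameters and response shape in the usage lines are placeholders rather than real recorded API traffic.

```javascript
// Hypothetical sketch of capture-and-replay for MediaWiki API requests.
// None of these names come from Parsoid; the real capture code may differ.

// Build a stable key from request parameters, independent of key order.
function requestKey(params) {
  return Object.keys(params)
    .sort()
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(String(params[k]))}`)
    .join('&');
}

function createReplayStore() {
  const responses = new Map();
  return {
    // Capture mode: remember the response seen for these parameters.
    record(params, response) {
      responses.set(requestKey(params), response);
    },
    // Replay mode: answer from the recording, never from the network.
    replay(params) {
      const key = requestKey(params);
      if (!responses.has(key)) {
        throw new Error(`No recorded response for ${key}`);
      }
      return responses.get(key);
    },
    // Plain-object view, so the recording can be dumped to a fixture file.
    toJSON() {
      return Object.fromEntries(responses);
    },
  };
}

// Placeholder request and response, only to show the call pattern.
const store = createReplayStore();
store.record({ action: 'query', titles: 'Barack_Obama' }, { wikitext: 'example' });
console.log(store.replay({ titles: 'Barack_Obama', action: 'query' }));
```

Failing loudly on an unrecorded request is a deliberate choice in this sketch: with 100s to 1000s of API calls per page, a silent fallback to the network would hide gaps in the recording.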

So, I think the real question is whether we should be doing full page testing in a Jenkins/CI job after each commit or not. If yes, it seems simpler to rely on HTTP.

All that said, we do have a mock MW API server for testing PHP extensions (bug 45440). See parsoid/js/tests/mockAPI.js. Currently, we use this for running parserTests only.
Comment 16 Gerrit Notification Bot 2013-12-16 22:20:07 UTC
Change 102011 had a related patch set uploaded by GWicke:
Bug 56590: Use the Parsoid HTTP API in round-trip testing

https://gerrit.wikimedia.org/r/102011
Comment 17 Gerrit Notification Bot 2013-12-17 22:35:37 UTC
Change 99348 abandoned by GWicke:
Bug 56590: Use the Parsoid HTTP API in round-trip testing

Reason:
Moved to https://gerrit.wikimedia.org/r/#/c/102011/

https://gerrit.wikimedia.org/r/99348
Comment 18 Gerrit Notification Bot 2013-12-19 23:17:37 UTC
Change 102011 merged by jenkins-bot:
Bug 56590: Use the Parsoid HTTP API in round-trip testing

https://gerrit.wikimedia.org/r/102011
