Last modified: 2013-02-18 11:02:40 UTC
I'm trying to gather metadata about every Gerrit changeset. I read <https://gerrit.wikimedia.org/r/Documentation/rest-api-changes.html>, which states:

"If the n query parameter is supplied and additional changes exist that match the query beyond the end, the last change object has a _more_changes: true JSON field set. Callers can resume a query with the n query parameter, supplying the last change’s _sortkey field as the value."

Here's what I tried:

---
$ curl -s "https://gerrit.wikimedia.org/r/changes/?q=age:1second&n=500" | tail -20
    "project": "mediawiki/extensions/Polyglot",
    "branch": "master",
    "topic": "tidyup",
    "change_id": "I35ebc242fcf04e5b527631d6be67d1a8c78ef251",
    "subject": "Add method parameter documentation",
    "status": "NEW",
    "created": "2013-01-24 18:41:09.000000000",
    "updated": "2013-01-24 21:44:15.000000000",
    "_sortkey": "0022a6180000b1ff",
    "_number": 45567,
    "owner": {
      "name": "Reedy"
    },
    "labels": {
      "Verified": {},
      "Code-Review": {}
    },
    "_more_changes": true
  }
]
$ curl -s "https://gerrit.wikimedia.org/r/changes/?q=age:1second&n=0022a6180000b1ff"
"0022a6180000b1ff" is not a valid value for "-n"
---

I tried other URL parameters such as &sortkey= and &_sortkey= and &sortkey_after and &resume_sortkey, but nothing seems to work.

After discussing this issue with qchris in #gerrit on freenode, it seems that Gerrit's search functionality is broken (or perhaps restricted). qchris pointed to this (non-working) search example:

---
$ curl -s "https://gerrit.wikimedia.org/r/changes/?q=status:merged+project:mediawiki/core+sortkey_after:m"
)]}'
[]
---

It's unclear whether this issue has a corresponding bug in Gerrit's bug tracker.

As it stands, it appears to be impossible to pull metadata of more than 500 changesets from the Gerrit REST API. Without the ability to specify an offset (and consequently retrieve information about more than 500 changesets), I'm unable to generate Gerrit reports ([[mw:Gerrit/Reports]]). :-(
Some more lines from #gerrit, showing how to fetch the required changes through the search query string.

09:54 <qchris> Susan: I screwed up before. It seems thinking is easier after breakfast. sortkey is not the bug title but the change's sort key. Stupid me.
09:54 <qchris> So what it comes down to is, that you could fetch all changes like this:
09:54 <qchris> Fetch https://gerrit.wikimedia.org/r/changes/?q=status:merged+project:mediawiki/core+limit:3
09:55 <qchris> (For whatever value of status, project you are interested in. Limit is just to get nice small file to look at by hand. You can drop that)
09:55 <qchris> Look for the _sortkey field of the last object in the result list
09:55 <qchris> And the fetch https://gerrit.wikimedia.org/r/changes/?q=status:merged+project:mediawiki/core+limit:3+sortkey_before:LAST_SORTKEY_OF_PREVIOUS_REQUEST
09:55 <qchris> Where LAST_SORTKEY_OF_PREVIOUS_REQUEST is the last sort key of the previous request
09:55 <qchris> so something like 002327db0000c0f1

To also get the _more_changes field set, limit the results with the URL parameter (&n=[integer]) instead of the limit: operator in the query string.
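The procedure qchris describes can be sketched as a loop. This is only a minimal sketch under stated assumptions, not a tested client: the names parse_response, fetch_page and iter_changes are made up for illustration, stopping on an empty page is an assumption, and the stripping of the )]}' prefix is based on the raw output shown in the original report.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://gerrit.wikimedia.org/r/changes/"


def parse_response(body):
    """Drop the )]}' prefix seen in the raw output, then decode the JSON."""
    prefix = ")]}'"
    if body.startswith(prefix):
        body = body[len(prefix):]
    return json.loads(body)


def fetch_page(query):
    """Fetch one page of results for a search query (needs network access)."""
    url = BASE + "?" + urllib.parse.urlencode({"q": query})
    with urllib.request.urlopen(url) as resp:
        return parse_response(resp.read().decode("utf-8"))


def iter_changes(base_query, fetch=fetch_page):
    """Yield all changes matching base_query, page by page.

    Assumption: an empty result list means there is nothing left to fetch.
    """
    query = base_query
    while True:
        page = fetch(query)
        if not page:
            return
        yield from page
        # Resume from the _sortkey of the last object in the previous page.
        last_sortkey = page[-1]["_sortkey"]
        query = f"{base_query} sortkey_before:{last_sortkey}"
```

Something like list(iter_changes("status:merged project:mediawiki/core")) would then walk the whole result set. The fetch argument exists only so the loop can be tried against canned pages without hitting the server.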
I pushed a change upstream to correct the REST API documentation: https://gerrit-review.googlesource.com/#/c/42421/
Thank you very much for your help, Christian.

I failed to realize that &n= is distinct from &N= in Gerrit's REST API. I also failed to realize that "sortkey_after" and "sortkey_before" existed (they're documented at the bottom of <https://gerrit.wikimedia.org/r/Documentation/user-search.html>). It might be nice to mention these (or cross-reference them) at <https://gerrit.wikimedia.org/r/Documentation/rest-api-changes.html>.

My current understanding is that these are equivalent:

* ?q=limit:[integer] and &n=[integer]
* ?q=sortkey_after:[sortkey] and &N=[sortkey]
* ?q=sortkey_before:[sortkey] and &P=[sortkey]

Marking this bug resolved/worksforme.
(In reply to comment #3)
> My current understanding is that these are equivalent:
>
> * ?q=limit:[integer] and &n=[integer]

Yes, they are more or less equivalent. However, &n=[integer] provides you with a "_more_changes" field, while ?q=limit:[integer] does not. You can, however, work around this difference by keeping the limiting integer below your queryLimit while still asking for 1 more result than needed. In some edge cases, Gerrit will even give you one more result than your queryLimit allows for.

> * ?q=sortkey_after:[sortkey] and &N=[sortkey]
> * ?q=sortkey_before:[sortkey] and &P=[sortkey]

It's actually the other way round.

?q=sortkey_after corresponds to &P=
?q=sortkey_before corresponds to &N=

* sortkey is increasing for new changes.
* &N= is for the /N/ext page of search results (i.e.: older changes, lower sortkeys, hence sortkey_before)
* &P= is for the /P/revious page of search results (i.e.: newer changes, higher sortkeys, hence sortkey_after)

As confusing as this is already, there are further differences. ?q=sortkey_after skips the first search result. So when comparing

https://gerrit.wikimedia.org/r/changes/?P=00232fbd0000bc17&n=3
https://gerrit.wikimedia.org/r/changes/?q=sortkey_after:00232fbd0000bc17&n=3

you'll get something like

sortkey             In q=sortkey... ?   In P=... ?
...                 ...                 ...
002330130000c1d5    no                  no
0023300f0000c1d8    no                  no
00232fff0000bc7b    yes                 no
00232ff10000c1d6    yes                 yes
00232fec0000c1d3    yes                 yes
00232fd50000bf53    no                  yes
00232fbd0000bc17    <---- used sortkey

Additionally, if you supply a &n=[integer] parameter to limit the number of results, the result set for a ?P= query has the "_more_changes" key set on the first object, while a ?q=sortkey_after query has it set on the last object.
(This result skipping and shuffling around of "_more_changes" does not occur for ?q=sortkey_before or ?N= queries.)

Bottom line:

When trying to process the data automatically, I'd go for using &n=[integer] to obtain the "_more_changes" marker, but I would not rely on getting at most [integer] results. Be prepared for one additional result in the result set.

Furthermore, I'd go for the &N= and &P= variants, keeping in mind that the "_more_changes" marker need not be on the last object.
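The bottom line above can be sketched roughly as follows. This is a hedged sketch rather than a verified client: fetch_page, has_more and fetch_all are illustrative names, the page size of 100 is arbitrary, and the code assumes that results arrive newest first, so that the last object of a page carries the lowest sortkey to pass to &N=.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://gerrit.wikimedia.org/r/changes/"


def fetch_page(params):
    """Fetch one page using URL parameters such as q, n and N (needs network)."""
    url = BASE + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8")
    prefix = ")]}'"  # prefix seen in the raw output of the original report
    if body.startswith(prefix):
        body = body[len(prefix):]
    return json.loads(body)


def has_more(page):
    """Look for _more_changes on any object, not only on the last one."""
    return any(change.get("_more_changes") for change in page)


def fetch_all(query, page_size=100, fetch=fetch_page):
    """Collect all changes for query, paging with &n= and &N=."""
    changes = []
    params = {"q": query, "n": page_size}
    while True:
        page = fetch(params)
        # A page may hold one result more than page_size, so never truncate it.
        changes.extend(page)
        if not page or not has_more(page):
            return changes
        # Assumption: the last object of the page has the lowest sortkey.
        params = {"q": query, "n": page_size, "N": page[-1]["_sortkey"]}
```

Checking every object for "_more_changes" costs little and avoids depending on where a given query variant happens to put the marker.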