Last modified: 2014-07-22 15:38:32 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T54329, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 52329 - Provide reviewer counts per Gerrit changeset in batch form
Provide reviewer counts per Gerrit changeset in batch form
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Git/Gerrit (Other open bugs)
wmf-deployment
All All
: Normal normal (vote)
: ---
Assigned To: Nobody - You can work on this!
: ops
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-31 17:33 UTC by MZMcBride
Modified: 2014-07-22 15:38 UTC (History)
7 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description MZMcBride 2013-07-31 17:33:31 UTC
The Gerrit API is somewhat limited in that retrieving the number of reviewers (a reviewers count) for each changeset requires individual API requests. At somewhere around 76,000 changesets currently (and growing), this isn't a practical system for batch-downloading reviewer counts for all Gerrit changesets.

In discussions with Christian and Chad, a cron job via puppet that executes a simple SQL query and outputs the results of such a query to a text file (or .tsv file, most likely) is probably the best approach to take here.
Comment 1 Gerrit Notification Bot 2013-07-31 17:43:02 UTC
Change 76945 had a related patch set uploaded by Demon:
Provide reviewer counts per patch

https://gerrit.wikimedia.org/r/76945
Comment 2 Gerrit Notification Bot 2013-08-02 20:35:43 UTC
Change 76945 merged by Ryan Lane:
Provide reviewer counts per patch

https://gerrit.wikimedia.org/r/76945
Comment 3 MZMcBride 2013-08-03 02:56:56 UTC
It's 02:56 UTC right now:

$ curl https://gerrit.wikimedia.org/reviewer-counts.json
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /reviewer-counts.json was not found on this server.</p>
</body></html>

:-(
Comment 4 MZMcBride 2013-08-03 03:20:29 UTC
Tangentially: I don't love the idea of writing into the root directory here. I realize that other jobs already do (e.g., <https://gerrit.wikimedia.org/mediawiki-extensions.txt>), but it may make sense to create a separate directory for this.
Comment 5 MZMcBride 2013-08-03 03:21:29 UTC
(In reply to comment #2)
> Change 76945 merged by Ryan Lane:
> Provide reviewer counts per patch
> 
> https://gerrit.wikimedia.org/r/76945

Copying Daniel Z.'s comment here:

---
this doesn't appear to work and the reason:
fatal: gerrit2 does not have "accessDatabase" capability.
---
Comment 6 Chad H. 2013-08-03 04:36:31 UTC
(In reply to comment #4)
> Tangentially: I don't love the idea of writing into the root directory here.
> I
> realize that other jobs already do (e.g.,
> <https://gerrit.wikimedia.org/mediawiki-extensions.txt>), but it may make
> sense
> to create a separate directory for this.

 Maybe something like gerrit.wm.o/reports/foo?

(In reply to comment #5)
> (In reply to comment #2)
> > Change 76945 merged by Ryan Lane:
> > Provide reviewer counts per patch
> > 
> > https://gerrit.wikimedia.org/r/76945
> 
> Copying Daniel Z.'s comment here:
> 
> ---
> this doesn't appear to work and the reason:
> fatal: gerrit2 does not have "accessDatabase" capability.
> ---

Whoops yeah that's true. Easily fixed and I'll do it when I'm in front of my computer again.
Comment 7 MZMcBride 2013-08-03 04:37:48 UTC
(In reply to comment #6)
> Maybe something like gerrit.wm.o/reports/foo?

Yup.

> Whoops yeah that's true. Easily fixed and I'll do it when I'm in front of my
> computer again.

Yay. :-)
Comment 8 christian 2013-08-03 09:03:30 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > fatal: gerrit2 does not have "accessDatabase" capability.
> 
> [...] I'll do it when I'm in front of my
> computer again.

I granted the capability in the mean time.
Comment 9 MZMcBride 2013-08-07 23:07:25 UTC
$ curl https://gerrit.wikimedia.org/reviewer-counts.json
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /reviewer-counts.json was not found on this server.</p>
</body></html>

---

What are next steps here?
Comment 10 MZMcBride 2013-08-12 18:34:48 UTC
Chad or Christian: can you please take a look at this bug when you get a chance? I imagine it'll require either running (or finding someone to run) this command from the command line to see what error is now output.
Comment 11 Chad H. 2013-08-12 19:04:27 UTC
gerrit2 couldn't write to the output file. I fixed this, reviewer counts are now up: https://gerrit.wikimedia.org/reviewer-counts.js

Should now update every weekend.
Comment 12 MZMcBride 2013-08-13 03:42:38 UTC
(In reply to comment #11)
> gerrit2 couldn't write to the output file. I fixed this, reviewer counts are
> now up: https://gerrit.wikimedia.org/reviewer-counts.js

As discussed, this file was moved to <https://gerrit.wikimedia.org/reviewer-counts.json>. :-)  Thanks for working on this.

> Should now update every weekend.

I believe you mean daily.
Comment 13 MZMcBride 2013-08-13 03:53:32 UTC
Christian: Chad and I looked over <https://gerrit.wikimedia.org/reviewer-counts.json> and it seems to be missing about 6,100 changesets.

From <http://p.defau.lt/?XeY_5g8lrYwjWblQzZVGrQ>:

gerrit> select count(*) from changes;
 count(*)
 --------
 78406

gerrit> select count(*) from change_id;
 count(*)
 --------
 72305

Do you know why the change_id table is missing thousands of changesets? Can the query be adjusted to include all changesets?

More importantly, whatever is generating this .json file isn't outputting valid JSON. It doesn't load into Python's json.loads() and <http://jsonlint.com/> says it's invalid.

The Python error:

---
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 72311 column 1 (char 67 - 4914008)
---

The JSONLint error:

---
Parse error on line 7:
..._count": "2"    }}{    "type": "row",
---------------------^
Expecting 'EOF', '}', ',', ']'
---
Comment 14 Nemo 2013-08-13 04:45:52 UTC
No time to read the bug now, but a note in case: the count should not include the patch owner and jenkins-bot.
Comment 15 Gerrit Notification Bot 2013-08-13 10:27:14 UTC
Change 78944 had a related patch set uploaded by QChris:
Switch to changes table for gerrit's reviewer count file

https://gerrit.wikimedia.org/r/78944
Comment 16 Gerrit Notification Bot 2013-08-18 16:13:04 UTC
Change 79584 had a related patch set uploaded by QChris:
Add gsql format that returns result set as single Json object

https://gerrit.wikimedia.org/r/79584
Comment 17 Gerrit Notification Bot 2013-08-19 18:15:06 UTC
Change 79584 merged by Demon:
Add gsql format that returns result set as single Json object

https://gerrit.wikimedia.org/r/79584
Comment 18 Nemo 2013-09-13 06:27:08 UTC
+ops because that's what the patch is waiting for.
Comment 19 Gerrit Notification Bot 2013-09-17 17:43:46 UTC
Change 78944 merged by Ottomata:
Switch to changes table for gerrit's reviewer count file

https://gerrit.wikimedia.org/r/78944
Comment 20 MZMcBride 2013-09-18 01:35:56 UTC
(In reply to comment #19)
> Change 78944 merged by Ottomata:
> Switch to changes table for gerrit's reviewer count file
> 
> https://gerrit.wikimedia.org/r/78944

Thanks, Otto!

Christian: we still need the JSON --> JSON_SINGLE change, I think? After that gets merged, I think this bug will be resolved/fixed. :-)
Comment 21 Gerrit Notification Bot 2013-09-18 10:53:30 UTC
Change 84743 had a related patch set uploaded by QChris:
Switch to single Json object for gerrit's reviewer count query

https://gerrit.wikimedia.org/r/84743
Comment 22 christian 2013-09-18 10:59:19 UTC
(In reply to comment #20)
> Christian: we still need the JSON --> JSON_SINGLE change, I think?

See comment 21 :-)

> After that
> gets merged, I think this bug will be resolved/fixed. :-)

Before we can merge that, we need to upgrade to a gerrit that actually understands
JSON_SINGLE. The war file is prepared, and only needs to be put in place.
I pinged ^demon about that.
Comment 23 Andre Klapper 2013-09-19 17:27:47 UTC
[removing ops keyword again as patch in comment 21 is -1'ed currently.]
Comment 24 Gerrit Notification Bot 2013-11-03 18:00:46 UTC
Change 84743 merged by Ori.livneh:
Change output format of Gerrit review count gsql to JSON_SINGLE

https://gerrit.wikimedia.org/r/84743
Comment 25 Ori Livneh 2013-11-03 18:16:08 UTC
Actually, there's a straightforward way to improve on Ifbb8176fc that I'd like to see. Instead of redirecting the output to a file in /var/www, make a few assertions about the gsql output first -- that consists of a valid, single JSON object (and no more) and that it has the keys you expect it to have. You might want to write a simple script to do that rather than try to fit it in a single command line in the manifest. It should be straightforward to do. Christian, do you think you could do that sometime soon?
Comment 26 Nemo 2013-11-17 12:47:53 UTC
So, isn't https://gerrit.wikimedia.org/reviewer-counts.json supposed to contain something now? It's still empty.
Comment 27 MZMcBride 2013-11-17 23:08:56 UTC
(In reply to comment #26)
> So, isn't https://gerrit.wikimedia.org/reviewer-counts.json supposed to
> contain something now? It's still empty.

Looks like it broke. Probably a decent time to consider changing the path, as Ori suggested. Maybe /j/?
Comment 28 MZMcBride 2013-11-27 05:19:00 UTC
Chad or Christian: could one of you poke at this? Or do you know who might be able to?
Comment 29 christian 2013-11-27 14:06:30 UTC
Sorry MZMcBride, I cannot help here.
I had a look back when you said that it broke.
But the cron-job definition looked good to me.
Running the query by hand worked as well for me.
gerrit2 user has Access Database capability as well.

As I lack shell access to the box, I cannot look there directly and
I'll have to defer to Chad.


(I heard random ramblings about not all repos being garbage collected
properly. So some random speculation might be that the cronjob's ssh
connection fails? Maybe username mismatch between gerrit and gerrit2?
I doubt that it's the case, but it would explain both things. Meh.
Someone with shell+sudo access to the machine has to check.)
Comment 30 MZMcBride 2013-12-10 17:01:35 UTC
Daniel: when you have a minute, can you please take a look? I suspect some error is being reported to cron.
Comment 31 Daniel Zahn 2013-12-10 17:12:43 UTC
took a look at the cron on ytterbium

crontab -u gerrit2 -l

* 1 * * * ssh -p 29418 localhost gerrit gsql --format JSON_SINGLE -c "'SELECT changes.change_id AS change_id, COUNT(DISTINCT patch_set_approvals.account_id) AS reviewer_count FROM changes LEFT JOIN patch_set_approvals ON (changes.change_id = patch_set_approvals.change_id) GROUP BY changes.change_id'" > /var/www/reviewer-counts.json


looked normal, ran it with sudo -u gerrit2, it created the file just fine. i chowned it to gerrit2.

try it now

http://gerrit.wikimedia.org/reviewer-counts.json
Comment 32 Nemo 2014-07-22 15:38:32 UTC
$ curl https://gerrit.wikimedia.org/reviewer-counts.json | json_pp -json_opt pretty | grep -c '"reviewer_count" : "0"'
973

But most are abandoned patches. :(

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links