Last modified: 2014-10-14 18:57:07 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T65203, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 63203 - Integrate wikimetrics with mediawiki-utilities
Status: NEW
Product: Analytics
Classification: Unclassified
Component: Wikimetrics
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-03-27 20:39 UTC by Dan Andreescu
Modified: 2014-10-14 18:57 UTC

Description Dan Andreescu 2014-03-27 20:39:56 UTC
https://github.com/halfak/mediawiki-utilities
Comment 1 Bingle 2014-03-27 20:40:33 UTC
Prioritization and scheduling of this bug are tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1502
Comment 2 nuria 2014-10-14 17:10:39 UTC
Given that mediawiki-utilities uses plain SQL and we are set to use Alembic, I do not see how this 'integration' could happen. Are we sure we want to keep this bug open?
Comment 3 Dan Andreescu 2014-10-14 18:57:07 UTC
Wikimetrics uses SQLAlchemy, which is a bit of a mismatch with mediawiki-utilities. I don't think that's too big of a deal; we could integrate the tools if it's a good idea. But that depends on which way Wikimetrics goes as a product and how we structure our data pipeline.

One possibility is to have Wikimetrics become the ETL tool for public data. It could restructure our OLTP data, recent changes, and event streams into a more traditional, easy-to-work-with data warehouse. In that case, the logic from mediawiki-utilities would be very useful. We may wish to convert some of it to SQLAlchemy, but that's a minor point.
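
To make this first possibility concrete, here is a minimal sketch of such an ETL step in Python. It is an illustration only, not code from Wikimetrics or mediawiki-utilities: the simplified `revision` table (with `rev_user` and `rev_timestamp` loosely modelled on MediaWiki's columns), the `daily_edits` warehouse table, and the helper names are all assumptions, and an in-memory SQLite database stands in for the real ones.

```python
import sqlite3


def build_daily_edit_counts(conn):
    """Rebuild a small warehouse-style fact table from OLTP revision rows.

    Hypothetical helper: the table and column names are assumptions
    made for this sketch.
    """
    conn.execute("DROP TABLE IF EXISTS daily_edits")
    conn.execute(
        "CREATE TABLE daily_edits ("
        " user_id INTEGER, day TEXT, edits INTEGER,"
        " PRIMARY KEY (user_id, day))"
    )
    # Timestamps are assumed to be 14-character strings (YYYYMMDDHHMMSS),
    # so the first 8 characters identify the day.
    conn.execute(
        "INSERT INTO daily_edits (user_id, day, edits) "
        "SELECT rev_user, substr(rev_timestamp, 1, 8), COUNT(*) "
        "FROM revision "
        "GROUP BY rev_user, substr(rev_timestamp, 1, 8)"
    )
    conn.commit()


def demo():
    """Load a few made-up revisions, run the ETL step, return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        " rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?)",
        [
            (1, 10, "20140327203956"),
            (2, 10, "20140327204033"),
            (3, 10, "20141014171039"),
            (4, 20, "20141014185707"),
        ],
    )
    build_daily_edit_counts(conn)
    rows = conn.execute(
        "SELECT user_id, day, edits FROM daily_edits ORDER BY user_id, day"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

A reporting tool could then query `daily_edits` directly instead of scanning raw revisions on every request, which is the kind of simplification a warehouse layer would be meant to provide.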

Another possibility is to have a separate ETL process, based on an existing tool or a combination of tools. Wikimetrics would then be re-fashioned to query on top of the resulting data warehouse. In that case, mediawiki-utilities could be used to inform the ETL process, but it would have a very different purpose from Wikimetrics.

I'm not opinionated on which way we go, but I think we should keep this bug open as a reminder of the great logic encapsulated in mediawiki-utilities.
