Last modified: 2013-07-24 19:05:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links other than those displaying bug reports and their history might be broken. See T47316, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45316 - Column pp_propname in table PageProps should have an index
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: 1.21.x
Hardware: All All
Importance: Normal normal
Target Milestone: 1.22.0 release
Assigned To: Nobody - You can work on this!
Keywords: schema-change
Reported: 2013-02-23 22:38 UTC by jeblad
Modified: 2013-07-24 19:05 UTC (History)

Description jeblad 2013-02-23 22:38:35 UTC
Column pp_propname does not have an index. This blocks the creation of maintenance pages based on the page props that have been set. Such maintenance pages are important for overall page quality, especially on Wikipedia and Wikidata.

One example is text analysis that sets page props based on the outcome of analysing the whole page, for example a readability score. If the readability check is limited to a category, it is possible to fetch the articles in that category and then inspect their page props. It is not possible, however, to query for bad readability across the whole wiki.

Another example is using page props to mark pages that use a specific parser function, and then listing those pages. The parser function could also mark pages individually with different page props. The same problem arises as in the previous example.

The existing table is rather large, so the load on English Wikipedia should be considered when creating the index.
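
The limitation described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the production MySQL database, so the query plans are only indicative; the column types are simplified, and the property names "readability" and "defaultsort" are hypothetical sample data. With only the (pp_page, pp_propname) index, a lookup by pp_propname alone has to scan; with an index on pp_propname, it becomes an index search.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified copy of the page_props schema (SQLite types, no table prefix).
    conn.executescript(
        """
        CREATE TABLE page_props (
          pp_page INTEGER NOT NULL,
          pp_propname TEXT NOT NULL,
          pp_value BLOB NOT NULL
        );
        CREATE UNIQUE INDEX pp_page_propname
          ON page_props (pp_page, pp_propname);
        """
    )
    # Hypothetical sample data: every tenth page carries a "readability" prop.
    conn.executemany(
        "INSERT INTO page_props VALUES (?, ?, ?)",
        [
            (i, "readability" if i % 10 == 0 else "defaultsort", b"x")
            for i in range(1, 1001)
        ],
    )
    sql = "SELECT pp_page FROM page_props WHERE pp_propname = ?"

    # Without an index that leads with pp_propname, SQLite has to scan.
    before = query_plan(conn, sql, ("readability",))

    # The kind of index requested in this bug (non-unique, on pp_propname).
    conn.execute("CREATE INDEX pp_propname ON page_props (pp_propname)")
    after = query_plan(conn, sql, ("readability",))

    pages = sorted(row[0] for row in conn.execute(sql, ("readability",)))
    return before, after, pages


if __name__ == "__main__":
    before, after, pages = demo()
    print("before:", before)
    print("after: ", after)
    print("matching pages:", len(pages))
```

The exact wording of the plan differs between SQLite versions, and MySQL's optimizer may behave differently, so this only shows the general scan-versus-search distinction.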
Comment 1 MZMcBride 2013-02-23 22:42:45 UTC
From <https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=maintenance/tables.sql;hb=HEAD>:

---
-- Name/value pairs indexed by page_id
CREATE TABLE /*_*/page_props (
  pp_page int NOT NULL,
  pp_propname varbinary(60) NOT NULL,
  pp_value blob NOT NULL
) /*$wgDBTableOptions*/;

CREATE UNIQUE INDEX /*i*/pp_page_propname ON /*_*/page_props (pp_page,pp_propname);
---

What index are you proposing, exactly?
Comment 2 Daniel Friesen 2013-02-23 23:10:01 UTC
Presumably:
CREATE UNIQUE INDEX /*i*/pp_propname ON /*_*/page_props (pp_propname);
Comment 3 Daniel Friesen 2013-02-23 23:10:17 UTC
Sorry, drop the 'UNIQUE'.
Comment 4 jeblad 2013-02-23 23:56:03 UTC
Yes, exactly! :)
Comment 5 MZMcBride 2013-02-24 01:47:05 UTC
> CREATE INDEX /*i*/pp_propname ON /*_*/page_props(pp_propname);

This bug can probably be marked "easy" once a database person (like Asher) signs off, then?

I believe "schema-change" already applies, so I'm adding that keyword now.
Comment 6 jeblad 2013-02-24 13:06:21 UTC
The change is easy, but the load right after the change could create problems. That should be given consideration, but I don't think the load will be very high after the index is created.
Comment 7 Daniel Friesen 2013-02-24 13:40:24 UTC
(In reply to comment #6)
> The change is easy, but the load right after the change could create
> problems.
> That should be given consideration, but I don't think the load will be very
> high after the index is created.

Last I checked, large installations apply schema updates manually: they rotate slaves out in batches, upgrading only part of the cluster at a time while the rest of the cluster keeps handling the load.
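
The batching idea in this comment can be sketched roughly as follows. This is only an illustration of the general pattern, not a description of how Wikimedia actually deploys schema changes: the host names, the `apply_change` callback, and the "depool"/"repool" steps are all hypothetical.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def batches(hosts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield list(hosts[start:start + size])


def rolling_schema_change(
    replicas: Sequence[str],
    batch_size: int,
    apply_change: Callable[[str], None],
) -> List[Tuple[str, Tuple[str, ...], int]]:
    """Apply a change batch by batch; return a log of (step, hosts, hosts serving)."""
    log = []
    in_rotation = set(replicas)
    for batch in batches(replicas, batch_size):
        # Take this batch out of rotation; the remaining hosts carry the load.
        in_rotation.difference_update(batch)
        log.append(("depool", tuple(batch), len(in_rotation)))
        for host in batch:
            apply_change(host)  # hypothetical: e.g. run the CREATE INDEX here
        # Put the upgraded hosts back before touching the next batch.
        in_rotation.update(batch)
        log.append(("repool", tuple(batch), len(in_rotation)))
    return log


if __name__ == "__main__":
    upgraded = []
    steps = rolling_schema_change(
        ["db1", "db2", "db3", "db4", "db5"], 2, upgraded.append
    )
    for step in steps:
        print(step)
```

A smaller batch size keeps more hosts serving traffic at any moment, at the cost of a longer overall rollout.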
Comment 8 Umherirrender 2013-07-24 19:02:02 UTC
This sounds like it was fixed by Gerrit change #44260.
