Last modified: 2013-07-24 19:05:28 UTC
Column pp_propname does not have an index. That will block creation of maintenance pages based upon set page props. Such pages are important for the overall quality of the made pages, especially the quality of Wikipedia and Wikidata. One example is text analysis that sets page props due to the outcome of the analysis of the whole page. Such text analysis could for example be the readability of pages. If the readability is limited to a category it is possible to get the articles for that category and then analyse the page props. It is although not possible to query for bad readability across the whole wiki. An other example is marking pages with page props that use a specific parser function and then list those pages. The parser function could also mark the pages individually with different page props. The same problem will then arise as in the previous example. The existing table is rather large so the load problem on English Wikipedia should be considered when creating the index.
From <https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=maintenance/tables.sql;hb=HEAD>: --- -- Name/value pairs indexed by page_id CREATE TABLE /*_*/page_props ( pp_page int NOT NULL, pp_propname varbinary(60) NOT NULL, pp_value blob NOT NULL ) /*$wgDBTableOptions*/; CREATE UNIQUE INDEX /*i*/pp_page_propname ON /*_*/page_props (pp_page,pp_propname); --- What index are you proposing, exactly?
Presumably: CREATE UNIQUE INDEX /*i*/pp_propname ON /*_*/page_props (pp_propname);
Sorry, drop the 'UNIQUE'.
Yes, exactly! :)
> CREATE INDEX /*i*/pp_propname ON /*_*/page_props(pp_propname); This bug can probably be marked "easy" then, once a database person (like Asher) signs off, then? I believe "schema-change" already applies, so I'm adding that keyword now.
The change is easy, but the load right after the change could create problems. That should be given consideration, but I don't think the load will be very high after the index is created.
(In reply to comment #6) > The change is easy, but the load right after the change could create > problems. > That should be given consideration, but I don't think the load will be very > high after the index is created. Last I checked large installations apply schema updates manually and they rotate slaves out in batches only upgrading part of the cluster at a time while the rest of the cluster keeps taking care of the load.
Sounds like fixed with Gerrit change #44260