Last modified: 2012-01-09 17:15:50 UTC

This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T34404, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 32404 - Wrong pagelinks to images and user discussion pages in namespace 0
Status: RESOLVED DUPLICATE of bug 33409
Product: MediaWiki
Classification: Unclassified
Component: Maintenance scripts
Version: 1.20.x
Hardware: All All
Importance: Normal major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2011-11-14 14:53 UTC by Malafaya
Modified: 2012-01-09 17:15 UTC (History)

Description Malafaya 2011-11-14 14:53:48 UTC
Currently (14-Nov-2011 dump), at pt.wiktionary, we have entries in the pagelinks table such as:

pl_from   pl_namespace      pl_title
45839        6            Crystal_Clear_app_aim2.png
258396       0            Imagem:Flag_of_Esperanto.svg

Although "Imagem" is an alias for namespace 6, the second record above is stored as a namespace 0 page whose title carries the namespace 6 prefix.

RoanKattouw and apergos realized that running namespaceDupes.php does not correct the situation because it doesn't touch the pagelinks table.

Somehow, a maintenance operation is needed to update namespace dupes in tables other than the page table. Possibly, it could be included in the main namespaceDupes.php script.
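
To make the symptom concrete, here is a minimal Python sketch of how such rows could be spotted in a dump and what the corrected (namespace, title) pair would be. It is not part of namespaceDupes.php or any MediaWiki script; the function name `find_misfiled_links` and the `NAMESPACE_ALIASES` map are hypothetical, and the map only contains the one alias mentioned above.

```python
# Hypothetical alias map: prefix -> namespace number.
# Only "Imagem" (namespace 6) comes from this report; a real wiki has more aliases.
NAMESPACE_ALIASES = {"Imagem": 6}


def find_misfiled_links(rows, aliases=NAMESPACE_ALIASES):
    """Return (row, corrected_namespace, corrected_title) for every
    namespace-0 row whose title starts with a known alias prefix."""
    found = []
    for pl_from, pl_namespace, pl_title in rows:
        if pl_namespace != 0:
            continue  # only namespace-0 rows can carry a stray prefix
        prefix, sep, rest = pl_title.partition(":")
        if sep and rest and prefix in aliases:
            found.append(((pl_from, pl_namespace, pl_title), aliases[prefix], rest))
    return found


# The two example rows from the description above.
rows = [
    (45839, 6, "Crystal_Clear_app_aim2.png"),
    (258396, 0, "Imagem:Flag_of_Esperanto.svg"),
]
fixes = find_misfiled_links(rows)
```

With the two rows above, only the second one is reported, with namespace 6 and title Flag_of_Esperanto.svg as the corrected values.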
Comment 1 Ariel T. Glenn 2011-11-14 14:56:32 UTC
It looks like pagelinks, imagelinks and externallinks (why does that have links to local images anyway?) all need to be cleaned up across the various wikis.
Comment 2 db [inactive,noenotif] 2011-11-14 16:48:00 UTC
See also bug 32170
Comment 3 Brion Vibber 2011-11-14 19:53:40 UTC
I'm not sure that bug 32170 is related; that mentions the bad links specifically being in the imagelinks table, while here they're listed as being in the pagelinks table.


As for general cleanup, I believe the operating assumption with namespaceDupes etc. is that you're expected to run rebuildLinks after doing this sort of title cleanup. But since you may be doing multiple such cleanup runs, it's not going to make any assumptions and do it for you.
Comment 4 Ariel T. Glenn 2011-11-14 19:58:17 UTC
Links get left over in imagelinks and pagelinks. 

I don't think we want to run rebuildlinks on de.wikipedia (for example).  So maybe we need to change those operating assumptions.  At worst the script could take an option.
Comment 5 Malafaya 2011-11-30 12:09:39 UTC
Would touching the affected pages correct the problem in the wiki?
Comment 6 Malafaya 2011-11-30 15:31:45 UTC
Answering my own question, and with some online help from RoanKattouw, it does correct the problem.
Comment 7 Malafaya 2011-12-09 11:50:13 UTC
With yesterday's ptwiktionary dump, a lot of them came back. A *new symptom* is the presence of pages in the User and User_talk namespaces being considered in namespace 0.
Comment 8 Malafaya 2011-12-09 12:00:12 UTC
pl_from   pl_namespace      pl_title
88184          0         Usuário:Alkamid
55566          0         Usuário:Antoniolac
55566          0         Usuário:Cadum


Usuário is an alias for the User namespace, but not the canonical form.
Actually, the canonical forms never seem to be in problematic records, just other aliases.
Comment 9 Malafaya 2011-12-27 14:19:07 UTC
You guys can tell for sure, but this list of erroneous links is very volatile and makes me wonder whether the solution resides in a manually run script. In every database dump, there are a few more different links with this problem. It's not just a problem of old links that need to be updated.
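
For what it's worth, this is roughly how the bad rows of two dumps can be compared, as a small Python sketch. The helper names (`is_bad`, `new_bad_links`) are made up for illustration, and the sample rows are just a few of the ones quoted in this bug arranged as an "older" and a "newer" dump, not actual dump contents.

```python
def is_bad(row, alias_prefixes):
    """True if a (pl_from, pl_namespace, pl_title) row sits in namespace 0
    but its title starts with a namespace alias prefix."""
    pl_from, pl_namespace, pl_title = row
    prefix, sep, _ = pl_title.partition(":")
    return pl_namespace == 0 and bool(sep) and prefix in alias_prefixes


def new_bad_links(old_rows, new_rows, alias_prefixes):
    """Bad rows present in the newer dump but not in the older one."""
    old_bad = {r for r in old_rows if is_bad(r, alias_prefixes)}
    new_bad = {r for r in new_rows if is_bad(r, alias_prefixes)}
    return sorted(new_bad - old_bad)


# Alias prefixes mentioned in this bug; a real wiki has more.
prefixes = {"Imagem", "Usuário"}

# Illustrative arrangement of rows quoted in this bug.
older = [(258396, 0, "Imagem:Flag_of_Esperanto.svg")]
newer = [
    (258396, 0, "Imagem:Flag_of_Esperanto.svg"),
    (88184, 0, "Usuário:Alkamid"),
]
added = new_bad_links(older, newer, prefixes)
```

Here `added` holds only the Usuário:Alkamid row, which is the kind of link that shows up fresh from one dump to the next.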
Comment 10 Malafaya 2012-01-02 16:33:11 UTC
Please, check comment 23 on bug 31576. This may be a similar situation, of older copies of MW still rendering pages.
Comment 11 Tim Starling 2012-01-03 21:58:53 UTC
I'm going to dupe both this and bug 32170 to bug 33409, it's pretty obvious that's what's happening. This bug is identical to bug 32170, they're both caused by namespaceAliases being empty. In fact in bug 32170 I even mentioned that it causes local namespace aliases for File: to appear as pseudo-namespaces in pagelinks.

We can't really say anything about the effectiveness of the CdbReader_PHP change on December 14 at this stage, since even today we had job runners running on an old code base. But it's likely that when we finally manage to completely update our code, the frequency at which bad entries are added will be dramatically reduced.

*** This bug has been marked as a duplicate of bug 33409 ***
Comment 12 Malafaya 2012-01-09 17:15:50 UTC
New occurrences have been found today, compared to the situation of 5th January in pt.wiktionary. This is *after* the decommissioned workers found before the 5th were killed manually.
