Last modified: 2014-01-07 09:32:30 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T40962, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 38962 - Design and decide how redirects should work
Status: VERIFIED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: WikidataRepo
Version: unspecified
Hardware: All All
Importance: High major with 6 votes
Target Milestone: ---
Assigned To: Wikidata bugs
Depends on:
Blocks: 57744
Reported: 2012-08-02 12:04 UTC by denny vrandecic
Modified: 2014-01-07 09:32 UTC

Description denny vrandecic 2012-08-02 12:04:17 UTC
Redirects within the item namespace, redirects into and out of the item namespace, etc. How are they created, why are they needed, and so on? Design and specify their usage. This may be closely related to bug 38664 (Merge items).
Comment 1 T. H. Kelly (Pink&) 2013-06-04 14:47:56 UTC
I think redirects should be created through two processes:

Through special pages: For things like bug 38664, where a built-in tool will exist to make some sort of change that results in the elimination of an item, where there's another item on the same topic as the now-irrelevant item.

Manually: There should be a 'redirectitem' user right (the assignment of which would be determined by the community, but would probably by default only be given to admins), which would allow a user who holds it to redirect a deleted item to another item. This would be primarily for cases where an item's been merged manually (i.e. all current merges, since there's still no special page for merges, and, presumably, some merges nonetheless even once we have a special page), as well as for cases such as when an item's been deleted after its linked article was deleted, but then the linked article is created and a new item is made for it (see also bug 49100). There have also been some cases where a vandal removes a sitelink from an item, and then a bot just creates a new item for that link. All of these aren't huge issues right now, but are clearly something we're going to need to deal with if we expect third-party sites to want to use our information - it would be very difficult to deal with Q#s suddenly becoming invalid, and their topics being replaced by new numbers.

I'm obviously not commenting on the technical part of this, since I know nothing about that stuff, but I'm just giving a user-experience perspective of the situations where item redirects are necessary, and how they should be implemented.
Comment 2 Addshore 2013-06-21 13:19:12 UTC
I agree with everything that has been said in the comment above.

Being able to redirect items would greatly reduce the work of admins, who currently have to delete items that have been merged. The user right mentioned above could be assigned to a group of people who have proven they know how to merge items without losing any data.

This feature would make it a lot easier for external sites to use data on Wikidata, since the risk of an item 'moving' from one unique ID to another would be reduced.

Also once this is implemented it would be great to go back through every deleted item and try to 'redirect' as many as possible to their new locations. Many deletions currently include a link to the new / merged item in the summary. I will try to write a script to do some analysis of this.
Comment 3 Addshore 2013-06-21 19:36:48 UTC
My speedy analysis shows approximately 218,060 items that are deleted and that could more than likely be redirected to another item.

These are simply items that reference another item in their deletion summary.

As far as I know, this is 218,060 out of 532,144, which is about 41%.
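The analysis described in this comment could look roughly like the sketch below. It is not the script that was actually run: the deletion summaries are made-up sample data, and the regular expression is only an assumption about how item IDs appear in summaries. The last lines redo the percentage from the figures quoted above.

```python
import re

# Hypothetical sample of (deleted item, deletion summary) pairs; real data
# would come from the deletion log.
DELETION_LOG = [
    ("Q100", "Merged with [[Q42]]"),
    ("Q101", "Empty item"),
    ("Q102", "duplicate of Q7, see talk"),
    ("Q103", "Vandalism"),
]

# Assumed pattern: an item ID is "Q" followed by digits.
ITEM_ID = re.compile(r"\bQ[1-9]\d*\b")


def redirect_candidates(log):
    """Map each deleted item to the first other item named in its summary."""
    candidates = {}
    for deleted_id, summary in log:
        targets = [q for q in ITEM_ID.findall(summary) if q != deleted_id]
        if targets:
            candidates[deleted_id] = targets[0]
    return candidates


candidates = redirect_candidates(DELETION_LOG)
print(candidates)

# The share quoted in the comment: 218,060 of 532,144.
share = 218_060 / 532_144
print(f"{share:.1%}")
```

A summary that merely mentions another item is only a candidate; a human or a stricter rule would still have to confirm that the mentioned item really is the merge target.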
Comment 4 denny vrandecic 2013-07-10 15:14:19 UTC
What happens when you are on a redirect page and edit?
What happens when you try to EditEntity a redirect through the API?
What do history and diffs of redirects look like?
Undo on redirects?
How to display redirected entities in the UI?
How should ChangeOps deal with redirects?
Comment 5 Addshore 2013-07-10 15:36:58 UTC
Just a few comments / ideas on the above down so I don't forget :)

What happens when you are on a redirect page and edit?
* If you are on a redirect page and edit, you should be able to edit the redirect. When visiting the redirect in the first place you would be redirected to the target item, so generally you would only find yourself on the actual redirect page because you want to be there and edit it.

What happens when you try to EditEntity a redirect through the API?
* This is an interesting one :/ I almost feel like saying you can't. Imagine a bot gets a list of items and works on them throughout the day. During the day one of the items gets redirected by another user. When the bot comes around to make its edit, if it had the ability to edit it might simply overwrite the page, which may not be the right thing to do. Perhaps there could be a parameter to not follow redirects: by default they would be followed, but if you really want to, you can still edit the entity from the API (there may be a case where this is needed).

What do history and diffs of redirects look like?
* History simply shows the changes made to the item up to the point of redirect and then the redirecting edit.
* Diff would show all content in the entity on the left before the redirect, and on the right something to signify the redirect (maybe #Redirect [[Q12345]]) if we want to stay along the lines of MW.

Undo on redirects?
* Naturally there should be an undo button in case mistakes are made. Potentially this could be limited via permissions.

How to display redirected entities in the UI?
* Pretty much the same as in core: a simple (Redirected from Q12345). In core this is just below the title of an article. It could go in the same place in Wikidata (below the label), or just above it. The benefit of the latter would be that content does not shift slightly between redirected and non-redirected items.
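The EditEntity idea from this comment (follow redirects by default, with an explicit opt-out) can be sketched as follows. This is only an illustration: the in-memory `store`, the `edit_label` function and the `follow_redirects` flag are hypothetical names, not part of any Wikibase API.

```python
# Hypothetical in-memory store: an ID maps either to entity data (a dict)
# or to the ID of its redirect target (a str).
store = {
    "Q2345": {"label": "Foobar"},
    "Q1234": "Q2345",  # Q1234 was merged into Q2345
}


def resolve(entity_id):
    """Follow redirects until a real entity is reached."""
    seen = set()
    while isinstance(store[entity_id], str):
        if entity_id in seen:
            raise ValueError(f"redirect loop at {entity_id}")
        seen.add(entity_id)
        entity_id = store[entity_id]
    return entity_id


def edit_label(entity_id, label, follow_redirects=True):
    """Set a label and return the ID that was actually edited."""
    if follow_redirects:
        entity_id = resolve(entity_id)
    elif isinstance(store[entity_id], str):
        # Explicit opt-out: the redirect is replaced by a fresh entity.
        store[entity_id] = {}
    store[entity_id]["label"] = label
    return entity_id


# A bot that still holds the old ID ends up editing the merge target.
print(edit_label("Q1234", "Foo bar"))
```

With the default, a bot working from a stale list cannot silently overwrite a redirect; overwriting requires passing `follow_redirects=False` on purpose.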
Comment 6 Kunal Mehta (Legoktm) 2013-07-10 17:34:44 UTC
(In reply to comment #5)

> What happens when you try to EditEntity a redirect through the API?
> * This is an interesting one :/ I almost feel like saying you can't. Imagine
> a bot gets a list of items and works on them throughout the day. During the
> day one of the items gets redirected by another user. When the bot comes
> around to make its edit, if it had the ability to edit it might simply
> overwrite the page, which may not be the right thing to do. Perhaps there
> could be a parameter to not follow redirects: by default they would be
> followed, but if you really want to, you can still edit the entity from the
> API (there may be a case where this is needed).

In the standard API you can use the &redirects parameter (https://www.mediawiki.org/wiki/API:Query#Resolving_redirects). Something like that would work.
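For reference, a query that uses the `redirects` parameter can be built as in the sketch below. The endpoint and the sample response are illustrative assumptions (nothing is fetched from a live wiki); the response shape follows the `query.redirects` list of `from`/`to` pairs that the API returns when redirects are resolved.

```python
from urllib.parse import urlencode

# Assumed endpoint, used here only as an example.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def redirect_query_url(title):
    """Build an action=query URL that asks the API to resolve redirects."""
    params = {
        "action": "query",
        "titles": title,
        "redirects": 1,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def resolved_title(title, response):
    """Return the redirect target reported for `title`, or `title` itself."""
    for entry in response.get("query", {}).get("redirects", []):
        if entry["from"] == title:
            return entry["to"]
    return title


# Illustrative response shape only; not fetched from a live wiki.
SAMPLE_RESPONSE = {"query": {"redirects": [{"from": "Foo", "to": "Foobar"}]}}

print(redirect_query_url("Foo"))
print(resolved_title("Foo", SAMPLE_RESPONSE))
```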
Comment 7 T. H. Kelly (Pink&) 2013-07-10 22:13:11 UTC
One technical note: On your standard MW redirect, you're still "on" the redirected page, instead of on the target. I.e., if you go to [[Foo]], you're viewing the content of [[Foobar]], but the URL still reads http://en.wikipedia.org/wiki/Foo. It occurs to me that this behavior could be rather inconvenient with Wikibase: Redirects will, inherently, be somewhat less stable than normal Q#s (since someone can always change a redirect's target), so we want to incentivize people to use the "right" Q#; also, the URL is normally the only place a non-bot user can figure out an item's Q# without having to click somewhere else. So, once a redirect system actually gets built, I think it would be best if the URL actually redirected, like with a special page. (I don't know how technically feasible this is, of course.) And then we could have, like, a &redirectfrom= URL parameter, which would generate that familiar "redirected from Q000" text.
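A rough sketch of the URL behaviour proposed here, assuming a plain lookup table of redirects: the `redirectfrom` parameter is the hypothetical one suggested in this comment, and `BASE_URL` is an assumed item URL prefix.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.wikidata.org/wiki/"  # assumed item URL prefix


def redirect_location(requested_id, redirects):
    """Return the URL to send the browser to, or None if no redirect applies."""
    target = redirects.get(requested_id)
    if target is None:
        return None
    # "redirectfrom" is the hypothetical parameter proposed in the comment.
    return f"{BASE_URL}{target}?{urlencode({'redirectfrom': requested_id})}"


print(redirect_location("Q1234", {"Q1234": "Q2345"}))
```

The address bar then shows the stable target ID, while the extra parameter carries enough information to render the "redirected from" notice.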

Oh, also...

(In reply to comment #5)

> * Diff would show all content in the entity on the left before the redirect,
> and on the right something to signify the redirect (maybe #Redirect
> [[Q12345]]) if we want to stay along the lines of MW.

Ewww, no #REDIRECT. Wikibase already takes advantage of ContentHandler's having liberated us from unnecessary use of MediaWiki markup. I'm sure some of the devs can make us some pretty new Wikibase-type redirect content model. For diff view, it could just be "redirect-target: Q000" or something like that.

(That's assuming we even want to have the pre-redirect history shown... seeing as the redirect's target should have all the data the redirect had, we *could* have it that either only deleted items can be redirected, or that redirecting an item automatically deletes its old history.)
Comment 8 filceolaire 2013-08-05 22:07:20 UTC
I have been using the 'Merge' tool and even with our current (fairly low) level of activity I often get a message telling me the tool could not create a delete message because of an edit conflict. 

If the merge tool automatically created a redirect then these edit conflicts would not happen.

If the conversion to a redirect could be reversed easily then it would not be necessary to limit access to the revert tool to special approved persons and it could be made available to anyone using the merge tool.

Suggested workflow to revert a redirect:
* go to the redirect destination page.
* click "What links here"
* select the link to the redirect page.
* Revert the redirect (via the history page or a special tool or whatever code magic you come up with).
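The workflow above can be modelled with a simple revision list, as in this sketch. It assumes that a merge writes the redirect as an ordinary new revision, so reverting is just re-saving the previous one; `Page`, `links_here` and the dict-based content are hypothetical stand-ins, not Wikibase code.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """An item page whose history is a list of revisions (oldest first)."""
    item_id: str
    revisions: list = field(default_factory=list)

    def save(self, content):
        self.revisions.append(content)

    @property
    def current(self):
        return self.revisions[-1]

    def revert_last(self):
        """Undo the latest revision by re-saving the one before it."""
        if len(self.revisions) < 2:
            raise ValueError("nothing to revert to")
        self.save(self.revisions[-2])


def links_here(target_id, pages):
    """IDs of pages whose current revision redirects to target_id."""
    return [p.item_id for p in pages if p.current == {"redirect": target_id}]


source = Page("Q1234")
source.save({"label": "Foo"})
source.save({"redirect": "Q2345"})  # what a merge tool might write

print(links_here("Q2345", [source]))  # step 2: "What links here"
source.revert_last()                  # step 4: revert the redirect
print(source.current)
```

Because the pre-redirect revisions stay in the history, the revert needs no special permission model beyond ordinary editing.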
Comment 9 Gerrit Notification Bot 2013-11-26 13:41:12 UTC
Change 89810 had a related patch set uploaded by Daniel Kinzler:
Allow ItemContent to represent a redirect.

https://gerrit.wikimedia.org/r/89810
Comment 10 Daniel Kinzler 2013-11-29 15:09:17 UTC
Redirects will be supported on the level of EntityContent, but not Entity. That is, they exist on the level of MediaWiki, not on the level of the Wikibase data model.
Comment 11 Bene* 2013-12-01 19:59:54 UTC
(In reply to comment #10)

If you say "on the level of MediaWiki" I guess you mean on the level of WikibaseRepo extension, right? Or do you mean Wikibase should handle redirects with the MediaWiki built-in feature for them?
Comment 12 Daniel Kinzler 2013-12-02 09:51:35 UTC
@Bene: I mean Wikibase should handle redirects with the MediaWiki built-in feature for them. That sort of thing is exactly why the Content interface exists.

But I'm unsure how to go about it. I just sent a mail to wikidata-tech, asking for feedback. Here's the mail:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hello all!

Once more, I'm taking a whack at implementing support for entity redirects -
that is, we want Q1234 to become a redirect to Q2345 after we merged Q1234 into
Q2345.

My question is how to best model this.

First off, a quick primer of the PHP classes involved:

* We have the Entity base class, with Item and Property deriving from it, for
modeling entities.

* As "glue" for treating entities as MediaWiki page content, we have the
EntityContent base class, and ItemContent and PropertyContent deriving from that
(along with their handler/factory classes, EntityHandler, ItemHandler and
PropertyHandler).

Presently, each type of Entity is assigned a MediaWiki namespace, and all pages
in that namespace have the corresponding content type.


Now, there are various ways to model redirects. I have tried a few:

1) first I tried implementing redirects as a "mode" any Entity may have. The
advantage is that redirects can seamlessly occur anywhere "real" entities can
occur, e.g. in dumps, diffs, undo operations, etc. However, many operations that
are defined on "real" entities are not defined on redirects (e.g. "set label",
etc), so we would end up with a lot of "if redirect, do this, else, do that"
checks all over the code base.

2) second I tried implementing redirects as a "mode" any EntityContent may have,
following the idea that redirects are really a MediaWiki concept, and should be
implemented outside the Wikibase data model. This still requires a lot of "if
redirect, do this, else, do that" checks, but only in places dealing with
EntityContent, not all places dealing with Entities.
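A minimal sketch of option 2, written in Python rather than PHP and using simplified stand-ins for the real classes: the content wrapper carries the redirect "mode", while the data-model object stays unaware of redirects. The method names are illustrative assumptions, not the actual Wikibase interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Data-model object; knows nothing about redirects."""
    item_id: str
    labels: dict = field(default_factory=dict)


@dataclass
class ItemContent:
    """Page-content wrapper: holds either an Item or a redirect target."""
    item: Optional[Item] = None
    redirect_target: Optional[str] = None

    def __post_init__(self):
        if (self.item is None) == (self.redirect_target is None):
            raise ValueError("need exactly one of item or redirect_target")

    def is_redirect(self):
        return self.redirect_target is not None

    def get_item(self):
        # The "if redirect, do this, else do that" check lives here,
        # at the content level, not in Item.
        if self.is_redirect():
            raise ValueError("redirect content has no item")
        return self.item


content = ItemContent(redirect_target="Q2345")
print(content.is_redirect())
```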

3) third I tried using a separate entity type for representing redirects: a
RedirectContent points to an EntityContent, there is no "mode", no need for
extra checks, no chance for confusion. However, we break a few basic
assumptions, most importantly the assumption that all pages in an entity
namespace contain an EntityContent of the respective type.

None of these solutions seems satisfactory, for the indicated reasons and also
for some additional considerations explained below.

4) This led me to come full circle and consider another option: make redirects
a special *type* of Entity, besides Item and Property. This would again allow
straightforward diffs (and thus undo operations) between the redirected and the
previous, un-redirected version of a page. Also, it would not compromise code
that operates on Item or Property objects. But since there are still many
operations defined for Entity that make no sense for redirects, we would need to
insert another class into the hierarchy (let's call it LabeledEntity) which
would be a base class for Item and Property, but not for Redirect.
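The hierarchy described in option 4 might look like this sketch (Python stand-ins for the PHP classes; `LabeledEntity` is the hypothetical name from the mail, and `set_label` is an illustrative method):

```python
class Entity:
    """Common base: anything that has an entity ID."""

    def __init__(self, entity_id):
        self.entity_id = entity_id


class LabeledEntity(Entity):
    """Hypothetical intermediate class carrying label operations."""

    def __init__(self, entity_id):
        super().__init__(entity_id)
        self.labels = {}

    def set_label(self, language, text):
        self.labels[language] = text


class Item(LabeledEntity):
    pass


class Property(LabeledEntity):
    pass


class Redirect(Entity):
    """An Entity with a target but without label operations."""

    def __init__(self, entity_id, target_id):
        super().__init__(entity_id)
        self.target_id = target_id


redirect = Redirect("Q1234", "Q2345")
print(isinstance(redirect, Entity), hasattr(redirect, "set_label"))
```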

This would require quite a bit of refactoring, and would introduce redirects as
a "first-class citizen" into the Wikibase data model.


Which of the four options do you prefer? Or can you think of another, better,
option?


Below are some more points to consider when deciding on a design:


In addition to the OO design questions outlined above, one important question
is how to handle redirects in the wb_entity_per_page table; that table contains
a 1:1 mapping of entity_id <-> page_id. It is used, among other things, for
listing all entities and for checking whether an entity exists. How should
redirects be represented in that table? They could be:

a) omitted (making it hard to list redirects),
b) or use the redirect's page ID (maintaining the 1:1 nature, but pointing to a
page that actually does not contain an entity),
c) or use the target entity's page ID (breaking the 1:1 nature of the table, but
associating the ID with the accurate place).

For b) and c), a new column would be needed to mark redirects as such.
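To make variant b) concrete, here is a sketch using an in-memory SQLite table. The table and column names are simplified stand-ins for wb_entity_per_page, and storing the target ID in the new column (instead of a plain flag) is just one possible way to mark redirects.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# redirect_target is the hypothetical new column; it is NULL for real entities.
db.execute(
    """
    CREATE TABLE entity_per_page (
        entity_id       TEXT PRIMARY KEY,
        page_id         INTEGER NOT NULL UNIQUE,
        redirect_target TEXT
    )
    """
)
db.executemany(
    "INSERT INTO entity_per_page VALUES (?, ?, ?)",
    [("Q2345", 10, None), ("Q1234", 11, "Q2345")],
)


def list_entities(conn):
    """IDs of real entities, skipping redirects."""
    rows = conn.execute(
        "SELECT entity_id FROM entity_per_page "
        "WHERE redirect_target IS NULL ORDER BY entity_id"
    )
    return [row[0] for row in rows]


def list_redirects(conn):
    """(redirect ID, target ID) pairs."""
    rows = conn.execute(
        "SELECT entity_id, redirect_target FROM entity_per_page "
        "WHERE redirect_target IS NOT NULL ORDER BY entity_id"
    )
    return rows.fetchall()


print(list_entities(db))
print(list_redirects(db))
```

The mapping stays 1:1 because each row still uses the page's own ID; the extra column is what lets "list all entities" and "does this entity exist" exclude redirects.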

When deciding on how to represent redirects internally, we need to consider how
they should be handled externally. Of course, when asking for an entity that got
redirected, we should get the target entity's content, be it using the "regular"
HTML page interface, the API, the linked data interface, etc.

In RDF dumps, we would probably want redirects to show up as owl:sameAs
relations (or something similar that allows one URI to be indicated as the
primary one). Should redirects also be present in JSON dumps? If so, should it
also be possible to retrieve the JSON representation of entities from the API,
in places where usually "real" entities would be?
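As an illustration of the owl:sameAs idea, the sketch below emits one N-Triples line per redirect. The entity URI prefix is an assumption made for the example; the mail does not specify the exact URIs or the dump format.

```python
# Assumed URI prefix for entities; owl:sameAs is the standard OWL property.
ENTITY_BASE = "http://www.wikidata.org/entity/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(redirect_id, target_id):
    """Return an N-Triples line stating that the redirect equals its target."""
    return (
        f"<{ENTITY_BASE}{redirect_id}> "
        f"<{OWL_SAME_AS}> "
        f"<{ENTITY_BASE}{target_id}> ."
    )


print(same_as_triple("Q1234", "Q2345"))
```

Note that owl:sameAs is symmetric, so it does not by itself say which URI is the primary one; that is the gap the mail alludes to with "or something similar".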


-- daniel
Comment 13 Bene* 2013-12-21 15:27:52 UTC
I have created a page based on Daniel's mail. Just adding the link here, too.

https://meta.wikimedia.org/wiki/Wikidata/Development/Entity_redirect_after_merge
Comment 14 tobias.gritschacher 2014-01-07 08:49:24 UTC
Since the decision was made (according to the document linked in comment #13) to implement option 2, I am closing this bug.
