Last modified: 2013-12-03 20:50:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T59762, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57762 - Older redirect code ( #REDIRECTE:[[厦门PX项目]] ) is not recognized anymore
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Redirects
Version: 1.23.0
Hardware: All All
Importance: Lowest minor
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-11-29 23:22 UTC by Liangent
Modified: 2013-12-03 20:50 UTC
CC List: 3 users


Description Liangent 2013-11-29 23:22:42 UTC
I just found an old redirect at https://zh.wikipedia.org/w/index.php?title=2007%E5%8E%A6%E9%97%A8PX%E9%A1%B9%E7%9B%AE%E7%BC%93%E5%BB%BA&action=history which worked on the day it was created, but doesn't work anymore.

Redirect parsing code at that time (in commit 5cdc003c00c5b7dbc8395bc10ea067b1bbc19a44, Thu Jun 14 17:36:12 2007 +0000):

/**
 * Create a new Title for a redirect
 * @param string $text the redirect title text
 * @return Title the new object, or NULL if the text is not a
 *      valid redirect
 */
public static function newFromRedirect( $text ) {
        $mwRedir = MagicWord::get( 'redirect' );
        $rt = NULL;
        if ( $mwRedir->matchStart( $text ) ) {
                $m = array();
                if ( preg_match( '/\[{2}(.*?)(?:\||\]{2})/', $text, $m ) ) {
                        # categories are escaped using : for example one can enter:
                        # #REDIRECT [[:Category:Music]]. Need to remove it.
                        if ( substr($m[1],0,1) == ':') {
                                # We don't want to keep the ':'
                                $m[1] = substr( $m[1], 1 );
                        }

                        $rt = Title::newFromText( $m[1] );
                        # Disallow redirects to Special:Userlogout
                        if ( !is_null($rt) && $rt->isSpecial( 'Userlogout' ) ) {
                                $rt = NULL;
                        }
                }
        }
        return $rt;
}

I'm not sure when it was made stricter, but it evidently broke some old content, and nobody cleaned that content up.
Comment 1 Bartosz Dziewoński 2013-11-29 23:35:50 UTC
The current code for this is in WikitextContent, and the regex does look stricter.

The regex itself was changed back in 2008, though, in r38737 by Roan (with a reference to bug 15053), and made a bit more relaxed (accepting the colon) in 2008 too in r38974 by Brion.

----

The current code for reference:

public function getRedirectTarget() {
  global $wgMaxRedirects;
  if ( $wgMaxRedirects < 1 ) {
    // redirects are disabled, so quit early
    return null;
  }
  $redir = MagicWord::get( 'redirect' );
  $text = trim( $this->getNativeData() );
  if ( $redir->matchStartAndRemove( $text ) ) {
    // Extract the first link and see if it's usable
    // Ensure that it really does come directly after #REDIRECT
    // Some older redirects included a colon, so don't freak about that!
    $m = array();
    if ( preg_match( '!^\s*:?\s*\[{2}(.*?)(?:\|.*?)?\]{2}!', $text, $m ) ) {
      // Strip preceding colon used to "escape" categories, etc.
      // and URL-decode links
      if ( strpos( $m[1], '%' ) !== false ) {
        // Match behavior of inline link parsing here;
        $m[1] = rawurldecode( ltrim( $m[1], ':' ) );
      }
      $title = Title::newFromText( $m[1] );
      // If the title is a redirect to bad special pages or is invalid, return null
      if ( !$title instanceof Title || !$title->isValidRedirectTarget() ) {
        return null;
      }

      return $title;
    }
  }

  return null;
}
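
To see why the redirect from the bug title passes the 2007 code but fails the current one, here is a minimal standalone sketch of the two matching strategies. It is not MediaWiki code: the function names oldRedirectTarget() and newRedirectTarget() are made up for illustration, and MagicWord::get( 'redirect' ) is approximated by a case-insensitive match on the literal "#REDIRECT" (localized aliases are ignored). Only the two link regexes are copied from the snippets quoted above.

```php
<?php
// Hypothetical stand-in for the 2007 Title::newFromRedirect() matching.
// Assumption: the redirect magic word is just "#REDIRECT", case-insensitive.
function oldRedirectTarget( string $text ): ?string {
    if ( !preg_match( '/^#REDIRECT/i', $text ) ) {
        return null;
    }
    $m = [];
    // The first [[...]] found anywhere in the text is accepted,
    // so stray characters after "#REDIRECT" do not matter.
    if ( preg_match( '/\[{2}(.*?)(?:\||\]{2})/', $text, $m ) ) {
        // Drop a single leading ':' as the 2007 code did.
        return substr( $m[1], 0, 1 ) === ':' ? substr( $m[1], 1 ) : $m[1];
    }
    return null;
}

// Hypothetical stand-in for the matching in the current getRedirectTarget().
function newRedirectTarget( string $text ): ?string {
    $count = 0;
    // Approximates matchStartAndRemove(): strip the leading "#REDIRECT".
    $rest = preg_replace( '/^#REDIRECT/i', '', trim( $text ), 1, $count );
    if ( $count === 0 ) {
        return null;
    }
    $m = [];
    // The link must follow directly, allowing only whitespace and one colon.
    if ( preg_match( '!^\s*:?\s*\[{2}(.*?)(?:\|.*?)?\]{2}!', $rest, $m ) ) {
        return $m[1];
    }
    return null;
}

var_dump( oldRedirectTarget( '#REDIRECTE:[[厦门PX项目]]' ) );
var_dump( newRedirectTarget( '#REDIRECTE:[[厦门PX项目]]' ) );
var_dump( newRedirectTarget( '#REDIRECT: [[厦门PX项目]]' ) );
```

In this sketch the stray "E" is what makes the difference: once "#REDIRECT" is removed, the remaining text starts with "E:", which the anchored pattern does not allow before the opening brackets.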
Comment 2 Bartosz Dziewoński 2013-11-29 23:40:02 UTC
(In reply to comment #1)
> The regex itself was changed back in 2008, though, in r38737 by Roan (with a
> reference to bug 15053), and made a bit more relaxed (accepting the colon) in
> 2008 too in r38974 by Brion.

I'm also pretty sure this makes the bug a WONTFIX, unless you can come up with a better solution :) 

(CC-ing Roan and Brion)
Comment 3 Brion Vibber 2013-12-03 20:50:56 UTC
Per note above, this behavior has been consistent for 5 years so there's not really a great need to handle that misspelled case as back-compat. Resolving as wontfix; please feel free to fix up any similarly affected pages.
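
For anyone fixing up affected pages, one possible approach is to rewrite the legacy prefix into the canonical form. The following is only a rough sketch under stated assumptions: normalizeLegacyRedirect() is a hypothetical helper, not part of MediaWiki; it treats "#REDIRECT" as the only spelling of the magic word; and it reuses the two regexes quoted in this report to decide what counts as "legacy". Fetching and saving page text is out of scope here.

```php
<?php
/**
 * Hypothetical helper: turn a legacy redirect such as
 * "#REDIRECTE:[[Target]]" into "#REDIRECT [[Target]]".
 * Text that is not a redirect, or that the strict pattern already
 * accepts, is returned unchanged.
 */
function normalizeLegacyRedirect( string $wikitext ): string {
    $text = trim( $wikitext );

    // Not a redirect at all (assumption: only the "#REDIRECT" spelling).
    if ( !preg_match( '/^#REDIRECT/i', $text ) ) {
        return $wikitext;
    }
    // Already accepted by the strict pattern from comment 1.
    if ( preg_match( '!^#REDIRECT\s*:?\s*\[{2}(.*?)(?:\|.*?)?\]{2}!i', $text ) ) {
        return $wikitext;
    }
    // Not even accepted by the lenient 2007 pattern: leave it alone.
    if ( !preg_match( '/\[{2}(.*?)(?:\||\]{2})/', $text ) ) {
        return $wikitext;
    }
    // Replace everything between "#REDIRECT" and the first "[[" with one
    // space, keeping the link and any trailing text as they were.
    return preg_replace( '/^#REDIRECT.*?(?=\[{2})/is', '#REDIRECT ', $text, 1 );
}

echo normalizeLegacyRedirect( '#REDIRECTE:[[厦门PX项目]]' ), "\n";
```

The result should still be reviewed by hand before saving, since the lenient pattern also matches redirects whose link is unclosed or otherwise malformed.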


