Last modified: 2013-04-08 12:45:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T41342, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 39342 - Special characters not handled properly in properties of datatype date (1.8 alpha)
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Semantic MediaWiki
Version: master
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Jeroen De Dauw
Depends on:
Blocks:
Reported: 2012-08-14 15:32 UTC by misdre
Modified: 2013-04-08 12:45 UTC (History)

Description misdre 2012-08-14 15:32:52 UTC
On a French wiki, some French month names fail to parse. For instance, "janvier" (January) works, as do the other months without special characters, but "février" (February) and "décembre" (December) do not. A small warning sign is displayed to the right of the date.

[[Date de sortie::21 mars 2012]] -> ok

[[Date de sortie::21 février 2011]] -> not ok

[[Date de sortie::21 fév 2011]] (short version) -> not ok

[[Date de sortie::21 août 2012]] -> not ok

This problem was already present in SMW 1.6, but it did not occur in some earlier version.
Comment 1 misdre 2013-03-29 03:08:41 UTC
I have found the culprit:

$matches = preg_split( "/([T]?[0-2]?[0-9]:[\:0-9]+[+\-]?[0-2]?[0-9\:]+|[a-z,A-Z]+|[0-9]+|[ ])/u", $parsevalue , -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );

in includes/datavalues/SMW_DV_Time.php:202 (SMW 1.8.0.4).

For instance, with the date "8 décembre 1999" (december), $matches is:
Array
(
    [0] => 8
    [1] =>  
    [2] => d
    [3] => \xc3\xa9
    [4] => cembre
    [5] =>  
    [6] => 1999
)

It should be something like:
Array
(
    [0] => 8
    [1] =>  
    [2] => d\xc3\xa9cembre
    [3] =>  
    [4] => 1999
)
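The split above can be reproduced outside SMW with a short script. This is only a sketch: the pattern and flags are copied from the line quoted in this comment, while the helper name `splitDate` is hypothetical and not part of SMW. The "é" is written as the UTF-8 byte escape `\xc3\xa9` so the result does not depend on the file's encoding.

```php
<?php
// Tokenizer pattern quoted above (SMW 1.8.0.4, SMW_DV_Time.php).
$pattern = '/([T]?[0-2]?[0-9]:[\:0-9]+[+\-]?[0-2]?[0-9\:]+|[a-z,A-Z]+|[0-9]+|[ ])/u';

// Hypothetical helper; it only wraps the preg_split call from the report.
function splitDate(string $pattern, string $value): array {
    return preg_split($pattern, $value, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

// ASCII-only month name: stays in one token.
$ok = splitDate($pattern, '21 mars 2012');

// Month name with "é": [a-z,A-Z]+ stops at the accented letter,
// so the word is split into "d", "é" and "cembre".
$broken = splitDate($pattern, "8 d\xc3\xa9cembre 1999");

print_r($ok);
print_r($broken);
```

Because `[a-z,A-Z]` covers only ASCII letters (plus a literal comma), the accented letter is never part of a delimiter match and comes back as its own piece between two matches.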
Comment 2 misdre 2013-03-29 03:49:34 UTC
I replaced [a-z,A-Z] with [\p{L}] to match any Unicode letter (see <http://www.php.net/manual/en/regexp.reference.unicode.php>), and it works fine.

I think it's a harmless change. Any opinion?
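A quick way to check the change is to run the patched pattern against the failing dates from the description. This is a minimal sketch, assuming the only difference from the quoted SMW line is the character class; the variable names (`$patched`, `$results`) are illustrative and not taken from SMW.

```php
<?php
// Pattern from Comment 1 with [a-z,A-Z] swapped for [\p{L}], as described above.
$patched = '/([T]?[0-2]?[0-9]:[\:0-9]+[+\-]?[0-2]?[0-9\:]+|[\p{L}]+|[0-9]+|[ ])/u';
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;

// Failing dates from the description; non-ASCII letters are UTF-8 byte escapes.
$dates = [
    "21 f\xc3\xa9vrier 2011", // février
    "21 f\xc3\xa9v 2011",     // fév
    "21 ao\xc3\xbbt 2012",    // août
];

$results = [];
foreach ($dates as $date) {
    $tokens = preg_split($patched, $date, -1, $flags);
    $results[] = $tokens;
    // Print tokens separated by "|" so the token boundaries are visible.
    echo implode('|', $tokens), "\n";
}
```

With `\p{L}` each month name stays in a single token. One difference to keep in mind when judging whether the change is harmless: the original class also matched a literal comma, which `\p{L}` does not.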
Comment 3 Jeroen De Dauw 2013-04-03 21:12:44 UTC
That looks good, thanks for reporting, investigating and providing a patch :) I will try it out now.
Comment 4 Jeroen De Dauw 2013-04-03 21:21:15 UTC
https://gerrit.wikimedia.org/r/#/c/57422/
