Last modified: 2014-10-08 05:42:34 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T71249, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 69249 - wfBaseConvert gives incorrect results and sometimes raises warning if using gmp and input starts with '0x' but base != 16 ('x' is valid in base 34-36)
wfBaseConvert gives incorrect results and sometimes raises warning if using g...
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
General/Unknown (Other open bugs)
1.24rc
All Linux
: Normal normal (vote)
: ---
Assigned To: Nobody - You can work on this!
: hhvm, upstream
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-08-07 16:14 UTC by Fred Emmott
Modified: 2014-10-08 05:42 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Fred Emmott 2014-08-07 16:14:10 UTC
For example:

- converting '0x10' from base36 to base36 results in '000g' instead of '0x10'
- converting '0x1234567890abhaer' from base36 to base36 returns '000000000000000000, and raises this warning: "Warning: gmp_init(): Unable to convert variable to GMP - wrong type! in /home/fredemmott/test2.php on line 3"; presumably this is because 'h' isn't valid in an executable string

Tested with PHP 5.5.9 and HHVM 3.2.0 against Mediawiki master and 1.22wmf22

Isolated examples showing the cause:

<?php
$in = '0x1234567890abhaer';
$gmp = gmp_init($in, 36);
$val = gmp_strval($gmp, 36);
var_dump(str_pad($val, strlen($in), '0', STR_PAD_LEFT));

<?php
$in = '0x10';
$gmp = gmp_init($in, 36);
$val = gmp_strval($gmp, 36);
var_dump(str_pad($val, strlen($in), '0', STR_PAD_LEFT));

This issue was found by WfBaseConvertTest::testIdentity intermittently failing on HHVM's CI system since we added GMP support. Cause found by fuzzing:

<?php

function test() {
  $chars = '0123456789abcdefghijklmnopqrstuvwxyz';
  $base = mt_rand(2, 36);
  $len = mt_rand(10, 100);
  $str = '';
  for ($i = 0; $i < $len; ++$i) {
    $str .= $chars[mt_rand(0, $base - 1)];
  }
  $gmp = gmp_init($str, $base);
  $val = gmp_strval($str, $base);
  $val = str_pad($val, strlen($str), '0', STR_PAD_LEFT);
  if ($val !== $str) {
    throw new Exception('Not same - base '.$base."\n-".$str."\n+".$val);
  }
}

for($i = 0; $i < 100000; ++$i) {
  test();
}
Comment 1 Fred Emmott 2014-08-07 17:07:20 UTC
s/executable string/hexadecimal string/... oops.
Comment 2 Andre Klapper 2014-08-08 11:10:18 UTC
Thanks for taking the time to report this!

Note to myself: includes/GlobalFunctions.php in Core.

CC'ing Aaron as he's worked on wfBaseConvert before.
Comment 3 Fred Emmott 2014-08-08 15:02:12 UTC
Just to be clear re the hiphop keyword: this bug also exists if using PHP5 with the gmp extension.
Comment 4 Kevin Israel (PleaseStand) 2014-09-02 12:56:48 UTC
I think this is a case where gmp_init() is not behaving as documented on php.net. ltrim( $input, '0' ) should suffice as a workaround.

The documentation:

The base may vary from 2 to 36. If base is 0 (default value), the actual base is determined from the leading characters: if the first two characters are 0x or 0X, hexadecimal is assumed, otherwise if the first character is "0", octal is assumed, otherwise decimal is assumed.

[It doesn't say "if and only if", though there is no mention of autodetection when base is not 0.]

PHP 5.6, from convert_to_gmp() in ext/gmp/gmp.c:

if (Z_STRLEN_P(val) > 2) {
	if (numstr[0] == '0') {
		if (numstr[1] == 'x' || numstr[1] == 'X') {
			base = 16;
			skip_lead = 1;
		} else if (base != 16 && (numstr[1] == 'b' || numstr[1] == 'B')) {
			base = 2;
			skip_lead = 1;
		}
	}
}

[Why is this code even present when simply leaving it out, according to the GMP documentation <https://gmplib.org/manual/Assigning-Integers.html#Assigning-Integers>, would produce the documented behavior?]

HHVM, from variantToGMPData():

//Figure out what Base to use based on the ~*data*~
if (strLength > 1 && dataString[0] == '0') {
  if (strLength > 2) {
    if (dataString[1] == 'x' || dataString[1] == 'X') {
      base = 16;
      dataString = dataString.substr(2);
    } else if (base < 12 && (dataString[1] == 'b'
                             || dataString[1] == 'B')) {
      base = 2;
      dataString = dataString.substr(2);
    }
  }
  if (base == -1) {
    base = 8;
  }
} else if (strLength == 0) {
  dataString = s_gmp_0.get();
}

if (base == -1) {
  base = GMP_DEFAULT_BASE;
}

[Also note the use of -1 rather than 0 for autodetect.]
Comment 5 Kevin Israel (PleaseStand) 2014-09-02 13:11:42 UTC
This is a very old bug in PHP, likely caused by the following commit:

https://github.com/php/php-src/commit/cc065b33514a3ef6533de6b5406db954a5f4b9c4

Notice that the base == 0 check was lost. Then the committer noticed a problem and "fixed" it:

https://github.com/php/php-src/commit/6829710dceb0d39a06b7560a493a462e7504657b
Comment 7 Kevin Israel (PleaseStand) 2014-09-02 17:35:38 UTC
(In reply to Kevin Israel (PleaseStand) from comment #6)
> https://bugs.php.net/bug.php?id=50175
> https://bugs.php.net/bug.php?id=55398

Fixed for PHP 5.6.1. Just have to fix HHVM and add a workaround to MediaWiki for older PHP versions.
Comment 8 Gerrit Notification Bot 2014-09-02 20:02:46 UTC
Change 157870 had a related patch set uploaded by PleaseStand:
wfBaseConvert(): Work around PHP Bug #50175

https://gerrit.wikimedia.org/r/157870
Comment 9 Gerrit Notification Bot 2014-09-02 20:18:09 UTC
Change 157870 merged by jenkins-bot:
wfBaseConvert(): Work around PHP Bug #50175

https://gerrit.wikimedia.org/r/157870
Comment 10 Ori Livneh 2014-10-08 05:42:34 UTC
Kudos, Kevin.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links