Last modified: 2012-11-09 18:43:00 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T37962, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 35962 - SMW: Crash in runJobs.php if _LEDT is enabled because Title::getLatestRevID returns zero for existing page
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Semantic MediaWiki
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-need-review
Duplicates: 41249
Depends on:
Blocks:
 
Reported: 2012-04-13 22:24 UTC by Van de Bugger
Modified: 2012-11-09 18:43 UTC (History)

Attachments
Workaround. (948 bytes, patch)
2012-04-13 22:24 UTC, Van de Bugger

Description Van de Bugger 2012-04-13 22:24:45 UTC
Created attachment 10420 [details]
Workaround.

MediaWiki 1.18.1
Semantic MediaWiki 1.7.1

I have never seen this issue when working with MediaWiki interactively, but running the maintenance/runJobs.php script often crashes:

Fatal error: Call to a member function getUser() on a non-object in /var/www/oc.su/Extensions/SemanticMediaWiki-1.7.1/includes/SMW_ParseData.php on line 218

Call Stack:
    0.0004     668472   1. {main}() /var/www/oc.su/MediaWiki-1.18.1/maintenance/runJobs.php:0
    0.0029    1239944   2. require_once('/var/www/oc.su/MediaWiki-1.18.1/maintenance/doMaintenance.php') /var/www/oc.su/MediaWiki-1.18.1/maintenance/runJobs.php:108
    0.1163   18504696   3. RunJobs->execute() /var/www/oc.su/MediaWiki-1.18.1/maintenance/doMaintenance.php:105
    0.1813   19744368   4. RefreshLinksJob2->run() /var/www/oc.su/MediaWiki-1.18.1/maintenance/runJobs.php:78
    0.7043   39067088   5. LinksUpdate->__construct() /var/www/oc.su/MediaWiki-1.18.1/includes/job/RefreshLinksJob.php:119
    0.7048   39069280   6. wfRunHooks() /var/www/oc.su/MediaWiki-1.18.1/includes/LinksUpdate.php:98
    0.7048   39069280   7. Hooks::run() /var/www/oc.su/MediaWiki-1.18.1/includes/GlobalFunctions.php:3631
    0.7048   39084176   8. call_user_func_array() /var/www/oc.su/MediaWiki-1.18.1/includes/Hooks.php:216
    0.7048   39084512   9. SMWParseData::onLinksUpdateConstructed() /var/www/oc.su/MediaWiki-1.18.1/includes/Hooks.php:216
    0.7049   39084560  10. SMWParseData::storeData() /var/www/oc.su/Extensions/SemanticMediaWiki-1.7.1/includes/SMW_ParseData.php:481

(The reported line numbers may not be fully accurate because I added a bunch of trace statements to the code to track down the problem.)

The problem appears in SMW_ParseData.php:210:

case '_LEDT' :
	$revision = Revision::newFromId( $title->getLatestRevID() );
	$user = User::newFromId( $revision->getUser() );
	$value = SMWDIWikiPage::newFromTitle( $user->getUserPage() );
	break;

For an unknown reason, getLatestRevID() returns zero for a valid title of an *existing* page, so Revision::newFromId() returns null, $revision is set to null, and the getUser() call fails.

I am not sure whether it is a bug in the Title, LinkCache, BacklinkCache, or RefreshLinksJob2 classes, or in the runJobs.php script.

Roughly, the scenario is:

runJobs.php calls RunJobs->execute(), which calls RefreshLinksJob2->run(), which calls BacklinkCache->getLinks(). The latter fetches the list of pages, but only their titles, namespaces, and ids (no latest revision id, no redirect flag, etc.):

// @todo FIXME: Make this a function?
if ( !isset( $this->fullResultCache[$table] ) ) {
	wfDebug( __METHOD__ . ": from DB\n" );
	$res = $this->getDB()->select(
		array( $table, 'page' ),
		array( 'page_namespace', 'page_title', 'page_id' ),
		// ===>>> NOTE: If I add 'page_latest' to the list,
		// ===>>> the problem will disappear.
		$this->getConditions( $table ),
		__METHOD__,
		array(
			'STRAIGHT_JOIN',
			'ORDER BY' => $fromField,
		) );
	$this->fullResultCache[$table] = $res;
}

This incomplete information is cached in LinkCache and later returned by $title->getLatestRevID(). I think it is obviously a bug, but I do not know where exactly.
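
To illustrate the suspected mechanism, here is a toy model that runs outside MediaWiki. ToyLinkCache and ToyTitle are hypothetical stand-ins, not the real LinkCache and Title classes, and the assumption that a missing page_latest column ends up cached as zero is only my reading of the behaviour described above, not something confirmed against the MediaWiki source.

```php
<?php
// Toy model only: ToyLinkCache and ToyTitle are hypothetical stand-ins,
// not the real MediaWiki LinkCache and Title classes.

class ToyLinkCache {
	private $latest = array();

	// Record what the row carries. ASSUMPTION: a row selected without
	// page_latest is cached with a latest revision id of 0.
	public function addFromRow( $key, $row ) {
		$this->latest[$key] = isset( $row->page_latest ) ? (int)$row->page_latest : 0;
	}

	public function getLatest( $key ) {
		return isset( $this->latest[$key] ) ? $this->latest[$key] : 0;
	}
}

class ToyTitle {
	private $key;
	private $cache;

	public function __construct( $key, ToyLinkCache $cache ) {
		$this->key = $key;
		$this->cache = $cache;
	}

	// Trusts the cache entry, even when it was filled from a partial row.
	public function getLatestRevID() {
		return $this->cache->getLatest( $this->key );
	}
}

$cache = new ToyLinkCache();

// Row shaped like the select above: no page_latest column.
$partialRow = (object)array( 'page_namespace' => 0, 'page_title' => 'Foo', 'page_id' => 42 );
$cache->addFromRow( '0:Foo', $partialRow );
$partial = new ToyTitle( '0:Foo', $cache );

// Same kind of row with page_latest included.
$fullRow = (object)array( 'page_namespace' => 0, 'page_title' => 'Bar', 'page_id' => 43, 'page_latest' => 1234 );
$cache->addFromRow( '0:Bar', $fullRow );
$full = new ToyTitle( '0:Bar', $cache );

$partialLatest = $partial->getLatestRevID();
$fullLatest = $full->getLatestRevID();
```

In this model the page built from the partial row reports a latest revision id of zero although the page exists, which matches what I see in runJobs.php.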

Meanwhile, Semantic MediaWiki can be patched to work around it:

case '_LEDT' :
	// Do *not* use
	//     $revision = Revision::newFromId( $title->getLatestRevID() );
	// When run from maintenance/runJobs.php it causes a fatal error because
	// `$title->getLatestRevID()' returns zero for an *existing* page.
	// Not sure whether it is a MediaWiki bug or not, but using
	// `Revision::newFromTitle' helps to avoid this problem.
	$revision = Revision::newFromTitle( $title );
	$user = User::newFromId( $revision->getUser() );
	$value = SMWDIWikiPage::newFromTitle( $user->getUserPage() );
	break;

Revision::newFromTitle() does not rely on the cache, but gets the latest revision id directly from the database. It should be less efficient, but it works and does not crash.
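
Whichever lookup is used, the fatal error itself comes from calling getUser() on null, so a null check before that call would also prevent the crash. A minimal sketch of such a guard follows. StubRevision and lastEditorId() are hypothetical names introduced only so the example runs outside MediaWiki; this is neither SMW code nor part of the attached patch.

```php
<?php
// Hypothetical stand-in for a revision object, so the sketch is self-contained.
class StubRevision {
	private $userId;

	public function __construct( $userId ) {
		$this->userId = $userId;
	}

	public function getUser() {
		return $this->userId;
	}
}

// Hypothetical helper: returns the editor's user id, or null when no
// revision could be loaded (for example, when the lookup returned null).
function lastEditorId( $revision ) {
	if ( !( $revision instanceof StubRevision ) ) {
		// Skip the property instead of calling getUser() on a non-object.
		return null;
	}
	return $revision->getUser();
}

$missing = lastEditorId( null );
$found = lastEditorId( new StubRevision( 7 ) );
```

With such a guard the '_LEDT' property would simply be left unset for a page whose revision cannot be loaded, rather than aborting the whole job run.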
Comment 1 Sumana Harihareswara 2012-04-21 18:44:58 UTC
Van de Bugger, now that the Semantic MediaWiki extension is hosted in Git and viewable via Gerrit, you can submit your patch directly into the source control system.  Simply get a developer access account

https://www.mediawiki.org/wiki/Developer_access

and then follow this guide:

https://www.mediawiki.org/wiki/Git/Workflow

The Gerrit project is "mediawiki/extensions/SemanticMediaWiki" and you can view its recent commit history at https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/SemanticMediaWiki.git;a=summary .
Comment 2 Markus Krötzsch 2012-11-01 17:34:18 UTC
*** Bug 41249 has been marked as a duplicate of this bug. ***
Comment 3 Markus Krötzsch 2012-11-01 17:44:19 UTC
Should be fixed by https://gerrit.wikimedia.org/r/#/c/31285/
Comment 4 Markus Krötzsch 2012-11-02 10:41:24 UTC
Change has been merged and issue should be fixed now.
Comment 5 Markus Krötzsch 2012-11-09 18:43:00 UTC
The underlying bug in MediaWiki seems to be Bug 37209 (recording this here for future reference). It is likely that this also leads to wrong data stored about other builtin page properties whenever a refreshJob is updating a page.


