Last modified: 2014-09-18 06:26:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T64856, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 62856 - ContentHandler / #REDIRECT missing when using $content->getParserOutput( ...)
ContentHandler / #REDIRECT missing when using $content->getParserOutput( ...)
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Semantic MediaWiki (Other open bugs)
unspecified
All All
: Highest blocker (vote)
: MW 1.23 version
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-03-19 23:15 UTC by MWJames
Modified: 2014-09-18 06:26 UTC (History)
11 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description MWJames 2014-03-19 23:15:35 UTC
During our SMW\Test\SimplePageRedirectRegressionTest::testDataImport MW 1.23alpha (5b8215c) failed due to #REDIRECT being missing from the $parserOutput->
getText() when using $content->getParserOutput( ... ).

The test passes on MW versions that don't use the ContentHandler and instead use $this->parser->parse( ... ).

Code used in MW 1.23

$revision = Revision::newFromTitle( $this->getTitle(), false, Revision::READ_NORMAL );

$content = $revision->getContent( Revision::RAW );

$this->parserOutput = $content->getParserOutput(
	$this->getTitle(),
	$revision->getId(),
	null,
	true
);

### Test output using var_dump for $content->getParserOutput( ...)
3) SMW\Test\SimplePageRedirectRegressionTest::testDataImport
This test printed output: string(18) "[[Has type::Page]]"
string(18) "[[Has type::Page]]"
string(33) "[[Category:Simple redirect test]]"
string(33) "[[Category:Simple redirect test]]"
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(33) "[[Category:Simple redirect test]]"
string(40) "Content of NewPageRedirectRegressionTest"
string(40) "Content of NewPageRedirectRegressionTest"
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "

Code used prior the ContentHandler

$revision = Revision::newFromTitle( $this->getTitle() );

$this->parserOutput = $this->parser->parse(
	$revision->getText(),
	$this->getTitle(),
	$this->makeParserOptions(),
	true,
	true,
	$revision->getID()
);

### Test output using var_dump using $this->parser->parse( ... )
3) SMW\Test\SimplePageRedirectRegressionTest::testDataImport
This test printed output: string(18) "[[Has type::Page]]"
string(18) "[[Has type::Page]]"
string(33) "[[Category:Simple redirect test]]"
string(33) "[[Category:Simple redirect test]]"
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(80) "#REDIRECT [[SimplePageRedirectRegressionTest]] [[Category:Simple redirect test]]"
string(40) "Content of NewPageRedirectRegressionTest"
string(40) "Content of NewPageRedirectRegressionTest"
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "
string(141) "This is part of the [[PageRedirectRegressionTest]] [[Category:Regression test]] [[Category:Redirect test]] [[Category:Simple redirect test]] "

"#REDIRECT [[SimplePageRedirectRegressionTest]] is missing from the ContentHandler generated ParserOuptut text object.
Comment 1 MWJames 2014-03-19 23:40:43 UTC
As noted on [0] (and confirmed by manual testing), the ContentHandler implementation works for versions other than MW 1.23.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/212
Comment 2 Umherirrender 2014-03-20 18:21:01 UTC
A last change there is Gerrit change #105829, but I have no idea, if that is related.
Comment 3 Brad Jorsch 2014-03-20 18:58:43 UTC
This does not seem to be an issue in ContentHandler, as the following produces expected output (using 1.23wmf17 as deployed on enwiki):

 $title = Title::newFromText("Foo");
 $revision = Revision::newFromTitle( $title, false, Revision::READ_NORMAL );
 $content = $revision->getContent( Revision::RAW );
 $parserOutput = $content->getParserOutput( $title, $revision->getId(), null, true );
 echo $parserOutput->getText();

When var_dumping the $parserOutput object, I see nothing that remotely resembles the output that you are claiming your test returns (no field in the object contains wikitext), so I further doubt this has anything to do with MediaWiki core.

I don't see "SMW\Test\SimplePageRedirectRegressionTest" anywhere, so I can't comment on what it or the code it is testing might be doing. Although if Gerrit change #105829 is related, I can guess that you may be using some hook inside $wgParser->parse() to try to grab the wikitext being parsed for use later.

Absent further details, this bug should probably be reassigned to the appropriate extension.
Comment 4 Daniel Kinzler 2014-03-27 07:39:44 UTC
Note that getParserOutput()->getText() will return the rendered HTML of the page. 

There is no reason we should assume or insist that for a redirect, it will contain #REDIRECT. In fact, it would be perfectly sensible for it to return the HTML you are seeing when viewing a redirect with redirect=no: basically, a link to the redirect target, and an icon. 

Also, for content models other than wikitext, redirects my be defined in some other way; #REDIRECT is used for wikitext only.

Marking as invalid since the expectation that rendered HTML contain #REDIRECT is unfounded.
Comment 5 Brad Jorsch 2014-03-27 14:01:52 UTC
(In reply to Daniel Kinzler from comment #4)
> Marking as invalid since the expectation that rendered HTML contain
> #REDIRECT is unfounded.

I don't think they're expecting the rendered HTML to contain "#REDIRECT". Looking at the quoted unit test results, they're apparently getting wikitext (not HTML) from somewhere. And since there's nothing in core that allows for getting the wikitext back from a ParserOutput, it must be something in their extension.

But absent further details, RESOLVED INVALID if it stays assigned as-is works for me. Or reopen and reassign to SMW or whatever extension is involved here.
Comment 6 Mark A. Hershberger 2014-05-02 14:39:06 UTC
This is a blocker for the release of 1.23.  I'll dig into this a little and see what I can find.
Comment 7 Brad Jorsch 2014-05-02 14:53:31 UTC
The bug is in SMW, not core. But we don't seem to have an SMW component in Bugzilla anymore, so I guess "MediaWiki extensions / [other]" it is.
Comment 8 MWJames 2014-05-02 14:56:53 UTC
> The bug is in SMW, not core.

And we are quick in blaming others.

For those who are familiar with unit testing[0], I wrote a unit test that should even satisfy WMF employees to see that something doesn't match up when running the test on MW 1.23 and to avoid being blamed again for insufficient details.

I tried to write the unit test in a manner that should be expressive enough to compare results among standard text and text that includes "#REDIRECT".

If you run the test on MW 1.12/MW 1.22 all test passes while on MW 1.23/MW 1.24 it will fail.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/295/files
Comment 9 Andre Klapper 2014-05-02 15:03:43 UTC
(In reply to Brad Jorsch from comment #7)
> The bug is in SMW, not core. But we don't seem to have an SMW component in
> Bugzilla anymore, so I guess "MediaWiki extensions / [other]" it is.

That's due to the will of SMW developers- see bug 62114. They prefer GitHub but as nobody can easily find out due to outdated bugtracker links I'll move this to the (otherwise closed) component.
Comment 10 MWJames 2014-05-02 15:08:46 UTC
> That's due to the will of SMW developers- see bug 62114

This bug has nothing to do with SMW. 

The test is SMW independent and relies solely on MW provided functionality to verify that the ContentHandler has an issue with #REDIRECT and InternalParseBeforeLinks on MW 1.23/MW1.24 while passing on MW 1.21/MW 1.22.

In comparison, the test passes for both Parser & ContentHandler on all MW versions when using a "normal" text (e.g '[[Lorem ipsum]] dolor sit amet ...').

You can run the test [0] in a normal Jenkins or local phpunit environment.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/295/files
Comment 11 Brad Jorsch 2014-05-02 15:12:22 UTC
(In reply to MWJames from comment #8)
> > The bug is in SMW, not core.
> 
> And we are quick in blaming others.

Please explain why SMW is expecting the #REDIRECT line to be passed to the parser and breaking when it is not. As was asked back in March.

Chances are that whatever SMW code is breaking should be doing whatever it does in a different way. For example, perhaps it should be looking at the redirect table or checking the Content object's getRedirectTarget(). But without knowing what your code is actually trying to do, I can't do more than guess.

> If you run the test on MW 1.12/MW 1.22 all test passes while on MW 1.23/MW
> 1.24 it will fail.

Just because you can write a unit test that tests for some behavior doesn't mean the behavior was actually correct.
Comment 12 MWJames 2014-05-02 15:33:14 UTC
> Just because you can write a unit test that tests for some behavior doesn't mean the behavior was actually correct.

Now here is the thing, the test does not make any assumption about SMW or any other extension, it solely tests behaviour (if you would have looked at the test you would recognize the difference in behaviour the test asserts) that is expected to work.

Because when a standard text is involved the expected text is delivered to the InternalParseBeforeLinks hook in either case (on any MW version) but when something like '#REDIRECT ...' is involved the InternalParseBeforeLinks behaves differently.

> Chances are that whatever SMW code is breaking should be doing whatever it does in a different way. For example, perhaps it should be looking at the redirect

It is right, the difference in behaviour came to light during SMW testing on MW 1.23 but this does not implicate that it is SMW's fault just because you find this convenient.
Comment 13 Brad Jorsch 2014-05-02 16:56:34 UTC
MWJames, my point is that SMW was depending on buggy behavior, and just because you can write a unit test showing that the bug was fixed in 1.23 doesn't mean we should unfix the bug.

If you continue to just blindly insist that you're right and everyone else is wrong, we may as well close this bug as WONTFIX (or reassign it to MediaWiki core and mark it INVALID again).

Or we could move forward by you telling us what exactly SMW is doing by looking for the #REDIRECT in the wikitext passed to the parser and why it can't look in the redirect table or Content::getRedirectTarget() to fetch this information instead. As has been requested of you several times now.
Comment 14 MWJames 2014-05-02 17:48:04 UTC
> just because you can write a unit test showing that the bug was fixed in 1.23 doesn't mean we should unfix the bug.

Let me explain what the test does before you claim that the test is wrong.

When I have a page with '[[Lorem ipsum]] dolor sit amet ...' text and try to generate a ParserOutput object using either $this->parser->parse( ... ) or $content->getParserOutput( ... ) at some point the InternalParseBeforeLinks will be called and the hook interface ( &$parser, &$text ) where $text will contain '[[Lorem ipsum]] dolor sit amet ...' (for both cases ).

Now, when I have a page with '#REDIRECT [[...]]' text and try to generate a ParserOutput object using $this->parser->parse( ... ) where at some point the InternalParseBeforeLinks will be called, the hook interface (&$parser, &$text) with $text will contain '#REDIRECT [[...]]' but when $content->getParserOutput( ... ) is used the $text is empty (which should contain '#REDIRECT [[...]]').

The in behaviour is tested for a standard text and a text containing '#REDIRECT ...'.

Of course, if you are telling me that this is the expected behaviour then naturally you can argue that MW 1.21/MW 1.22 were wrong in the first place and that MW 1.23 has been fixed.

> If you continue to just blindly insist that you're right and everyone else is wrong

Interesting, I'm not insisting on anything the only thing I'm asserting (actually the test is asserting) is a difference in behaviour for InternalParseBeforeLinks hook on the usage of $this->parser->parse(...) to $content->getParserOutput( ... ) for when #REDIRECT is involved nothing more and nothing less.

> we may as well close this bug as WONTFIX (or reassign it to MediaWiki core and mark it INVALID again).

If this is the WMF approach nothing left for me to do.

> by you telling us what exactly SMW is doing by looking for the #REDIRECT in the wikitext passed to the parser and why it can't look in the redirect table or Content::getRedirectTarget()

As for SMW, InternalParseBeforeLinks is used to create annotations.

If the text component delivered by the InternalParseBeforeLinks hook carries a "#REDIRECT target" (which works of course if we use  $this->parser->parse(...) and worked for $content->getParserOutput( .. ) on MW 1.21/MW 1.22 ) then something like 
ContentHandler::makeContent( $text, null, CONTENT_MODEL_WIKITEXT )->getRedirectTarget() or Title::newFromRedirect( $text ) will work as expected.

The problem is that text component retrieved through the InternalParseBeforeLinks hook no longer carries the "#REDIRECT  ...".
Comment 15 Brad Jorsch 2014-05-02 18:23:10 UTC
(In reply to MWJames from comment #14)
> Of course, if you are telling me that this is the expected behaviour then
> naturally you can argue that MW 1.21/MW 1.22 were wrong in the first place
> and that MW 1.23 has been fixed.

Yes, that's exactly what I'm telling you.

> As for SMW, InternalParseBeforeLinks is used to create annotations.
> 
> If the text component delivered by the InternalParseBeforeLinks hook carries
> a "#REDIRECT target" (which works of course if we use 
> $this->parser->parse(...) and worked for $content->getParserOutput( .. ) on
> MW 1.21/MW 1.22 ) then something like 
> ContentHandler::makeContent( $text, null, CONTENT_MODEL_WIKITEXT
> )->getRedirectTarget() or Title::newFromRedirect( $text ) will work as
> expected.
> 
> The problem is that text component retrieved through the
> InternalParseBeforeLinks hook no longer carries the "#REDIRECT  ...".

I see, SMW has its own wikitext parser. That seems fragile.

I still believe that since the #REDIRECT line shouldn't be being parsed it shouldn't be included in the wikitext passed to the parser; if it were being passed to Parser::parse(), that function should still be stripping it off before it gets to the point of calling InternalParseBeforeLinks. It would make more sense to me that the redirect target be included as metadata somehow, which probably means storing it in the ParserOptions. Would that work for you?
Comment 16 Mark A. Hershberger 2014-05-02 18:49:17 UTC
(In reply to Brad Jorsch from comment #13)
> MWJames, my point is that SMW was depending on buggy behavior, and just
> because you can write a unit test showing that the bug was fixed in 1.23
> doesn't mean we should unfix the bug.

From my point of view as a part of release management, it appears that SMW depends on some undocumented behavior of MW.  It looks like Brad was unaware of SMW's dependency on this behavior, saw the bug and decided to fix it.

Keeping this "fix" will hurt around 8.5% of MediaWiki installations[1] while only abou 30 or 40 people have complained about the old behavior in the past.  In the meantime, they've been able to cope with that behavior.

It would be good if the SMW devs could go back and look at the previous bugs claiming to be fixed here and see if the arguments presented there show that fix is merited despite their concerns.

In the meantime, I'm slating https://gerrit.wikimedia.org/r/#/c/105829/ for a revert.  Could one of you verify that reverting that code fixes this problem?

[1] https://wikiapiary.com/wiki/Extension:Semantic_MediaWiki
Comment 17 Gerrit Notification Bot 2014-05-02 18:51:52 UTC
Change 131109 had a related patch set uploaded by MarkAHershberger:
This reverts commit d8b1b79ea423ef3391c34f63aa382fd6bad6597e.

https://gerrit.wikimedia.org/r/131109
Comment 18 Mark A. Hershberger 2014-05-02 18:59:37 UTC
(In reply to Brad Jorsch from comment #15)
> It would
> make more sense to me that the redirect target be included as metadata
> somehow, which probably means storing it in the ParserOptions. Would that
> work for you?

If this is what you want to do, I suggest you make this change in 1.24 and work with the SMW devs to get it implemented.
Comment 19 Jeroen De Dauw 2014-05-02 19:18:23 UTC
Thank you Mark for recognizing this as an API break, loose from the discussion if the change itself is good or bad.

I'm saddened that the amount of finger pointing here so often outweighs the amount of listening being done. In the end, we all lose because of this.
Comment 20 Brad Jorsch 2014-05-02 19:32:26 UTC
(In reply to Mark A. Hershberger from comment #18)
> If this is what you want to do, I suggest you make this change in 1.24 and
> work with the SMW devs to get it implemented.

I'm hoping that we can resolve this in time that you could still include it in 1.23, if MWJames likes the proposal.
Comment 21 Mark A. Hershberger 2014-05-02 19:46:08 UTC
(In reply to Brad Jorsch from comment #20)
> I'm hoping that we can resolve this in time that you could still include it
> in 1.23, if MWJames likes the proposal.

I understand your desire to do that, but we have to be more conservative with LTS releases.  This particular issue has only become apparent in the past week and we plan to make an LTS release in less than a month.
Comment 22 MWJames 2014-05-02 19:48:09 UTC
> In the meantime, I'm slating https://gerrit.wikimedia.org/r/#/c/105829/ for a revert.  Could one of you verify that reverting that code fixes this problem?

Thanks, finally I was able to track the relevant change that introduced the changed behaviour.

> Could one of you verify that reverting that code fixes this problem?

Yes, running the mentioned integration test passes again on MW 1.23 after reverting the change.

> it appears that SMW depends on some undocumented behavior of MW.

Whether documented or undocumented, it still doesn't explain the difference in behaviour of Parser::parse() and ContentHandler::getParserOutput() which at least should behave consistently. 

> I still believe that since the #REDIRECT line shouldn't be being parsed it shouldn't be included in the wikitext passed to the parser; if it were being passed to Parser::parse()

If this is the intend behaviour for Parser::parse() and ContentHandler::getParserOutput().

> It would make more sense to me that the redirect target be included as metadata somehow, which probably means storing it in the ParserOptions. Would that work for you?

If Parser::getOptions() by the time InternalParseBeforeLinks hook is called does contain information such as being a redirect page/has target page for the selected revision, I would see no problem to use that instead of doing ContentHandler::makeContent( $text, null, CONTENT_MODEL_WIKITEXT )->getRedirectTarget() or Title::newFromRedirect( $text ) manually during the InternalParseBeforeLinks hook.
Comment 23 Gerrit Notification Bot 2014-05-02 20:22:11 UTC
Change 131109 abandoned by MarkAHershberger:
This reverts commit d8b1b79ea423ef3391c34f63aa382fd6bad6597e.

Reason:
oops! You guys are right.  Moving to release branch.

https://gerrit.wikimedia.org/r/131109
Comment 24 Gerrit Notification Bot 2014-05-02 20:30:09 UTC
Change 131209 had a related patch set uploaded by MarkAHershberger:
This reverts commit d8b1b79ea423ef3391c34f63aa382fd6bad6597e.

https://gerrit.wikimedia.org/r/131209
Comment 25 Gerrit Notification Bot 2014-05-02 20:30:13 UTC
Change 131210 had a related patch set uploaded by Anomie:
Record redirect target in ParserOptions

https://gerrit.wikimedia.org/r/131210
Comment 26 Gerrit Notification Bot 2014-05-09 07:33:04 UTC
Change 131209 merged by jenkins-bot:
This reverts commit d8b1b79ea423ef3391c34f63aa382fd6bad6597e.

https://gerrit.wikimedia.org/r/131209
Comment 27 Mark A. Hershberger 2014-05-18 18:46:44 UTC
Closing since patch has been merged.
Comment 28 Gerrit Notification Bot 2014-09-18 06:26:20 UTC
Change 131210 merged by jenkins-bot:
Record redirect target in ParserOptions

https://gerrit.wikimedia.org/r/131210

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links