Last modified: 2014-11-03 16:16:36 UTC
Created attachment 12405 [details] screenshot Example: https://www.mediawiki.org/wiki/Communication
Yes, the Translate extension needs to be extended to support VisualEditor editing, but the core issue is that Parsoid doesn't recognise <translate> as an extension block; should it take the local list of registered extensions, and for ones it hasn't been told how to deal with just wrap it as a typeof="mw:Object/Unknown" or something similar?
Parsoid does fetch a list of registered extensions from the config and uses that to wrap the block with mw:Object/Extension/<ext-name-here> type. All other unknown tags are pass through as plain text. The problem is that mediawiki does not report <translate> as an installed extension. Check http://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&format=jsonfm&siprop=extensiontags Any idea why mw.org is not reporting <translate> as a registered extension?
(In reply to comment #2) > Parsoid does fetch a list of registered extensions from the config and uses > that to wrap the block with mw:Object/Extension/<ext-name-here> type. All > other unknown tags are pass through as plain text. Ah, my apologies; I thought you were doing this, but it didn't even occur to me that the Translate extension might fail to do this. Re-filing.
<translate> and <tvar> are not a normal extension tags. Please explain why I should register them.
Parsoid needs to know if something is a valid extension tag so that it can process them rather than escape them to literal text. This will also let VE support them in the future and protect them from being modified right now. Is there any other way to discover that <translate>, <tvar>, <language> are valid tags on the wiki, i.e. what is the mediawiki api endpoint? So far we've been using siprop and fetching installed extension tags and use that to detect these.
*** Bug 52408 has been marked as a duplicate of this bug. ***
@Niklas are there issues with registering them?
I don't know. Either the registration would just be symbolic without any side effects, or it could also interfere with the current functionality.
Does someone want to try registering them and see what happens? I probably don't have time before Wikimania.
Ping.
Change 83427 had a related patch set uploaded by Nikerabbit: Register translate and tvar to the parser https://gerrit.wikimedia.org/r/83427
See the above commit. It is not thoroughly tested. I still don't understand what would the benefit.
Niklas: If VE has to support editing for <translate>, <tvar>, etc. tags, Parsoid has to know about those tags. Parsoid finds out about installed tags by querying mediawiki API. So, either MW API should expose this information for hooks automatically or we need a different API endpoint, or you have to register them.
In the long term yes, but wont this render VE useless on translatable pages in the short them?
(In reply to comment #14) > In the long term yes, but wont this render VE useless on translatable pages > in the short them? Yes. On MW.org you can use the alien node editor to edit any extension node's contents (but they won't be rich), but on other wikis it will make them un-editable in VE. However, making a Translation extension editor in VE shouldn't be hard at all - I've created that as bug 53974 along the lines of other bugs.
fwiw this bug has been mentioned at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#VisualEditor_weekly_update_-_2013-09-26_.28MW_1.22wmf19.29 I have also seen this problem here and there while editing at mediawiki.org. Then again these tags are not used in most Wikimedia projects or MediaWikis there so fair enough.
Assigning to James, hoping this may get a parsoid person to review the patch that's been available for nearly a month now.
(In reply to comment #17) > Assigning to James, hoping this may get a parsoid person to review the patch > that's been available for nearly a month now. Oh. We've been patiently waiting for you to merge it. :-( Will ping the team.
I think I already did it yesterday.
(In reply to comment #19) > I think I already did it yesterday. There is a comment, but now a followup question requires addresssing
Change 83427 merged by jenkins-bot: Register translate and tvar to the parser https://gerrit.wikimedia.org/r/83427
I don't deserve the credit for Niklas's fix. :-)
That patch won't work well with Parsoid, see the comment in the patch set.
This issue appears to suffer from scope creep. The requirements of the current summary have been met. See https://translatewiki.net/wiki/Special:Version?uselang=en: Section "Parser extension tags" now shows "<translate> and <tvar>"
Misunderstanding or miscommunication about the actual requirements is not scope creep. The patch which was made does not add value. I changed the title to reflect the issue with my understanding of it.
The patch will work with Parsoid right now, because Parsoid always goes through the PHP parser when dealing with extensions and tag hooks. But, there are plans for Parsoid to bypass the PHP parser and call the extensions/hooks directly. Right now, The translate code preprocesses the page by stripping the <translate> and <tvar> tags before the PHP parser actually parses the page source (by registering for ParserBeforeStrip hook). So, even though the tag hooks are registered (with a callback that bombs if called), the PHP parser never gets to calling them. If Parsoid is able to mimic the behavior (by calling all hooks, not just extension callback hooks), the existing code will continue to work. However, if Parsoid only supports tag extensions directly, then, the ParserBeforeStrip code won't be invoked by Parsoid at that time and this will be a problem at that point. So, this is not an issue *right now*, but could be at a later point depending on what functionality Parsoid will implement natively and what it will continue to defer to the PHP parser. I think Gabriel was responding to that future concern and also hoping that we can use the opportunity of Parsoid's ongoing development to cleanup some of these interfaces and mechanisms. We could streamline extensions to go through narrow interfaces rather than continuing to use all sort of hooks into various points of the parsing timeline. Anything this is something we could discuss more. Hope this summarizes where we stand now.
Actually, now that I wrote that, I might have misspoken about whether it will work now as well. Niklas is right. For <translate> and <tvar> tags found in top-level pages, since we never go through PHP parser at all, this may still not work. Parsoid will recognize the hooks because of the registration. In turn, it will call the PHP parser to translate the extension content, which in turn may callback into the translate extension rather than the ParserBeforeStrip callback (which is what <translate> and <tvar> relies on), and that will bomb. One of us (Gabriel or me) will experiment with the updated code locally and update this bug.
Okay, we tested and all is good for now. You can ignore #c27 :-). As I indicated in #c26, this is a problem for when we start updating Parsoid code to call extensions directly rather than go through the full parse pipeline (which was meant to be a temporary hack while we figured out how to call extensions directly).
To summarize: * We currently work around the missing tag hook API by calling the full action=parse pipeline for each extension tag. This is quite expensive as you can imagine. * This lets the translate hack work for now. At least it won't crash. * The translate extension will crash once we actually call the registered tag hook. So there is some time to fix this up, but we should not pretend that it is fixed right now. Can you provide more information on what stops you from using the actual tag hook?
I just found out that if edit is made via API, the ParserBeforeStrip hook I am using does not get called and the exception is shown. Does anybody have an idea why?
Niklas: To me, it looks like an existing bug got exposed. Can you provide more details? Maybe full url of the API call? And/or, if you hop onto #mediawiki-parsoid, we could investigate.
Niklas: So, here is what I found: https://meta.wikimedia.org/w/api.php?action=parse&text=<translate>foo</translate> works and does not crash. https://meta.wikimedia.org/w/api.php?action=parse&text=<translate><!--T:1-->foo</translate> throws the exception. I am a bit baffled why the comment in the extension content should change what hook is called. I'll let you investigate from here. It might be worth using this opportunity to use a regular extension hook (rather than the ParserBeforeStrip hook), if that might work for you.
bump
Trying the parser hook solution now. I have code like: $parser->setHook( 'translate', function ( $input, $params, $parser, $frame ) { $re = '~<tvar\|([^>]+)>(.*?)</>~u'; $output = preg_replace( $re, '\2', $input ); $output = $parser->recursiveTagParse( $output, $frame ); $output = trim( $output ); return $output; } ); This seems to work okay on simple text, with the exception of section edit links. I'm still trying with more complex pages with templates as well extensions and our tutorial.
Change 91583 had a related patch set uploaded by Nikerabbit: Add $wgTranslatePageTranslationUseParserHook https://gerrit.wikimedia.org/r/91583
Issues I have found: * [MAJOR] Disabling edit section links does not work for headings inside <translate> tags * [NORMAL] Edit section links don't work. Not a regression, but would be nice fix. Above issue make this a bigger issue. * [BLOCKER] Table of contents does not include links * [MINOR] White space trimming behavior has changed. The original logic cannot be implemented with information provided for the parser hook. You can see the buildup at bottom of http://dev.translatewiki.net/wiki/Extension:Translate compare with http://www.mediawiki.org/wiki/Help:Extension:Translate . This is annoying and would be nice to fix, but the workaround is to edit page source. Documentation about whitespace before has to be update if this way is chosen. Please help me solve the remaining issues.
Also, please test the patch if you can to see if you find any other changes in behavior.
Editing translatable pages via API seems to work now.
(In reply to comment #36) > Issues I have found: > > * [MAJOR] Disabling edit section links does not work for headings inside > <translate> tags You could try marking your output as HTML according to https://www.mediawiki.org/wiki/Manual:Tag_extensions#How_can_I_avoid_modification_of_my_extension.27s_HTML_output.3F I'm not sure whether that suppresses section edit links too, but worth a try IMO. > * [NORMAL] Edit section links don't work. Not a regression, but would be nice > fix. Above issue make this a bigger issue. > * [BLOCKER] Table of contents does not include links Can you check whether your headings are matched by the formatHeadings regexp? Ideally we'd move both TOC and section edit links to JS / CSS. That is our plan for Parsoid output, but not going to happen over night.
(In reply to comment #39) > (In reply to comment #36) > > Issues I have found: > > > > * [MAJOR] Disabling edit section links does not work for headings inside > > <translate> tags > > You could try marking your output as HTML according to > https://www.mediawiki.org/wiki/Manual: > Tag_extensions#How_can_I_avoid_modification_of_my_extension.27s_HTML_output. > 3F This causes (lack of) whitespace issues and breaks all lists. > I'm not sure whether that suppresses section edit links too, but worth a try > IMO. It doesn't. > > > * [NORMAL] Edit section links don't work. Not a regression, but would be nice > > fix. Above issue make this a bigger issue. > > * [BLOCKER] Table of contents does not include links > > Can you check whether your headings are matched by the formatHeadings regexp? I'm assuming it processes the html I'm returning, which is: <h2><span class="mw-headline" id="Culture">Culture</span><mw:editsection page="Pupu" section="1">Culture</mw:editsection></h2> Not sure whether that matches or not. > > Ideally we'd move both TOC and section edit links to JS / CSS. That is our > plan > for Parsoid output, but not going to happen over night. Any changes for interim solutions?
Change 91583 abandoned by Siebrand: Add $wgTranslatePageTranslationUseParserHook Reason: Abandoned as there is no progress on this patch set. Can be re-activated if there is outlook on getting this in a mergable state. https://gerrit.wikimedia.org/r/91583
No patch any more. :-(
(In reply to James Forrester from comment #42) > No patch any more. :-( Hmm I'm not sure that's what "Can be re-activated if there is outlook on getting this in a mergable state" means, PATCH_TO_REVIEW seems a sensible way to tag such stale patches in need of TLC.
(In reply to Nemo from comment #43) > (In reply to James Forrester from comment #42) > > No patch any more. :-( > > Hmm I'm not sure that's what "Can be re-activated if there is outlook on > getting this in a mergable state" means, PATCH_TO_REVIEW seems a sensible > way to tag such stale patches in need of TLC. No. There is no patch to review for merge. There is patch to re-*do*.
To be clear: there is nothing wrong with that patch itself, it just doesn't work due to limitations in the parser itself. I cannot continue with this patch/bug until I get help from other people to modify parser/parsoid.