Last modified: 2012-08-06 19:25:42 UTC
As <pre> is a preprocessor tag, it is very easy to break things with it. Writing </pre>, or even adding links inside preformatted paragraphs, gives very ugly wikitext results. Please use the indentation syntax to build preformatted text.
In the wikitext output, we do use single-space indentation for pre-formatted blocks. I'm confused about what you are filing this bug for.
Oops, I've mixed up wikitext and html output. However, typing "<pre>whatever</pre>" in a paragraph and applying a link to it will lead to
* wikitext output "<pre>what[[ever|ever]]</pre>", which doesn't match the visual preview (on the left)
* html/preview output "<p><pre>what<a href="/wiki/ever">ever</a></pre></p>", which is not what the current parser would render.
It is a problem in the to-HTML serializer. The <pre> tag should be escaped and handled just as text, not as an HTML tag.
Yes, exactly. Both the 2html and 2wikitext serializers should escape any preprocessor-handled tags, as well as template inclusions. It would be enough to map "<" to "&lt;" and "{" to "&#123;", maybe with a few rules to reduce the escaping to the absolute essentials.
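A minimal sketch of the character mapping suggested above, assuming the intended targets are the HTML entities &lt; and &#123;. The function name and the table are hypothetical and are not taken from VisualEditor or Parsoid; the "few rules" to reduce the escaping are not attempted here.

```python
# Hypothetical illustration only; not VisualEditor or Parsoid code.
# Assumed mapping: "<" may open a preprocessor-handled tag, "{" may open a
# template inclusion, so both are replaced by HTML entities.
ESCAPES = {"<": "&lt;", "{": "&#123;"}


def escape_preprocessor_chars(text: str) -> str:
    """Replace every "<" and "{" in hand-typed text with an entity."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_preprocessor_chars("<pre>whatever</pre>"))
    print(escape_preprocessor_chars("{{some template}}"))
```

This escapes every occurrence, which is the unreadable extreme; a real serializer would need rules to escape only where a tag or template could actually start.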
*** Bug 33090 has been marked as a duplicate of this bug. ***
I think we should distinguish between "preprocessor syntax" escaping (this bug), and "wikisyntax" escaping (Bug 33090). Of course we could escape everything that looks the least bit like a parser instruction, but the output wouldn't be readable. But the conditions for what to escape, and when, differ a lot between preprocessor and wiki syntax, especially as we already have a DOM of the latter.
(In reply to comment #6)
> I think we should distinguish between "preprocessor syntax" escaping (this
> bug), and "wikisyntax" escaping (Bug 33090).
> Of course we could escape everything that looks the least bit like a parser
> instruction, but the output wouldn't be readable. But the conditions for what
> to escape, and when, differ a lot between preprocessor and wiki syntax,
> especially as we already have a DOM of the latter.
Yeah, but I don't really know how VisualEditor works... However, if bug 33090 is resolved, I guess this bug is resolved automatically. Maybe add a bug dependency?
Triage: I believe that entering wikitext currently causes it to be converted into its HTML equivalents and displayed accordingly on round-trip, which it shouldn't be.
The Parsoid serializer tokenizes all text content from the DOM and wraps all non-text tokens (any wiki or html syntax) into <nowiki> blocks.
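As a rough illustration of the wrapping idea described above (this is not Parsoid's tokenizer), the sketch below uses a deliberately small, hypothetical regular expression to find runs that look like wiki or html syntax and wraps each one in <nowiki>. The pattern and the function name are assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny pattern; the real serializer uses a full
# tokenizer. Matches HTML-like tags, template and link brackets, and runs
# of two or more apostrophes (bold/italic markup).
SYNTAX = re.compile(r"</?[a-zA-Z][^<>]*>|\{\{|\}\}|\[\[|\]\]|'{2,}")


def wrap_syntax_in_nowiki(text: str) -> str:
    """Wrap anything that looks like markup in <nowiki>, leave text alone."""
    return SYNTAX.sub(lambda m: "<nowiki>" + m.group(0) + "</nowiki>", text)


if __name__ == "__main__":
    print(wrap_syntax_in_nowiki("<pre>whatever</pre>"))
    print(wrap_syntax_in_nowiki("see [[ever]] for details"))
```

Plain text passes through unchanged, so only the parts that would otherwise be interpreted on the next parse are wrapped.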
@MZMcBride: Are we expected to repeat the component field in bug subject?
(In reply to comment #10)
> @MZMcBride: Are we expected to repeat the component field in bug subject?
Yes. The bug summary should be a short and succinct snippet that describes the bug. It may be a bit redundant, but including the component name (whether that's an extension, "MediaWiki core" or something else entirely) makes the bug summary vastly more informative and useful. In this case, "Escape wikitext tags written by hand" doesn't tell me what this bug is about. "Escape wikitext tags written by hand in VisualEditor interface" does tell me what this bug is about.
Mass-moving items into VisualEditor product
Mass-move out of "General" to "Data Model".
(In reply to comment #9)
> The Parsoid serializer tokenizes all text content from the DOM and wraps all
> non-text tokens (any wiki or html syntax) into <nowiki> blocks.
One more thing is ampersands: <nowiki> doesn't work on them.
http://www.mediawiki.org/w/index.php?title=Project:Sandbox&diff=554116&oldid=553897
This is rendered as "<"
(In reply to comment #14)
> (In reply to comment #9)
> > The Parsoid serializer tokenizes all text content from the DOM and wraps all
> > non-text tokens (any wiki or html syntax) into <nowiki> blocks.
>
> One more thing is ampersands: <nowiki> doesn't work on them.
>
> http://www.mediawiki.org/w/index.php?title=Project:Sandbox&diff=554116&oldid=553897
> This is rendered as "<"
This should be fixed with https://gerrit.wikimedia.org/r/#/c/12722/ once it is deployed. There are also a few other fixes to the wikitext escape algorithm in Parsoid that are now waiting for deployment.
(In reply to comment #15)
> This should be fixed with https://gerrit.wikimedia.org/r/#/c/12722/ once it is
> deployed. There are also a few other fixes to the wikitext escape algorithm in
> Parsoid that are now waiting for deployment.
Is it really helpful? It seems to handle "<", ">" only, without "&".
For now it only handles those two special cases since they are the most urgent. As described in the commit summary, the real fix will be to remove entity decoding in the tokenizer, and move it to a token stream transformer instead. This will give us 'html_entity' tokens which we can escape properly without having to escape all '&' characters that are not part of html entities. Until then it makes little sense to escape ampersands on plain text content since those ampersands that are actually part of entities are already decoded by the tokenizer at that stage, so would not be matched. We could instead pre-escape the input to the tokenizer, but that would then produce ugly wikitext for non-entity '&' characters.
HTML entities in plain text content entered in the VE are now escaped, while plain ampersands outside entities are not. HTML entities in wikitext are still decoded for display (as required), but round-tripped to their original form with a span wrapper. Closing as fixed, please reopen if there are still issues after the next Parsoid code update in the VE demo install, or at http://parsoid.wmflabs.org/_html/ (which we can actually update quickly).
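To illustrate the behaviour described above (entities escaped, bare ampersands left alone), here is a small sketch. It is not the Parsoid implementation: the regular expression and the function name are hypothetical, and anything of the form &name; is treated as an entity without checking the name against the list of known HTML entities.

```python
import re

# Hypothetical pattern for something shaped like an HTML entity:
# decimal (&#60;), hexadecimal (&#x3C;) or named (&lt;). Names are NOT
# validated against the real entity list in this sketch.
ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);")


def escape_entities(text: str) -> str:
    """Escape the leading "&" of entity-shaped runs; keep other "&" as-is."""
    return ENTITY.sub(lambda m: "&amp;" + m.group(0)[1:], text)


if __name__ == "__main__":
    print(escape_entities("AT&T says &lt; is an entity"))
```

Escaping only entity-shaped runs keeps the wikitext readable for ordinary text such as "AT&T", which is the trade-off discussed in the earlier comment about not escaping every '&'.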
Mass-moving bugs into the new 'Parsoid' product.