Last modified: 2012-10-30 18:38:11 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T32262, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 30262 - <p> tags are inserted between transcluded pages, if they contain images
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: https://pl.wikisource.org/wiki/Encykl...
Depends on:
Blocks:
 
Reported: 2011-08-06 18:15 UTC by Beau
Modified: 2012-10-30 18:38 UTC
CC List: 7 users



Attachments
Screenshot of a rendered page (239.50 KB, image/png)
2011-08-06 18:15 UTC, Beau
Simple patch replacing \n with &#32; (333 bytes, patch)
2011-08-20 09:19 UTC, Beau

Description Beau 2011-08-06 18:15:32 UTC
Created attachment 8891
Screenshot of a rendered page

Let's assume:
- Page 1 contains only text (no paragraphs)
- Page 2 contains only text (no paragraphs)
- Page 3 contains only text (no paragraphs), with an image inserted in the middle of the text.

When these pages are transcluded using the <pages> tag, MediaWiki inserts a <p> tag between pages, which incorrectly divides the text.

The URL field contains the address of a sample page on pl.wikisource that demonstrates the issue. I have also attached a screenshot.

I don't know why the parser renders the text differently when there is an image in it.

I think no \n should be inserted between pages. This is related to bug #27637, which was closed as INVALID.

As things stand, inserting any image lowers the overall quality of the document instead of raising it.
Comment 1 Ankry 2011-08-06 19:40:10 UTC
Similar testcase (http://pl.wikisource.org/wiki/Wikiskryba:Ankry/brudnopis0):
 
- Line 1 contains only text
- Line 2 contains text with images inserted in the middle of it
- Line 3 contains only text

There are no empty lines between them. However, MediaWiki places EACH line between <p> and </p>. IMO this contradicts the wiki rules (where an empty line between text lines means a new paragraph).
Comment 2 Philippe Elie 2011-08-09 23:13:40 UTC
As the second example shows, this is a parser bug, not a proofread extension bug. I first thought it would be possible to work around it in the proofread extension by inserting a space instead of a linefeed between pages, but that breaks pages whose text ends with a linefeed that is protected from removal by an empty template. In that case the generated code is "\n<space>first line on the next page", which MediaWiki renders as <pre>first line</pre>.
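The failure mode described in this comment can be reproduced with plain string handling. The Python sketch below is not ProofreadPage or parser code: `join_pages` and `lines_rendered_as_pre` are hypothetical helpers, the page strings are made up, and the only wikitext rule modelled is that a line beginning with a space is rendered as preformatted text.

```python
def join_pages(pages, separator):
    """Concatenate page texts with the given separator (hypothetical helper)."""
    return separator.join(pages)


def lines_rendered_as_pre(wikitext):
    """Return the lines that start with a literal space.

    This models a single wikitext rule: a line beginning with a space
    is rendered as preformatted text.
    """
    return [line for line in wikitext.split("\n") if line.startswith(" ")]


# The first page ends with a linefeed that is kept (for example, because an
# empty template protects it); the second page continues the same text.
pages = ["last line of this page\n", "first line on the next page"]

with_linefeed = join_pages(pages, "\n")
with_space = join_pages(pages, " ")

print(lines_rendered_as_pre(with_linefeed))
print(lines_rendered_as_pre(with_space))
```

With the space separator, the second page's first line ends up starting with a space, which is the "\n<space>first line" pattern quoted above.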
Comment 3 Beau 2011-08-20 09:11:13 UTC
You can use the space equivalent: &#32;
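A rough Python sketch of why the character reference might sidestep the problem, assuming a simplified model in which only a line starting with a literal space is treated as preformatted. The page strings are hypothetical; `html.unescape` is from the Python standard library and stands in for the decoding a browser would do.

```python
import html

# Hypothetical page texts: the first one ends with a kept linefeed.
pages = ["last line of this page\n", "first line on the next page"]

# Join with the character reference instead of a literal space.
joined = "&#32;".join(pages)

# In the source, the second line starts with "&", not with a space, so the
# "leading space means preformatted" rule of this model is not triggered.
second_line = joined.split("\n")[1]
print(second_line.startswith(" "))

# Once decoded, the reference is an ordinary space (U+0020).
print(html.unescape("&#32;") == " ")
```

Whether the real parser treats the reference this way is for testing on a wiki to confirm; the sketch only shows the difference at the string level.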
Comment 4 Beau 2011-08-20 09:19:46 UTC
Created attachment 8946
Simple patch replacing \n with &#32;
Comment 5 Philippe Elie 2011-08-21 07:47:17 UTC
I tested the patch on my local wiki. It works with the {{nop}} template from en.ws, which was broken when a simple space was used instead of the proposed &#32;. Besides that, can someone ping a parser maintainer? As comment 2 shows, this is a parser bug.
Comment 6 Beau 2011-08-23 22:20:53 UTC
The image thumb is created using <div> (a block element), so it cannot be put inside <p>, which may contain only inline content. The parser closes the open paragraph, inserts the <div>, and then reopens the paragraph.
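The close, insert, reopen sequence described in this comment can be illustrated with a toy model. This Python sketch is not MediaWiki's parser: `wrap_paragraphs`, the token list, and the rule that block content is recognised by a `<div` prefix are all assumptions made for illustration.

```python
BLOCK_PREFIX = "<div"  # assumption: block content is recognised by this prefix


def wrap_paragraphs(tokens):
    """Wrap inline tokens in <p>...</p>, closing the paragraph around blocks."""
    out = []
    open_paragraph = False
    for token in tokens:
        if token.startswith(BLOCK_PREFIX):
            # A block element cannot sit inside <p>: close the paragraph first.
            if open_paragraph:
                out.append("</p>")
                open_paragraph = False
            out.append(token)
        else:
            # Inline content: open a paragraph if none is open yet.
            if not open_paragraph:
                out.append("<p>")
                open_paragraph = True
            out.append(token)
    if open_paragraph:
        out.append("</p>")
    return "".join(out)


tokens = ["text before ", '<div class="thumb">image</div>', " text after"]
print(wrap_paragraphs(tokens))
```

In this model, a single run of text with an image in the middle comes out as two separate paragraphs around the `<div>`, which matches the splitting reported in the description.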
Comment 7 John Du Hart 2011-08-23 23:49:28 UTC
Beau, I applied your patch to ProofreadPage as a workaround; however, I think we're all in agreement that this is a parser issue. I'll update the bug to reflect that and mark your patches as obsolete.
Comment 8 Philippe Elie 2011-10-06 16:37:21 UTC
I didn't think enough about the side effects of this patch. It should be reverted from trunk: first, it doesn't solve the problem described, and second, it was meant as a no-op but it is not one. If you start a Page: with a LF you get the expected <p> from the parser, but when transcluding with the <pages> command this LF only terminates the last line of the previous page, so we don't get a <p> at the page boundary. This means there is no clean way to get the same HTML when viewing a Page: as when transcluding two pages where the second page starts with a LF. It's a bit odd, but a LF between page transclusions is more neutral than any other character.
Comment 9 Philippe Elie 2011-10-06 17:58:13 UTC
(In reply to comment #8)

My bad. The first part of comment 8 is right: this patch doesn't fix anything. But the rationale that follows for reverting it is wrong. Before the parser is called, the code generated by the extension is <span>\n{{:MediaWiki:Proofreadpage_pagenum_template|page=Page:.......}}</span>{{:Page:....djvu/97}}
The span between page transclusions means that a LF at the start of a Page: can't be combined with the LF inserted by the extension to produce a <p>.
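To make the last point concrete, here is a small Python sketch that builds a string in the shape quoted above and performs a toy expansion. It is not the extension's code: `pre_parse_code` and `expand` are hypothetical helpers, the page title and text are made up, and the `[97]` output of the page-number template is an assumption, since its real output is not shown in this report.

```python
PAGENUM_TEMPLATE = "MediaWiki:Proofreadpage_pagenum_template"


def pre_parse_code(page_title):
    """Build per-page wikitext in the shape quoted in the comment."""
    return (
        "<span>\n{{:" + PAGENUM_TEMPLATE + "|page=" + page_title + "}}</span>"
        "{{:" + page_title + "}}"
    )


def expand(code, page_title, page_text, pagenum_output):
    """Toy expansion: substitute both transclusions with literal text."""
    code = code.replace(
        "{{:" + PAGENUM_TEMPLATE + "|page=" + page_title + "}}", pagenum_output
    )
    return code.replace("{{:" + page_title + "}}", page_text)


title = "Page:Example.djvu/97"          # hypothetical page title
text = "\nThis page starts with a LF."  # hypothetical page content

code = pre_parse_code(title)
expanded = expand(code, title, text, pagenum_output="[97]")

print(repr(expanded))
# The two linefeeds are separated by the page-number markup and </span>,
# so they never form the blank line ("\n\n") that starts a paragraph.
print("\n\n" in expanded)
```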
